




Quantum Field Theory 

A Twentieth Century Profile 


With a Foreword by Freeman J Dyson 


Edited By 

Asoke N Mitra 

Formerly INSA-Einstein Professor 
Indian National Science Academy 


HINDUSTAN BOOK AGENCY

INDIAN NATIONAL SCIENCE ACADEMY



Published Jointly by Hindustan Book Agency (India) and 
Indian National Science Academy. 


Copyright © 2000 by Hindustan Book Agency (India) and
Indian National Science Academy. 

No part of the material protected by this copyright notice 
may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying, recording
or by any information storage and retrieval system,
without written permission from the copyright owner, who 
also has the sole right to grant licences for translation into 
other languages and publication thereof. 

All export rights for this edition vest exclusively with 
Hindustan Book Agency (India). Unauthorized export is a 
violation of Copyright Law and is subject to legal action. 

INSA Editor of Publications: 

Professor S. K. Malik 

INSA Editorial Staff: 

J. Saketharaman, AES-I 
Rajan Phull, SOI 


ISBN 81-85931-25-9 


Printed by Chaman Enterprises, Delhi 



DEDICATION 


The BOOK is dedicated to three great Teachers: 

Jatindranath Mitra who introduced his son to Mathematical Physics; 

Hans Bethe who showed the dual virtues of simplicity and thoroughness in Physics;

Freeman Dyson who gave a glimpse of the beauty that Physics exudes
when clothed in the language of Mathematics.




INSA PRESIDENT’S NOTE 


To mark the International Mathematics Year 2000, the Indian National Science Academy had decided 
to publish three specialized publications on the following subjects: Number Theory, Non-linear 
Phenomenon and Quantum Field Theory. The Academy had invited one of its Fellows, 
Dr. Asoke N. Mitra, former Einstein Professor of Indian National Science Academy, to prepare a 
monograph on a suitable theme and solicit articles from internationally acknowledged experts. The 
result of his endeavour is a comprehensive volume on Quantum Field Theory - A Twentieth Century
Profile. The foreword to this volume is by Freeman Dyson, one of the Founding Fathers of modern 
Quantum Field Theory. 

The Academy thanks Dr. Mitra for his efforts and all the distinguished contributors to the book for their 
ready response. 

I very much hope that the present monograph will be a valuable addition to the literature on Quantum 
Field Theory and will be appreciated by the Mathematical Physics Community at large. 


Goverdhan Mehta
President, INSA

New Delhi
8th February 2000




Foreword 


Freeman Dyson * 

Institute for Advanced Study, Princeton, New Jersey, USA 


The Impact of Field Theory on Physics of the Twentieth Century 

I am proud to count Dr. Asoke Mitra, the editor of this volume, among my 
students. After serving his apprenticeship as a field theorist at Cornell University 
in the USA, he chose to return home to help build up science in India. The choice 
was not easy, at a time when science in America was flourishing and rapidly pushing 
ahead, while science in India was struggling to overcome the obstacles imposed by 
geography and history. Dr. Mitra sacrificed his chance of a brilliant research career 
in America, in order to serve his country and his people. I deeply respect that 
choice, and I rejoice that his sacrifice was not made in vain. After a fruitful career 
as a pioneer and teacher of modern science in India, he now stands at the center of 
the vibrant scientific community that he helped to create. This volume is, among 
other things, a monument to his vision. 

Quantum field theory was a good subject, and remains a good subject, for
cultivation in India. Homi Bhabha was one of its early practitioners. It is firmly rooted
in experimental science, but it can be cultivated without elaborate apparatus. It 
is well suited to a country with limited material resources and an ample supply 
of talented people. Indian field-theorists are well placed to play a leading role in 
bringing modern science to Asia. Anywhere in the world, a group of quantum field 
theorists can make serious contributions to science, if they have a modest computer 
with connections to the internet and the world wide web. Especially in the last ten 
years, since the rapid development of string theory brought new excitement to the 
field, quantum field theory has become a world-wide enterprise. Everywhere you go, 
wherever there are scientists and students, you find string-theorists. 

When I was growing up sixty years ago, I used to read books by the writer 
Peter Fleming describing his travels in remote places. He wrote a book, “News from 
Tartary”, about his travels in Central Asia. At that time the city of Urumchi, in 
the mountains of north-west China, was the most inaccessible place on earth. All 
the trails to Urumchi were blocked by rival war-lords and bandits. Fleming tried 
for several years to reach Urumchi but never succeeded. I thought of Urumchi then 
as a legendary place, for ever beyond the reach of civilization. I was not sure that 
it really existed. Now, sixty years later, I received a post-card from Urumchi, sent 
by Andrew Strominger, an American string-theorist who happened to be visiting
there. The post-card has a picture of magnificent snow-covered mountains on the
front. On the back Strominger wrote, “There is a lively group of string-theorists
here”. Urumchi is now a thriving industrial city, called Wulumuchi by the Chinese.
Quantum field theory penetrates all barriers.

*Email: dyson@ans.ias.edu

Quantum field theory was invented by Europeans in the nineteen-twenties, very 
soon after they had invented quantum mechanics. The first quantum field theory
was the quantum electrodynamics of Dirac, describing the emission and absorption
of electromagnetic radiation by atoms. At that stage, only the Maxwell field was
quantized. Dirac’s theory was immediately successful in explaining the observed
behavior of radiation. It agreed with all known experiments in atomic physics, and
with Planck’s law of black-body radiation. Already in 1927 it was clear that quantum
electrodynamics was basically correct. Within the next two years, Dirac’s theory was
extended and improved by Fermi, Heisenberg, Pauli, Jordan and Wigner, so that it
included all the known particles and interactions. The framework of quantum field
theory was then complete. It appeared to give a correct description of all physical 
phenomena with the exception of gravitation. 

Confidence in quantum field theory rested on three foundations. First, the part
of it that concerned electromagnetic processes was confirmed by accurate
experiments. Second, the theory as a whole was mathematically elegant, unifying in a
beautiful way the principles of special relativity and quantum mechanics. Third,
the theory explained the most striking fact about elementary particles, the fact that
all particles belong to a small number of species, with the particles within each
species indistinguishable. All electrons are indistinguishable, with the same mass
and charge, because they are all embodiments of the same field. A few years later
the positron was discovered, with the same mass as the electron but with equal and
opposite charge. This was final confirmation that matter is built out of quantum
fields, each charged field giving rise to particles and anti-particles linked by a deep
underlying symmetry. Quantum field theory explained the existence of anti-particles
and also explained the symmetry.

The basic framework of quantum field theory has remained the same from the 
nineteen-twenties until the present. But the details of the theory have changed
as new kinds of fields and particles have been discovered. The most important
change was the introduction of new gauge fields, in addition to the gauge field of
electrodynamics, to describe the weak and strong interactions. The Weinberg-Salam
model of the weak interactions introduced three gauge fields, each associated with
a massive new species of particle. When the W and Z particles predicted by the
model were discovered, nobody could any longer doubt that quantum field theory
was correct. A few years later, quantum chromodynamics provided a model of the
strong interactions within the same framework. The new gauge fields, incorporating
the weak and strong interactions, give rise to the rich diversity of phenomena that
are seen in high-energy experiments.

All through its history, quantum field theory has had two faces, one looking
outward, the other looking inward. The outward face looks at nature and gives us
numbers that we can calculate and compare with experiment. The inward face looks
at mathematical concepts and searches for a consistent foundation on which to build
the theory. The outward face shows us a brilliantly successful theory, bringing order
to the chaos of particle interactions, predicting experimental results with astonishing
precision. The inward face shows us a deep mystery. After seventy years of searching,
we have found no consistent mathematical basis for the theory. When we try to
impose the rigorous standards of pure mathematics, the theory becomes undefined
or inconsistent. From the point of view of a pure mathematician, the theory does
not exist. This is the great unresolved paradox of quantum field theory.

To resolve the paradox, during the last twenty years, quantum field theorists 
have become string-theorists. String theory is a new version of quantum field theory, 
exploring the mathematical foundations more deeply and entering a new world of 
multidimensional geometry. String theory also brings gravitation into the picture, 
and thereby unifies quantum field theory with general relativity. String theory has 
already led to important advances in pure mathematics. It has not yet led to any 
physical predictions that can be tested by experiment. We do not know whether 
string theory is a true description of nature. All we know is that it is a rich treasure 
of new mathematics, with an enticing promise of new physics. During the coming 
century, string theory will be intensively developed, and, if we are lucky, tested by 
experiment. We can rejoice that Indian scientists will be active participants in this 
adventure. 




Preface 

On the eve of the International Mathematics Year 2000, the Indian National Science Academy had 
decided to bring out some selected volumes on different facets of Physics and Mathematics. The 
basic idea was somewhat on the lines of the American Physical Society’s plans to bring out its 
souvenir Volume “More Things On Heaven And Earth”, at the end of the Millennium (which also
coincided with its Centenary Year). However the Academy’s plans were much more modest, being 
limited to a few topical aspects of these subjects within the scope of its journal specialization. 
The present volume in particular aims to highlight the impact of Field Theory on the evolution of 
Physics of the Twentieth Century. The choice of Field Theory as a central theme for this particular 
Volume was dictated by the consideration that, as the single most important concept in Physics 
to be discovered in this century, it would register a commanding presence on a vast array of topics 
which would bear standing testimony to the success story of this unique armour in the arsenal of 
Physics. We were indeed fortunate to receive an endorsement of the QFT theme from Freeman 
Dyson, one of the architects of modern Field Theory, in a Foreword to this Volume, under the 
title “The Impact of Field Theory on Physics of the Twentieth Century” in which he has extolled
the universal appeal of Field Theory in Physics. This theme is projected in this Book through a 
selection of articles (by acknowledged experts) in those areas where the impact of QFT has been 
especially pronounced, from particle physics to string theory (with several interpolating stages of 
development), and extending to some facets of astrophysics and the physics of condensed matter. 

The emphasis in the Book is mainly on quantum field theory (QFT), so the standard Einstein’s 
(geometric) theory of gravitation (whose quantum formulation is still a distant goal), is not in its 
direct purview. Of course the String Theory route for evolution of QFT (which holds the key to 
a potential understanding of gravitation) has been included, but not in a specialized enough form 
to do proper justice to gravitation. The actual contents of the Book are of necessity governed by 
the access to acknowledged experts in their respective areas of expertise within a relatively short 
span of time, yet the response has been quite encouraging. 

In the area of particle physics the emphasis is mainly on symmetries, topologies, gauge theories 
and renormalization groups. While electroweak interactions have been treated with standard 
rigour, the strong interaction sector has needed greater filtration, so as to conform to the basically 
QFT-oriented thrust of the Book. 

A distinct feature of this Book (not usual for an edited book) is that its theme has been 
highlighted through a comprehensive Editorial Summary of all the 33 articles, preceding their 
classified presentation in six distinct parts: 

A) basic structure of QFT;
B) topological aspects of QFT;
C) miscellaneous formal methods in QFT;
D) extension of QFT frontiers;
E) QFT in 2+1 dimensions; and
F) strong interaction methods in QFT/QCD.

The contributors range from veterans like (the late) Vladimir Gribov, Marcos Moshinsky, Kazuhiko 
Nishijima, John Schwarz, Dmitri Shirkov and Edward Witten, to a string of acknowledged experts 
in their respective fields of expertise, all the way to a few young and promising workers. The Book 
concludes with a modern perspective on the observable limitations of Quantum Field Theory. 



The Book contains two Articles not directly written by the authors concerned in response to
the Academy’s invitation, but nevertheless central to the QFT theme. The first is a set of three
(unpublished) Orsay Lectures on Confinement by the late Vladimir Gribov during 1992-93 (before
his death), as compiled by his associates Profs. Dokhshitzer, Ewerz and especially Julia Nyiri
(Mrs. Gribov). We are grateful to all of them for permission to reproduce these notes in this
Volume, especially because the Academy has nostalgic memories of Prof. Gribov’s close association
with the Academy during the Sixties. We are particularly indebted to Prof. Olivier Pene for his
invaluable help in making these Lectures available to us in the form of LPTHE preprints and
hep-ph documents. The second Article is an outstanding paper entitled “Quantum Field Theory
and the Jones Polynomial” by Edward Witten, originally published in Commun. Math. Phys. 121,
351-399 (1989). We are extremely grateful to Prof. Witten as well as the Publishers (Springer
Verlag, Germany) for permission to reproduce this paper in the present Volume in exactly the
same format in which it had appeared in the original Journal. Both the Gribov and the Witten
Articles appear in Part A of the Book.

The wide range of topics covered makes the Book more than just an introductory text book on
QFT. It is recommended as a reference book for a broad spectrum of readership, from fresh postdocs
in most key areas of QFT to the specialists in evolving areas. And the freelance researcher in QFT
should find enough “appetizers” among the contents to kindle his interest in the field. However
while no efforts have been spared to maintain a basically field-theoretic texture for the diverse
topics covered in the Book, no uniformity can be claimed for their mathematical standards of
presentation, reflecting as they do the variations in the state of the art in the respective fields.

The Academy is grateful to all the contributors for their timely cooperation with their respective
Articles. Above all, the Editor is deeply indebted to his Teacher and Mentor, Freeman Dyson, for
readily consenting to give a Foreword to this Volume which is the first such venture of the Indian
National Science Academy.

The Book has been composed directly under the auspices of the Academy under the orders of
the President, Prof. Goverdhan Mehta. We are grateful to Prof. Mehta for his personal interest
in this venture, as well as to Prof. S.K. Malik (Editor of the Academy) for his unstinted help in
getting the process going. Thanks are also due to the active interest of the Executive Secretary
(Mr. S.K. Sahni) and the editorial staff, especially Mr. Saketaraman and Mr. Rajan Phull, in the final
stages of production. We are especially grateful to Mr. Jayan without whose deep involvement
the crucial LaTeX processing of the Book would not have been possible. Thanks are also due to
Mr. Santosh Malik, for editorial help. The technical advice of Dr. Vineet Ghildyal (of the Delhi
University Library System) in the specialized aspects of composition in LaTeX format has been
truly invaluable, and is most gratefully acknowledged.


Asoke N Mitra 

244 Tagore Park, Delhi - 110 009 
30 November 1999 



CONTENTS 

Foreword by Freeman Dyson

Preface

Editor’s Summary : Dimensions Of Field Theory - From Particles To Strings
by A.N.Mitra

Part A : Basic Structure Of QFT

1. D.V.Shirkov : Evolution Of The Bogoliubov Renormalization Group
2. S.Szpigel and R.J.Perry : The Similarity Renormalization Group
3. V.Novikov : Quantum Field Theory And The Standard Model - Bird’s Eye View
4. P.K.Kabir : Broken Reflection Symmetries
5. D.Boyanovsky and H.J.de Vega : Dynamics Of Symmetry Breaking Out Of Equilibrium - From Condensed Matter To QCD And The Early Universe
6. Vladimir N. Gribov (Late) : Orsay Lectures On Confinement
   (I) hep-ph/9403218
   (II) hep-ph/9404332
   (III) hep-ph/9905285
7. K.Nishijima and M.Chaichian : An Essay On Color Confinement

Part B : Topological Aspects Of QFT

8. R.Kaul : Topological Quantum Field Theories - A Meeting Ground For Physicists and Mathematicians
9. Edward Witten : Quantum Field Theory And The Jones Polynomial
10. H.Banerjee : Chiral Anomalies In Field Theories
11. Wei-Min Zhang : Coherent States In Field Theory
12. N.Mukunda : Pancharatnam, Bargmann And Berry Phases - A Retrospective
13. J.Schechter and H.Weigel : The Skyrme Model For Baryons

Part C : Formal Methods In QFT

14. R.Ramanathan : Euclidean Methods In Quantum Field Theory
15. Ashok Das : Topics In Finite Temperature Field Theory
16. B.M.Sodermark : Integrable Models And The Toda Lattice Hierarchy
17. Prem P Srivastava : Perspectives Of Light-Front Quantized Field Theory - Some New Results
18. D.S.Kulshreshtha : Gauge Symmetry In Chiral Electrodynamics
19. L.Lusanna : Towards A Unified Description Of The Four Interactions In Terms Of Dirac-Bergmann Observables

Part D : Extension Of QFT Frontiers

20. R.N.Mohapatra : Supersymmetry And Particle Physics
21. N.Sakai : Supersymmetry In Field Theory
22. Werner Nahm : Conformal Field Theory: A Bridge Over Troubled Waters
23. John H Schwarz : Superstring Theory - An Overview
24. J.Maharana : Recent Developments In String Theory
25. L.Bonora : Yang-Mills Theory And Matrix String Theory

Part E : QFT In 2+1 Dimensions

26. Avinash Khare : Fractional Statistics And Chern-Simons Field Theory In 2+1 Dimensions
27. R.Rajaraman : Chern Simons Field And Composite Bosons In The Quantum Hall System

Part F : Methods Of Strong Interactions In QFT

28. Olivier Pene : Hadrons From QCD - Achievements And Prospects
29. L.S.Kisslinger : QCD Sum Rules In Hadronic And Nuclear Physics
30. V.A.Karmanov : Light-Front Dynamics
31. A.N.Mitra : 3D-4D Interlinkage Of B-S Amplitudes - Unified View Of QQ and QQQ Dynamics
32. Marcos Moshinsky : The Harmonic Oscillator In Quantum Theory - A Powerful Bridge In Physics

Conclusion

33. D.Home : Modern Perspectives On Foundations Of Quantum Mechanics



The Dimensions Of Field Theory — From Particles 

To Strings * 

A.N.Mitra †


Abstract 

This is an editorial summary of the contents of a Book comprising a set of Articles by
acknowledged experts dealing with the impact of Field Theory on major areas of physics (from
elementary particles through condensed matter to strings), arranged subjectwise under six
broad heads. The Book, which emphasizes the conceptual, logical and formal aspects of the
state of the art in these respective fields, carries a Foreword by Freeman Dyson, and is to
be published by the Indian National Science Academy on the occasion of the International
Mathematical Year 2000. The authors and full titles of all the Articles (33) are listed
sequentially (in the order of their first appearance in the narration) under the bibliography at the
end of this Summary, while a few of the individual articles to appear in the Book are already
available on the LANL internet.


1 Birth, Decline And Rebirth Of Field Theory 

If one must choose one single item of Twentieth Century Physics which stands out by the yardstick
of most pervasive and decisive influence on its total development, Quantum Field Theory (QFT)
certainly wins hands down. Historically, QFT was born out of the marriage of Relativity and
Quantum Theory, at a hefty price of mathematical self-consistency underlying the celebrated Dirac
Theory, whose full significance took several stages to unfold, through the vicissitudes of logical
deduction. Of greater significance, from the conceptual point of view, was the realization that the
“sea of negative energy states” was already a tacit admission of the failure of relativistic quantum
mechanics of a single particle, in favour of a collective many-particle, or field description, a fact
which was to be driven home by Dyson in his Cornell lectures of 1952. And once this realization
dawned on the pioneers, the Klein-Gordon theory of scalar particles found a natural place in the
new scenario, at the hands of Pauli-Weisskopf (1934) who now found little difficulty in quantizing
these bosonic particles just as easily as the Dirac theory had done to fermions. Thus was born
“Quantum Field Theory” (QFT) in its full glory, with Anti-matter playing a symmetrical role to
Matter, irrespective of its fermionic or bosonic nature. [Feynman’s brilliant positron theory was a
bold attempt to resurrect the single particle quantum mechanics description via “zigzag” diagrams
(negative time propagation of negative energy electrons), but the more universal language of Field
Theory eventually carried the day].

QFT registered its first major success in the Covariant formulation of QED at the hands
of Tomonaga and Schwinger on the one hand, and Feynman on the other, with Dyson playing
the catalyst-role in synthesizing the two. This theory, in the course of circumventing unphysical
infinities in the measurable quantities, gave rise to a new dogma of Renormalizability which
was to act as the yardstick of acceptability of theories to come. This dogma, together with the
independent principle of “Gauge Invariance” (already in-built in QED), were to be two pillars
of QFT in its march towards greater victories to come, especially in the formulation of strong
interaction theories on analogous lines to QED. This led to the Yang-Mills theory (1954) of $SU(2)$

*Editorial Summary: hep-ph/9911450

†Email: (1) ganmitra@nde.vsnl.net.in ; (2) anmitra@csec.ernet.in





put both aspects together and derived the “RG-equations” in a form which brings out the ‘scaling’
properties of the electron and photon propagators. Thus RG invariance boils down to the
invariance of a solution w.r.t. the manner of its parametrization. These equations were further
developed and made more rigorous with mathematicians and physicists working in tandem, so that
renormalization became a well-developed method at the computational level. But the underlying
physical concepts behind these equations took some more time to unfold until after Kadanoff’s,
and especially Wilson’s pioneering work on the understanding of the “critical indices” in phase
transitions brought out the real physics behind the RG equations.
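
The statement that RG invariance is invariance of a solution under a change of parametrization can be illustrated with a minimal sketch. It assumes a hypothetical one-loop form of the invariant coupling, g(x; g0) = g0 / (1 + b g0 ln x), where x is a dimensionless ratio of scales and b is an illustrative coefficient; the function name and all numerical values are assumptions, not taken from the text. The sketch checks the group composition law g(x; g0) = g(x/t; g(t; g0)): reaching the scale x directly, or through an intermediate reference point t, should give the same coupling.

```python
import math

def invariant_coupling(x, g0, b=0.5):
    """Hypothetical one-loop invariant coupling.

    x  : dimensionless ratio of scales (x = 1 is the reference point)
    g0 : value of the coupling at the reference point
    b  : illustrative one-loop coefficient (an assumption, not from the text)
    """
    return g0 / (1.0 + b * g0 * math.log(x))

g0 = 0.3          # coupling at the reference point (arbitrary choice)
x, t = 50.0, 4.0  # target scale and intermediate scale (arbitrary choices)

# Route 1: run directly from the reference point to x.
direct = invariant_coupling(x, g0)

# Route 2: run to t first, re-parametrize there, then run the remaining x/t.
two_step = invariant_coupling(x / t, invariant_coupling(t, g0))

print(direct, two_step)
```

With b > 0 the same formula makes the coupling fall as x grows, which is the qualitative behaviour associated with asymptotic freedom.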

Wilson’s work revealed the rich applicational potential of the RG ideas in various fields of
physics, from ‘critical phenomena’ (spin lattices, polymer theory, turbulence) in condensed matter
physics, to QCD parameters like the strong coupling constant $\alpha_s$ and the ‘running mass’ $m(\mu)$. In
particular, the discovery of Asymptotic freedom in QCD allowed physicists to produce a logically
consistent picture of renormalization, one in which the perturbative expansions at any high energy
scale can be matched with one another, without any need to deal with intermediate expansions in
powers of a large coupling constant. Another important aspect of those RG equations which has
been emphasized by the Dubna School, is the concept of functional self-similarity in mathematical
physics, which has led to applications like the study of strong non-linear regimes; asymptotic
behaviour of systems described by non-linear partial differential equations; problem of generating
higher harmonics in plasmas, and so on. The Book begins with a perspective Article by Dmitri
Shirkov [1] on all aspects of the subject, from an introduction to RG in QFT to an overview of its
methodology, together with applications of RG ideas in some important areas of physics.

A relatively new approach to RG theory, termed “Similarity Renormalization Group” (SRG)
was launched in this decade by Wilson and Glazek, as well as Wegner, and is based on the perception
that divergences are in the first place due to the locality of the primary interactions. For a clearer
understanding of the features of the SRG theory, it is enough to consider only the non-relativistic
quantum mechanics (the usual UV divergences of relativistic QFT are not relevant here!), where
the locality condition on the potentials at all scales corresponds to taking only delta functions and
their derivatives. The associated divergences can be regulated by introducing cut-offs whose effects
may be removed by renormalization.

In the SRG, the transformations that explicitly “run” the cut-off parameter are developed.
These similarity transformations are of course unitary, and constitute the group elements of SRG.
They are characterized by a “running” cut-off on energy differences (not states). If the Hamiltonian
is viewed as a large matrix, these cut-offs limit the off-diagonal matrix elements, and as they
are gradually reduced, the Hamiltonian is forced towards the diagonal form. The perturbation
expansion of the transformed Hamiltonians contains no small energy denominators, so that the
expansion does not break down unless the strengths of the interactions themselves are large. With
the help of an associated concept of coupling coherence, SRG acquires respectability as a proper
theory with the same number of parameters as the original (fundamental) theory. A review of the
formalism and working of SRG is given by R J Perry [2], using as an example the exactly soluble
case of a simple 2D delta function to act as a laboratory for testing the convergence of the SRG
method in some detail.
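
The picture of a Hamiltonian matrix being driven towards diagonal form by a continuous unitary transformation can be sketched numerically. The example below is not the delta-function model reviewed in [2]; it assumes Wegner's flow-equation variant, dH/ds = [[H_diag, H], H], which suppresses off-diagonal elements smoothly rather than through a sharp running cut-off. The 3x3 matrix, the step size and the function names are illustrative assumptions.

```python
import numpy as np

def flow_rhs(H):
    """dH/ds = [eta, H], with Wegner's generator eta = [H_diag, H]."""
    H_diag = np.diag(np.diag(H))
    eta = H_diag @ H - H @ H_diag   # antisymmetric for real symmetric H
    return eta @ H - H @ eta

def similarity_flow(H0, s_max=20.0, ds=0.005):
    """Integrate the flow with a fixed-step fourth-order Runge-Kutta scheme."""
    H = np.array(H0, dtype=float)
    for _ in range(int(round(s_max / ds))):
        k1 = flow_rhs(H)
        k2 = flow_rhs(H + 0.5 * ds * k1)
        k3 = flow_rhs(H + 0.5 * ds * k2)
        k4 = flow_rhs(H + ds * k3)
        H = H + (ds / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return H

# Illustrative symmetric "Hamiltonian"; the entries are arbitrary.
H0 = np.array([[0.0, 0.3, 0.1],
               [0.3, 1.0, 0.2],
               [0.1, 0.2, 3.0]])

H_final = similarity_flow(H0)
off_diagonal = H_final - np.diag(np.diag(H_final))

# Largest remaining off-diagonal element, then flowed diagonal vs. exact spectrum.
print(np.abs(off_diagonal).max())
print(np.sort(np.diag(H_final)), np.linalg.eigvalsh(H0))
```

Because the transformation is unitary, the spectrum is unchanged along the flow; only the basis changes, with elements connecting states of large energy difference decaying fastest.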

2.2 Standard Model And Electroweak Coupling 

The Gauge Principle, as a central ingredient of QFT, needed to be supplemented with fresh ideas
and paradigms, within its broad framework, to extend its tentacles further. One such idea was
based on the degenerate structure of the vacuum, dominated by vales and hills, which crystallized
eventually as a new theme termed “Spontaneous Symmetry Breaking” (SSB), together with its
companion “Dynamical Breaking of Chiral Symmetry” (DB$\chi$S), which would now enable gauge
fields to acquire mass in a subtle but self-consistent manner. Armed with this paradigm, the Gauge
Theory registered a signal success in the Weak interaction sector, culminating in the
Glashow-Salam-Weinberg (GSW) Model of Electro-weak Interactions, which offered a unified view of weak
and electromagnetic interactions in the form of an $SU(2) \otimes U(1)$ gauge theory. A more ambitious
form of unification of the three principal gauge fields as a straightforward extension of the GSW, so
as also to include the strong (GUT), did not unfortunately bear fruit, so that, for the time being,
the “Standard Model” (SM) has had to rest content with only a partial unification
$SU(3) \otimes SU(2) \otimes U(1)$ of these gauge fields. Nevertheless this episode brings out a truism about the unpredictability
of Nature, viz., its refusal to yield to a particular strategy for a second time, merely on the strength
of its success on a previous occasion.

In a highly instructive and self-contained Article, V Novikov [3] gives a panoramic view of
the conceptual and methodological framework of QFT (with the ingredients of gauge principle,
renormalization group, and spontaneous symmetry breaking) that have been employed in the
formulation of SM for elementary particle physics. He dwells in particular on the Higgs mechanism
for the generation of the fermion masses for several generations, and brings out the powers of “loop
corrections” in SM to predict accurate bounds on the masses of as yet undiscovered particles. This
is vividly illustrated by the “correct” mass of the t(op)-quark ahead of its experimental discovery,
stringent limits on the Higgs mass from the “Landau pole” structure of the running coupling
constant, and the windows to the “physics beyond SM” that such analyses provide.

2.2.1 Discrete Symmetries in SM 

An essential aspect of the Standard Model concerns the role of discrete symmetries P, C, T in
determining the structure of the electroweak coupling. This subject has had a long history since
the original Lee-Yang discovery of P-violation, going through successive phases of chiral symmetry
(Landau-Salam), CP invariance (Lee-Oehme-Yang), its subsequent violation (Cronin-Fitch), and
ipso facto (?) T-violation, a topic of intense experimental activity today. [This last is of course an
immediate consequence of TCP-invariance (Pauli-Lueders Theorem), which puts the existence of
antiparticles exactly on par with particles!] A brief state-of-the-art review of the subject by P K
Kabir [4] follows.

2.3 Dynamics Of Symmetry Breaking 

Just as “Symmetry dictates interactions” (C.N.Yang at the First Asia Pacific Conf, Singapore,
1983), the dynamical effects of its breaking (whether spontaneously or dynamically) during
out-of-equilibrium phase transitions is equally at the root of a whole range of phenomena from condensed
matter to particle physics, and so on, all the way to early universe cosmology. Indeed the dynamics
of non-equilibrium phase transitions and the ordering process that occurs until the system reaches
a broken symmetry equilibrium stage, have developed in tandem with controlled experimental 
techniques in many areas of condensed matter physics (binary fluids, ferromagnets, superfluids, 
liquid crystals), so as to provide a solid basis for describing the dynamics of phase ordering. In 
cosmology, measurements of Cosmic Microwave Background anisotropies, and the formation of 
large scale structures in the Universe, provide signatures for phase transitions during and after 
inflation. And at the accelerator energies (Brookhaven-RHIC or CERN-LHC), phase transitions 
predicted by QCD could occur out of equilibrium via pion condensates. 

In an instructive review on this subject, Boyanovsky and de Vega [5] describe the relevant 
aspects of the dynamics of symmetry breaking in many areas of physics (from condensed matter 
to cosmology) vis-a-vis possible experimental signatures. In condensed matter, they address the 
dynamics of phase ordering, emergence of condensates, and dynamical scaling. In QCD, the 
possibility of disoriented chiral pion condensates arising from out-of-equilibrium phase transitions 
is considered. And in the early Universe, the dynamics of phase ordering in phase transitions
is described, especially the emergence of condensates and scaling in Friedman-Robertson-Walker
cosmologies, within a QFT framework.

2.4 Confinement: Supercharged Nucleus 

With the failure of GUT theories to take care of the strong interaction sector SU (3) of the Standard 
Model, the central issue of Confinement, which has had a long history of approaches ranging from
the fundamental to effective types, still remains an unsolved problem. There is a vast literature on
the subject, from Lattice QCD to various analytical methods for non-perturbative QCD. Of these,
two novel approaches to Confinement, which are fairly self-contained and stand out from the more
conventional ones, are included in Part A, leaving the rest for Part F. The first concerns an analogy
to a super-charged nucleus, based on an old work of Pomeranchuk and Smorodinsky (1940), which
offers the possibility of binding a particle in a small region of space. This method was elaborated
in a set of three "Orsay Lectures" by the late Vladimir Gribov [6] during 1992-94. The basic
idea is that if the charge $Z$ in a nucleus $N_Z$ is larger than a critical value $Z_c \approx 180$, then this
nucleus will decay to an atom of charge $Z - 1$ and a positron: $N_Z \to A_{Z-1} + e^+$. If the product
nucleus is unstable, the process gets repeated until the total charge of the final product is so small
that further decay is impossible. Such a supercharged nucleus (a 'resonance') cannot exist freely,
but only inside an atom, and hence is reminiscent of a 'confined' state! The region of stability of such
a 'superbound' atomic state (mainly due to the Pauli principle) works out as $r_0 \ll r \lesssim 1/m$,
where $r_0$ is the radius of the nucleus, and $m$ the electron mass. In these three lectures, which
are reproduced in this Book through the courtesy of his long term Associates Dokhshitzer, Ewerz,
and Nyiri, Gribov [6] gives a leisurely exposition of the detailed working of this mechanism on
the confinement of heavy, followed by light, quarks. These ideas have since been extended by
the Dokhshitzer Group in their subsequent publications hep-ph/9807224 and hep-ph/9902279, but
these are outside the scope of this Book.

2.5 Confinement: BRST Mechanism 

The second approach concerns a perspective on confinement due to Nishijima, who relates its
mechanism to that of an unbroken non-abelian gauge symmetry in QCD. The logic of this method,
which was mostly pioneered by Nishijima, may be illustrated for the case of abelian QED as follows.
Quantization of the e.m. field requires "gauge-fixing", say by a covariant (Fermi) gauge. This
in turn requires introduction of the indefinite (Gupta-Bleuler) metric which, for the selection of
physically observable states, must be eliminated by imposing the Lorentz condition on the state
vector. There are now 4 kinds of photons (2 transverse, 1 longitudinal, and 1 scalar), of which
the 'scalar' photons must have negative norms, so as to ensure manifest covariance of the
quantization in the Minkowski space.

Now to project out the physical subspace, one introduces a subsidiary (Lorentz) condition (a
4-divergence of a vector field) which represents a free, massless field even under interactions. The
photons involved in this operator (called a-photons) are special combinations of longitudinal and
scalar photons with zero norm. A second (orthogonal) combination (called b-photons) also can be
arranged to have zero norm. However the inner product of a- and b-photons is non-zero; they
are 'metric partners' (somewhat akin to the 4-vectors $n_\mu$, $\tilde{n}_\mu$ defining a covariant null-plane:
$n^2 = \tilde{n}^2 = 0$; $n \cdot \tilde{n} = 1$). A physical state is defined as one that is annihilated by applying the
positive frequency part of the Lorentz condition. And since the S-matrix in QED commutes with
this 4-divergence, it transforms physical states into one another, without letting them out of this
subspace which now includes only t (transverse) and a-photons, but not b-photons. However the
inner product of a physical state with one a-photon, with another physical state (with or without
an a-photon), vanishes identically. Thus a-photons give no contribution to observable quantities,
and both a- and b-photons escape detection! This is called confinement of longitudinal and scalar
photons in QED, a kinematical phenomenon!
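
The zero-norm bookkeeping described above can be checked with a small numerical sketch. This is an illustration under stated assumptions, not the construction used in the Article: the longitudinal and scalar one-photon modes are represented as a two-component vector with indefinite metric diag(+1, -1), and the names `ETA`, `inner`, `a_photon`, `b_photon`, as well as the specific combinations (L ± S)/√2, are hypothetical choices made here.

```python
import math

# Indefinite (Gupta-Bleuler type) metric on the two unphysical one-photon modes,
# in the basis (longitudinal, scalar): longitudinal norm +1, scalar norm -1.
# This 2x2 toy model is an assumption made purely for illustration.
ETA = [[1.0, 0.0],
       [0.0, -1.0]]

def inner(u, v):
    """Indefinite inner product <u|v> = sum_ij conj(u_i) * ETA_ij * v_j."""
    return sum(complex(u[i]).conjugate() * ETA[i][j] * v[j]
               for i in range(2) for j in range(2))

s = 1.0 / math.sqrt(2.0)
a_photon = [s, s]    # (L + S)/sqrt(2): hypothetical choice for the "a" combination
b_photon = [s, -s]   # (L - S)/sqrt(2): hypothetical choice for its partner "b"

print("<a|a> =", inner(a_photon, a_photon))
print("<b|b> =", inner(b_photon, b_photon))
print("<a|b> =", inner(a_photon, b_photon))
```

In this toy model both combinations have vanishing norm while their mutual inner product does not vanish, which is the sense in which a- and b-photons are 'metric partners'.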

In QCD, on the other hand, not only a- and b-gluons, but also the t-gluons are unobservable,
giving a dynamical orientation to the confinement mechanism. While the basic logic and signature
of confinement for non-abelian QCD remain the same as above for abelian QED, some extra
ingredients of a highly technical nature are needed to bridge the gap. For not only do the observable
quantities now depend on the gauge parameter, but the 4-divergence of the gauge field is no longer
a free field! To eliminate the gauge-dependence of physical entities, Faddeev-Popov proposed to
average the path integral over the manifold of gauge transformations, resulting in a new term in the
Lagrangian (Faddeev-Popov ghost), involving a pair of anticommuting scalar fields whose violation
of the Pauli theorem on spin-statistics connection requires introduction of the indefinite metric,
as in QED. However, the operator analog of the Lorentz condition is more tricky in this case. It
is facilitated by a novel symmetry found by Becchi-Rouet-Stora (BRS) which was originally used 
for renormalizing QCD. Nishijima successfully exploited this symmetry to construct the requisite 
operator, and obtained a formal proof of confinement in the QCD case, as an extension of the logic 
employed for QED. A qualitative sketch of this proof appears in the Article by K Nishijima and
M. Chaichian [7].

3 Field Theory: Topological Aspects 

An important sector of QFT that has come to occupy increasing importance in the last two decades, 
concerns its Topological aspects, as a powerful tool to probe the geometry and topology in low 
dimensions. This illustrates rather vividly the coming together of physicists and mathematicians, 
this time in building powerful links between quantum theory (through its path integral formulation) 
on the one hand, and the geometry and topology of low dimensional manifolds on the other. Indeed 
it appears that the properties of low dimensional manifolds can be nicely unravelled by relating 
them to infinite dimensional field manifolds, thus providing a powerful tool for studying these 
manifolds. 

A unique characteristic of topological field theories is their independence of the metric of curved 
manifolds on which they are defined. This makes the expectation value of the energy-momentum 
tensor vanish. Since the only degrees of freedom are topological, there are no local propagating 
degrees of freedom. The operators are also metric independent. These features are addressed in 
some detail in a self-contained introductory Article by Romesh Kaul [8] on topological QFT 
regarded as a meeting ground for physicists and mathematicians. 

3.1 CS Theory And Jones Polynomials 

Quantum YM theories in (2+1)D provide a field theoretic framework for the study of "knots and
links" in a given 3-manifold, and illustrate the interplay of QFT and the topology of low dimensional
manifolds. A striking result of this connection is that the famous "Jones Polynomials" of knot
theory can be understood in 3D terms. This result was formally demonstrated by Edward Witten
about a decade ago in a paper entitled "Quantum Field Theory And The Jones Polynomial", thus
fulfilling a long-cherished goal of an exact (non-perturbative) solution of a gauge field theory, for
the first time in 3 dimensions. Witten showed that the "Jones polynomial can be generalized from
$S^3$ to arbitrary 3-manifolds, giving invariants that are computable from a surgery presentation".
Witten further showed that these results shed new light on 2D conformal field theory. In view of
the historical importance of this pioneering work in the context of the theme of this Book, we reproduce
(with permission from Springer-Verlag) the celebrated Witten paper [9] (which had appeared in
Commun. Math. Phys. 121 (1989) 351-399), in full.

3.2 Anomalies In QFT 

An interesting pathology of QFT which has rich topological overtones is the problem of anomalies,
which originated in the famous ABJ (1969) paper to resolve the problem of $\pi^0 \to \gamma\gamma$ decay,
whose hitherto standard explanation in terms of partial conservation of axial current (PCAC)
used to fall far short of experiment. The ABJ paper finally resolved the issue by introducing
an "anomalous" amplitude proportional to $F_{\mu\nu}\tilde{F}^{\mu\nu}$ in the PCAC relation, whose interpretation
brought into focus the pathology of the breaking of a classical-level symmetry through such
"anomalies" at the QFT level. Such 'violation' of gauge symmetry through 'anomalies' points
to the need for their cancellation, which in turn constitutes an important constraint for physical
gauge theories with chiral coupling to fermions. In this respect, "global chiral anomalies" play a
key role in the understanding of physical effects associated with topologically non-trivial gauge-field
configurations, via the celebrated Atiyah-Singer Theorem. This subject is briefly reviewed by
Haridas Banerjee [10] in this Book.
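
For orientation, the anomalous PCAC relation mentioned above is commonly written in the following form for a single charged fermion of mass $m$ coupled to the e.m. field. This is a sketch in one common convention; the overall sign and normalization depend on the conventions chosen for $\gamma_5$ and $\epsilon^{\mu\nu\rho\sigma}$, and the Article may well use a different one.

```latex
\partial_\mu j_5^{\mu}
  \;=\; 2 i m\, \bar\psi \gamma_5 \psi
  \;+\; \frac{\alpha}{2\pi}\, F_{\mu\nu} \tilde{F}^{\mu\nu},
\qquad
\tilde{F}^{\mu\nu} \equiv \tfrac{1}{2}\, \epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma},
\qquad
\alpha = \frac{e^2}{4\pi}.
```

The first term on the right is the classical (PCAC-type) contribution, which vanishes in the massless limit; the second survives even at $m = 0$ and is the term that drives the two-photon decay of the neutral pion.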





3.3 Coherent States In QFT 

Still another sector of QFT with topological (geometric) features is the subject of Coherent
States, which has grown rapidly since its birth 36 years ago at the hands of Glauber and Sudarshan
[R.J.Glauber, Phys.Rev.130, 2529 (1963); E.C.G.Sudarshan, Phys.Rev.Lett.10, 277
(1963)], although the basic idea dates back to the founder of Quantum Mechanics himself [Erwin
Schroedinger: Naturwissenschaften, 14, 644 (1926)] in connection with the quantum states of a harmonic
oscillator, i.e., almost immediately after the birth of quantum mechanics. Coherent States
have 3 main properties: coherence, overcompleteness and intrinsic geometrization, all of which
play a fundamental role in QFT. Their applications include the calculation of physical processes involving
an infinite number of virtual particles; the derivation of functional integrals and various effective field
theories; and last but not least, the exploration of the origins of topologically non-trivial gauge fields
and the associated (gauge) degrees of freedom. All these topics are addressed systematically in a
perspective, self-contained review by Wei-Min Zhang [11].
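
To make the first two properties concrete, the standard harmonic-oscillator (Glauber) coherent states may be recalled as a reminder; the notation ($a$, $|n\rangle$, $\alpha$, $\beta$) is chosen here and need not coincide with that of the review.

```latex
|\alpha\rangle = e^{-|\alpha|^2/2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, |n\rangle ,
\qquad
a\,|\alpha\rangle = \alpha\,|\alpha\rangle ,
\\[4pt]
\langle \beta | \alpha \rangle
  = \exp\!\left( -\tfrac{1}{2}|\alpha|^2 - \tfrac{1}{2}|\beta|^2 + \beta^{*}\alpha \right) \neq 0 ,
\qquad
\int \frac{d^2\alpha}{\pi}\, |\alpha\rangle\langle\alpha| = 1 .
```

The non-vanishing overlap together with the resolution of unity expresses overcompleteness: the states span the Hilbert space but are not mutually orthogonal. Inserting this resolution of unity repeatedly into an evolution operator is the usual route to a coherent-state functional integral.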

3.4 Pancharatnam-Bargmann-Berry Phase 

An outstanding example of a topological aspect in quantum mechanics (which may be termed 'field
theory with a finite number of degrees of freedom') is provided by the existence of a "geometric
phase" in quantum theory, which had remained obscured from public view until rather recently when
M.Berry (1984) drew attention to it under the term "quantum adiabatic anholonomy". Historically,
however, the existence of this pathology in physics had first been noted by S.Pancharatnam (1956)
in the regime of classical polarization optics, but this important work had somehow gone by
default. A similar fate befell a second attempt by V.Bargmann (1964) to resurrect this idea in the
context of Wigner's theorem on the representation of symmetry operators in quantum mechanics.
It was only after the work of Berry that its full implications were appreciated within the physics
community, but its connection with the Pancharatnam and Bargmann phases was left unattended.
In an instructive Article, N Mukunda [12] describes these developments in a proper perspective
by emphasizing the mutual connections among these ideas. He also describes the subsequent
developments to date, by relating these phases to the presence of a complex vector space and the
effect of group action among them. He then goes on to show that the geometric phase is the
simplest invariant expression under certain groups of transformation acting on curves in Hilbert
space.
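
A minimal numerical sketch of the invariant behind these phases follows. It assumes the three-state Bargmann invariant, the product of overlaps ⟨ψ1|ψ2⟩⟨ψ2|ψ3⟩⟨ψ3|ψ1⟩, evaluated for three spin-1/2 states pointing along the z, x and y axes. The function names `braket` and `bargmann_invariant` are hypothetical, and whether the geometric phase is defined as the argument of this product or as its negative is a matter of convention.

```python
import cmath
import math

def braket(u, v):
    """Ordinary Hilbert-space inner product <u|v> for two-component states."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def bargmann_invariant(psi1, psi2, psi3):
    """Three-vertex invariant <psi1|psi2><psi2|psi3><psi3|psi1>."""
    return braket(psi1, psi2) * braket(psi2, psi3) * braket(psi3, psi1)

s = 1.0 / math.sqrt(2.0)
z_up = [1.0, 0.0]       # spin up along z
x_up = [s, s]           # spin up along x
y_up = [s, 1j * s]      # spin up along y

delta3 = bargmann_invariant(z_up, x_up, y_up)
phase = cmath.phase(delta3)

# The three directions span one octant of the sphere (solid angle pi/2);
# the magnitude of the phase is compared with half of that solid angle.
print("arg of invariant :", phase)
print("half solid angle :", math.pi / 4)

# Multiplying any one state by an arbitrary phase factor leaves the invariant unchanged.
x_rephased = [cmath.exp(0.7j) * c for c in x_up]
print("change on rephasing:", abs(bargmann_invariant(z_up, x_rephased, y_up) - delta3))
```

The insensitivity to the arbitrary phase of each state is what makes this quantity a property of the rays rather than of the vectors, and hence geometric.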


3.5 Skyrmion Model for Confinement 

A confinement mechanism with topological overtones is offered by the large $N_c$ limit of QCD, which
has played a crucial role in unifying its premises with a solitonic, hadron-based approach that is
known as the Skyrme model, which was discovered by Skyrme (1961), just before quarks (1964)
were born. Skyrme's novelty was to provide a model in which the fundamental fields consisted
only of pions, wherein the nucleon was obtained as a certain classical configuration of pion fields.
The apparent contradiction of making Fermi fields out of Bose fields was resolved by demanding a
non-zero "winding number" for this (classical) field configuration, thus giving the "Skyrmion" the
status of a topological soliton, which is a solution of a classical field equation with localized energy
density.
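
For concreteness, the "winding number" referred to above is usually written as follows for an $SU(2)$-valued pion field $U(\mathbf{x})$. This is the textbook form, quoted as a reminder; the overall sign and the profile function $F(r)$ are conventions assumed here and are not taken from the Article.

```latex
B \;=\; \frac{1}{24\pi^2} \int d^3x \;\epsilon^{ijk}\,
  \mathrm{Tr}\!\left[ (U^{\dagger}\partial_i U)\,(U^{\dagger}\partial_j U)\,(U^{\dagger}\partial_k U) \right],
\qquad
U(\mathbf{x}) = \exp\!\big( i\, \boldsymbol{\tau}\cdot\hat{\mathbf{r}}\; F(r) \big).
```

With the 'hedgehog' form of $U$ and boundary values $F(0) = \pi$, $F(\infty) = 0$, the integral gives $|B| = 1$. Since $B$ is an integer that cannot change under smooth deformations of the field, it can be identified with the baryon number.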

On the face of it, the Skyrme scenario looked so different from the conventional picture of
nucleons as a 'white' composite of 3 'colored' quarks bound together by their interactions with
$SU(3)$ gauge fields, that a reconciliation between the two pictures appeared rather remote. It
turned out however that the Skyrme model could be a plausible approximation to the orthodox
QCD picture, one in which a key role is played by the large $N_c$ limit of the latter. The logic goes
roughly as follows.

Despite the increasing strength of QCD at low energies, it is plausible that the pseudoscalar
mesons, as $q\bar{q}$ composites, could still interact relatively weakly with each other, thus permitting the
formulation of some effective Lagrangian for the pions, subject of course to the correct symmetries
of the underlying gauge theory, which includes a (spontaneously broken) chiral $SU(N_f) \otimes SU(N_f)$
flavour ($N_f$) symmetry that allows 'massless' pseudoscalars to co-exist with massive scalars. An
effective Lagrangian on these lines may be obtained from "a non-linear realization of chiral
symmetry", without the explicit appearance of scalars, a structure which has an uncanny resemblance
to the very Lagrangian obtained by Skyrme (1961). 

How about the baryons in this QCD-motivated "chiral perturbation theory" picture? It is here
that 't Hooft's (1974) large $N_c$ limit comes into play, with the proportionality to $N_c$ for the baryon
mass being the signal that the baryon state under study is a soliton of the effective meson theory 
initiated by Skyrme. In a perspective review of the Skyrme model approach, Joseph Schechter 
and Herbert Weigel [13] trace its connection with QCD in the large Nc limit, and discuss the 
properties of light baryons treated as solitons, within the framework of an effective Lagrangian of 
QCD containing only meson degrees of freedom. 


4 Formal Methods In QFT: Selected Topics 

The universal language of QFT and its powerful techniques broke fresh ground through the estab- 
lishment of the equivalence of its tenets with those of Statistical Mechanics which had traditionally 
been developed on entirely ‘classical’ lines. In the words of A.M.Tsvelik (QFT in CMP, Camb.Univ 
Press 1995), this equivalence may be succinctly expressed by the following statement: "QFT of
a D-dimensional system can be formulated as a statistical mechanics of a (D + 1)-dimensional
system. This equivalence .... allows one to get rid of non-commuting operators and to forget
about time ordering, which seem to be the characteristic properties of quantum mechanics....".
The Path Integral formulation of QFT which is the key element in dispensing with the problem of 
non-commuting operators in QFT, has had a crucial role in bringing about this vital correspon- 
dence of QFT with the partition function in quantum statistical mechanics (QSM). Armed with 
the powerful techniques of Renormalization Group Theory (RGT), this new approach has opened 
up a whole vista of applications to new emerging areas like critical phenomena in condensed matter 
physics. 
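
The correspondence described above can be summarized in one line for a scalar field $\phi$ in $D$ space dimensions at inverse temperature $\beta$. This is a schematic reminder; the symbols $H$, $\mathcal{L}_E$ and $S_E$ are introduced here for illustration.

```latex
Z \;=\; \mathrm{Tr}\, e^{-\beta H}
  \;=\; \int_{\phi(\mathbf{x},0)=\phi(\mathbf{x},\beta)} \mathcal{D}\phi \;\; e^{-S_E[\phi]},
\qquad
S_E[\phi] = \int_0^{\beta} d\tau \int d^{D}x \;\, \mathcal{L}_E(\phi, \partial\phi),
\qquad t \to -i\tau .
```

The right-hand side involves only commuting c-number fields, integrated with a Boltzmann-like weight over a $(D+1)$-dimensional Euclidean region whose extra dimension has extent $\beta$; the limit $\beta \to \infty$ corresponds to zero temperature.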


4.1 Unified View of QFT and QSM 

An important outcome of a unified view of QFT and Quantum Statistical Mechanics has been 
the emergence of two new areas: Euclidean Field Theory, and Finite Temperature Field Theory. 
Actually the origins of the former date back to the Fifties at the hands of Wick ("Wick rotation"
for the Bethe-Salpeter equation) and Schwinger (as a possible direction for the evolution of QFT),
wherein the transition from Minkowski to Euclidean space (via analytic continuation from real to
imaginary "time") was perceived as a means of curing many ills in QFT, such as positivity and
finiteness of norms in the computation of physical quantities. In more recent times, the Euclidean
formulation of QFT has led to an interesting relationship between "stochastic mechanics" (Nelson)
and the Feynman-Kac formulae for Green’s functions expressed as path integrals. In a crisp Article 
in this Book, R.Ramanathan [14] provides a formulation of QFT in Euclidean space-time, to bring 
out the basic ideas of the Euclidean formulation, as well as the above relationship between the 
Nelson and Feynman-Kac formulations. 

Finite Temperature Field Theory on the other hand, (in contrast to zero temperature for Eu- 
clidean QFT), provides access to a much wider class of complicated quantum mechanical systems, 
and addresses questions like thermal averages in QFT, symmetry restoration in theories with spon- 
taneous symmetry breaking, and indeed the evolution of the universe at early times (from the high 
temperature phase). More recently, chiral symmetry-breaking phase transitions, especially the
"confinement-deconfinement" phase transitions in QCD leading to quark-gluon plasmas (QGP),
have acquired great interest in view of planned experiments on heavy ion collisions to detect
QGP. A few selected topics in Finite Temperature Field Theory are treated in an informative
Article by Ashoke Das [15] in this Book. 





4.2 Integrable Systems: Toda FT 

Although most approaches to QFT have been traditionally associated with linear partial differential
equations (e.g., Schroedinger, Klein-Gordon, Dirac, Proca), non-linear equations (i.e., equations
where the potential term is non-linear in the field $\phi$) have also been known for some time. Among
the earliest non-linear wave equations known in physics are the Liouville and Sine-Gordon
equations. The Liouville equation in 2D arose in the context of a search for a manifold with
constant curvature, something like covering the surface with a fishing net, whose arc length is
constant (knots do not move!), while the 'threads' in the net correspond to a local coordinate system
on the surface. The "field" $\phi$ in the Liouville equation is the phase space density $\rho$ satisfying the
equation $\partial_x \partial_y \rho = \exp\rho$, where $x, y$ are the local orthogonal coordinates. The Sine-Gordon (SG)
equation has a similar structure, with exp replaced by sin on the RHS. Variants of these equations,
e.g., adding a 'mass' term $m^2\phi$ on the LHS, and/or the hyperbolic replacement of sin by sinh, etc.,
give rise to several more varieties of similar types. A third type of non-linear equation which has
received much attention is the so-called KdV equation $u_t - 6uu_x + u_{xxx} = 0$, with interesting
properties like an infinite number of conservation laws. The corresponding conserved quantities
can be used as Hamiltonians for an integrable system (KdV hierarchy). A striking feature of such
non-linear equations is an infinite number of conserved quantities, which imply that the solutions
of these systems must be infinitely restricted. This results in such solutions being quite stable
structures (solitons) which retain their shapes even after collisions.
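
As a small numerical illustration of the soliton statement above, the following sketch checks by finite differences that the standard one-soliton profile $u = -(c/2)\,\mathrm{sech}^2[\tfrac{\sqrt{c}}{2}(x - ct)]$ satisfies $u_t - 6uu_x + u_{xxx} = 0$. The function names `soliton` and `kdv_residual`, the speed parameter `c`, the step size `h` and the sample points are assumptions made here for illustration.

```python
import math

def soliton(x, t, c=1.0):
    """One-soliton profile u = -(c/2) sech^2( sqrt(c)/2 * (x - c*t) )."""
    xi = 0.5 * math.sqrt(c) * (x - c * t)
    return -0.5 * c / math.cosh(xi) ** 2

def kdv_residual(x, t, c=1.0, h=1e-3):
    """Finite-difference estimate of u_t - 6*u*u_x + u_xxx at the point (x, t)."""
    def u(xx, tt):
        return soliton(xx, tt, c)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    # Central difference for the third derivative (second-order accurate).
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t - 6 * u(x, t) * u_x + u_xxx

for point in [(0.3, 0.1), (-1.2, 0.5), (2.0, 1.0)]:
    print(point, kdv_residual(*point))
```

The residual is zero only up to discretization and round-off error. The profile depends on $x$ and $t$ solely through $x - ct$, so it translates rigidly with speed $c$, and both its depth and its inverse width grow with $c$.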

An interesting class of coupled non-linear equations was introduced by M.Toda (1967) to describe
a 1D crystal with non-linear coupling between nearest neighbour atoms. These (lattice)
models also admit soliton solutions which reduce to the KdV equation in the continuum limit. At
the 'field' level, such models (with exponential 'potentials') simulate a general class of non-linear
equations called Toda Field Theory, which include the Liouville and Sine-Gordon equations as
special cases. For the solution of these equations, a general method of "inverse scattering" was
proposed by Gelfand-Levitan. The logic of this method is to convert, via a suitable transformation,
the original non-linear equation to an equivalent linear equation, and study the evolution of the
latter, more or less according to standard methods already developed for them (including
group-theoretic, Lie-algebraic, etc. methods). The inverse scattering method paved the way to connections
with other known models of QFT, such as conformally invariant FT and the Hamiltonian reduction
of the Wess-Zumino-Witten model. Similarly the KdV equation is related to the 4D Yang-Mills
theories, thus providing a connection of the latter with 2D integrable models. In an instructive,
self-contained article on this subject, Bani Sodermark [16] gives a perspective view of integrable
systems with special reference to the Toda Lattice hierarchy, and reveals the connections of such
non-linear field theories with other sectors of QFT.

4.3 Light-Front Dynamics 

Dirac laid the foundations of QFT, not only through his famous Equation, but with at least 2
more seminal contributions within a year's gap from each other: a) light-front (LF) quantization
[Rev.Mod.Phys.21, 392 (1949)]; b) constrained dynamics [Can.J.Math.2, 129 (1950)]. In the former,
he suggested that a relativistically invariant Hamiltonian theory can be based on different
classes of initial surfaces: instant form ($x_0 = \mathrm{const}$); light-front (LF) form ($x_0 + x_3 = 0$);
hyperboloid form ($x^2 = \mathrm{const}$). The structure of the theory is strongly dependent on these 3 surface
forms. In particular, the "LF form" remains invariant under 7 generators of the Poincare group,
while the other two are invariant only under 6 of them. Thus the LF form has the maximum
number (7) of "kinematical" generators (their representations are independent of the dynamics of
the system), leaving only 3 "hamiltonians" for the dynamics.

Dirac's LF dynamics got a boost after Weinberg's discovery of the $P_z = \infty$ frame which
greatly simplified the structure of current algebra. The Bjorken scaling in deep inelastic scattering,
supported by Feynman's parton picture, brought out the equivalence of LF dynamics with the
$P_z = \infty$ frame. The LF language was developed systematically within the QFT framework by
Kogut-Soper (1970), Leutwyler-Stern (1978), Srivastava (1998) and others. The time ordering in
LF-QFT is in the variable $\tau = x_0 + x_3$, instead of $t = x_0$ in the instant form. And despite certain
technicalities, the LF dynamics often turns out to be simpler and more transparent than the instant
form, without giving up on the net physical content. This is borne out by comparative studies:
of spontaneous symmetry breaking on the LF; of degenerate vacuum in certain (1+1)D QFT
which are exactly soluble and renormalizable (e.g., the Schwinger model and its chiral version); of
chiral boson theories; and of QCD in covariant gauges. Indeed, the LF quantization of QCD in
the Hamiltonian form bids fair to be a viable alternative to the lattice gauge theory for calculating
non-perturbative quantities. Removal of constraints by the Dirac method gives fewer independent
dynamical variables in the LF formalism than in the instant form; for this reason, LF variables have
found applications even in String and M-theories. In an instructive self-contained article (with a
rich collection of references), Prem Srivastava [17] gives a detailed review of most of these topics
in a leisurely and systematic manner, and leads the interested reader all the way to the frontier
with several new results.

4.3.1 2D Field Theory 

2D models in QFT have also been of great interest in the contemporary literature. Such theories 
reveal some remarkable features, such as fermion-boson equivalence, which facilitates the solution 
of fermion-FT in terms of its bosonized version. This concept of bosonization in turn has been 
useful in the understanding of 4D phenomena that can be described by an effective 2D FT, such 
as the demonstration of quark confinement in exactly soluble 2D models [Casher-Kogut-Susskind 
(1973)]. Another important discovery in 2D FT concerns an "anomaly-generated" mass [Jackiw-
Rajaraman (1985)] for the gauge boson in the Chiral Schwinger model. (This mechanism may be
contrasted with the standard Higgs mechanism for generating the vector boson mass via spontaneous
symmetry breaking). The "anomaly" here stands for the loss of the conservation property due to
quantum corrections involved in the quantization of the gauge theory. This disease in turn needs 
Dirac’s second weapon for cure: Constrained dynamics. In a short perspective article in this Book, 
Dayashankar Kulshreshtha [18] reviews the constrained dynamics and local gauge invariance 
of several 2D FT models, in both Instant and LF forms, and in so doing, brings out the detailed 
working of the BRST formalism as applied to such 2D models. 

4.4 Constrained Dynamics 

To recall the essential elements of a constrained dynamical system, which includes most systems 
of physical interest (e.g., QED, QCD, Electroweak and Gravity theories), it is characterized by 
an over-determined set of coordinates. These are best kept track of within a Hamiltonian
formulation, which has a natural place for all the coordinates (canonical and redundant), so that 
the complete set of constraints emerges easily. The nature of these constraints in turn is determined 
by the structure of the matrix of Poisson brackets (PB) of the constraints of the theory, which 
also carries the signature of whether or not the underlying theory is gauge invariant (GI). Thus if 
this PB matrix is singular, then the set of constraints is first class, and the theory is GI. On the
other hand, if this matrix is non-singular, then the set of constraints of the theory is second class,
and the theory is non-GI. (Indeed this is often taken as a criterion for distinguishing a GI from a
non-GI system [18]). These GI systems are then quantized under some appropriate gauge choices,
or "gauge fixing" (GF). Now in the usual Hamiltonian formulations of a GI theory under some
GF’s, one necessarily destroys the gauge invariance, since the GF corresponds to converting the 
first class constraints to second class constraints. To quantize a GI theory by maintaining gauge 
invariance despite GF, one needs the more general BRST (1974) formulation, wherein the theory is 
rewritten as a quantum system with generalized GI, called BRST invariance. This in turn requires 
enlarging the Hilbert space, and replacing the gauge transformation by a BRST transformation 
which involves the introduction of (anti-commuting) Faddeev-Popov ghost fields. This amounts
to embedding the GI system into a BRST invariant system (but isomorphic to the former), whose 
unitarity is guaranteed by the conservation and nilpotency of the BRST charge. 
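
The classification criterion stated above (a singular matrix of Poisson brackets signals first class constraints, a non-singular one signals second class constraints) can be illustrated with a toy sketch. It assumes a pair of constraints that are linear in the phase-space coordinates z = (q1, q2, p1, p2), so that each constraint is just a coefficient vector. All names (`poisson_bracket`, `pb_matrix`, `classify`, `Q1`, `P1`, `P2`) are hypothetical, and the example is not taken from the models reviewed in the Book.

```python
def poisson_bracket(a, b):
    """PB {a.z, b.z} of two linear functions of z = (q1, q2, p1, p2).

    For linear functions the bracket is the constant
    sum_i (da/dq_i * db/dp_i - da/dp_i * db/dq_i).
    """
    n = len(a) // 2
    return sum(a[i] * b[n + i] - a[n + i] * b[i] for i in range(n))

def pb_matrix(constraints):
    """Matrix of Poisson brackets among the given constraints."""
    return [[poisson_bracket(f, g) for g in constraints] for f in constraints]

def classify(constraints):
    """Classify a PAIR of constraints by the determinant of their 2x2 PB matrix."""
    m = pb_matrix(constraints)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return "second class" if det != 0 else "first class"

Q1 = [1, 0, 0, 0]   # constraint q1 = 0
P1 = [0, 0, 1, 0]   # constraint p1 = 0
P2 = [0, 0, 0, 1]   # constraint p2 = 0

print(pb_matrix([P1, P2]), classify([P1, P2]))   # momenta commute with each other
print(pb_matrix([Q1, P1]), classify([Q1, P1]))   # a conjugate pair does not
```

For more than two constraints the rank of this matrix, rather than its determinant alone, counts how many of the constraints are second class; the two-constraint case is used here only because it makes the singular versus non-singular distinction unambiguous.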

Thus the Dirac [Can.J.Math.2, 129 (1950)]-Bergmann [Phys.Rev.83, 1018 (1951)] theory of
Constraints lies at the root of the (Hamiltonian) description of interactions in QFT based on Action
principles which, due to the requirements of Lorentz, local gauge (and/or diffeomorphism)
invariances, must employ singular Lagrangians. This is generally adequate for the study of simple
gauge theories (controlled by some Lie groups acting on some internal space in Minkowski
space-time), via the covariant approach based on BRST symmetry which, at least for infinitesimal gauge
transformations, allows a regularization and renormalization of the relevant theories within the
local QFT framework. On the other hand, the gauge freedom of theories that are invariant under
diffeomorphism groups of the underlying space-time (e.g., in general relativity or string theory)
is encumbered by the arbitrariness for the observer in the "definitory properties" of space-time
and/or the measuring apparatus [see L.Lusanna, this Book]. Such ambiguities affect bigger issues
like: the understanding of finite gauge transformations; the Gribov ambiguity in the choice of
function space for the fields; proper definition of relativistic bound states vis-a-vis quark
confinement; and last but not least, the conceptual and practical problems posed by gravity. These require a
fresh look at the foundations of QFT to know if we: i) understand the physical degrees of freedom
hidden behind gauge and/or general covariance; ii) can meaningfully reformulate the physics
(both classical and quantum) in terms of them. Logically this would amount to abandoning local
QFT for non-perturbative interactions, and a reformulation of relativistic theories to allow natural
coupling to Gravity. These and allied issues are addressed in a state-of-the-art review by Luca
Lusanna [19], aimed at a unified reformulation of the 4 basic interactions in terms of Dirac-Bergmann
observables, with emphasis on the open problems: mathematical, physical and interpretational.


5 Extension Of QFT Frontiers 

A long term ambition of QFT has been the dream of unification of all the gauge fields with the
Gravitation Field, whose quantization has all along posed a big challenge in its own right. [A major
difficulty in the way of unification of this sector with the other three, as was once succinctly put by
Abdus Salam, lies in the "spin mismatch" of their respective fields (vector vs tensor), which would
militate against a common strategy]. Nevertheless such a unification was to come about from an
entirely new paradigm which envisaged extension of the original tenets of Field Theory based on
a point particle description to one with Strings. In this Section we offer a panoramic view of
some major theoretical developments from seemingly unrelated angles, which, apart from their
impact on Physics in their own right, have provided some key ingredients converging towards the
emergence of modern String Theory. These developments, which may be termed Supersymmetry
(SUSY), Conformal Field Theory (CFT), and Duality, are outlined next.

5.1 SUSY In Field Theory 

In its march towards Unification, Field Theory has continued to break new ground in several
directions. An important step in Unification was marked by the discovery of Supersymmetry (SUSY),
introduced in the early seventies by a galaxy of authors in the context of 2D QFT (Gervais-Sakita)
as well as in 4D QFT (Golfand, Likhtman, Akulov, Volkov, Wess and Zumino), for a unified
understanding of the two known forms of matter, bosons (integral spins) and fermions (half integral
spins), hitherto regarded as two distinct field types, with commuting and anticommuting
properties respectively. The new symmetry between bosons and fermions may be incorporated within
the definition of a single "Superfield", with transformations inter-relating the two constituents, so
that SUSY becomes a part of space-time symmetry implied by relativistic invariance. The Gauge
principle too admits of a corresponding extension to unify both these sectors.

What are the motivations for such a lavish extension of space-time symmetry ? Apart from 
its aesthetic appeal, there are some theoretical considerations of a more concrete nature which 
are dwelt on in this Book through two complementary reviews of SUSY in Field Theory, (with 
special reference to Particle Physics), by two leading experts in the field: Rabi Mohapatra [20] 
and Norisuke Sakai [21] respectively. According to Sakai [21], the most important motivation 
for SUSY is the Gauge hierarchy problem showing up via the vastly different mass scales of the
electroweak ($M_W$) vs the "GUT-theoretic" ($M_G$): $M_W^2/M_G^2 \sim 10^{-28}$. A similar gap exists between
the "GUT" vs Planck (gravity) mass scales: $M_G^2/M_{Pl}^2 \sim 10^{-6}$.

To account for this phenomenon, it is necessary to invoke a suitable Symmetry reason which
may be precisely formulated by the so-called "naturalness" hypothesis ('t Hooft 1979) which demands
that a system acquires a higher symmetry as a certain (small) parameter goes to zero,
e.g., chiral symmetry occurs when a (small) fermion mass goes to zero; or a local gauge symmetry
corresponds to the vanishing of a vector boson mass. Now the mass scale $M_W$ of weak bosons
arises from the vacuum expectation value $\langle\phi\rangle_0 = v \neq 0$, related to the mass $M_H$ of the Higgs
scalar field $\phi$. So to regard the gauge hierarchy problem as the result of some symmetry breaking,
we must give a Symmetry reason to make the Higgs scalar mass vanishingly small. Classically a
vanishing scalar mass corresponds to a symmetry called scale invariance, which however cannot
be maintained quantum mechanically. In a perspective review on this subject, Norisuke Sakai
[21] argues for "Supersymmetry" between the Higgs scalar and a spinor partner as a good option:
Chiral symmetry gives zero mass to the latter, while SUSY makes the former massless (through
a cancellation of the respective contributions to the self-energy loops).
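The loop cancellation mentioned above can be made explicit at one loop. The following is a schematic sketch in a commonly quoted textbook form, not taken from the reviews cited here; the cutoff $\Lambda_{UV}$, the Yukawa coupling $\lambda_f$ and the quartic scalar coupling $\lambda_S$ are notation introduced purely for illustration.

```latex
% Quadratically divergent one-loop corrections to the Higgs mass (schematic):
% a Dirac fermion f with Yukawa coupling \lambda_f, and a complex scalar S
% with quartic coupling \lambda_S |H|^2 |S|^2.
\begin{align}
  \Delta M_H^2\big|_{\text{fermion}} &= -\frac{|\lambda_f|^2}{8\pi^2}\,\Lambda_{UV}^2 + \dots \\
  \Delta M_H^2\big|_{\text{scalar}}  &= +\frac{\lambda_S}{16\pi^2}\,\Lambda_{UV}^2 + \dots
\end{align}
% With two complex scalars per fermion and \lambda_S = |\lambda_f|^2,
% as supersymmetry arranges, the \Lambda_{UV}^2 terms cancel:
\begin{equation}
  -\frac{|\lambda_f|^2}{8\pi^2}\,\Lambda_{UV}^2
  + 2\cdot\frac{\lambda_S}{16\pi^2}\,\Lambda_{UV}^2 = 0 .
\end{equation}
```

The relative sign comes from the closed fermion loop, which is why a Bose-Fermi symmetry is the natural way to enforce the cancellation.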

In a complementary perspective review on the same subject, Rabi Mohapatra [20] stresses the
versatility of SUSY as a tool for understanding many unsolved problems of physics: a) improvement
in the singularity structure of local fields for understanding the disparate scales of Nature
(e.g., Electroweak vs Gravity); b) possibility of unifying Gravity with the other forces by making
SUSY local instead of global; c) prospects of understanding non-perturbative properties of field
theories, hitherto considered 'impossible' in non-SUSY form.

As to the manifestations of SUSY in a real world, this "Bose-Fermi" symmetry is supposed to
be badly broken, so that any search for superpartners (bosons vs 'bosinos'; fermions vs 's-fermions')
has so far yielded zero dividends. On the other hand the formulation of Supersymmetry in non-relativistic
quantum mechanics is relatively free from constraints. Indeed, since Schroedinger (1940)
noticed the existence of well-defined "supersymmetric partners" for the energy levels of a given
quantum mechanical system, many applications to such systems (including nuclear and condensed
matter physics) have kept pace with the rapid strides of SUSY in field theory in recent years.
Indeed, the existence of SUSY partners in the energy levels of (appropriately chosen) even vs odd
nuclei has been systematically established by group theoretic methods (interacting boson models,
etc). Similarly, in solid state physics, an interesting correspondence has been observed between the
critical behaviour of a 'spin system' in random magnetic fields in d dimensions, and that of the spin
system without the random magnetic field in d - 2 dimensions. This "dimensional reduction" may
be traced to an underlying SUSY for the spin system in random magnetic fields (see N Sakai, this
Book).

In the absence of discovery of SUSY partners in Field Theory, the benefits from SUSY have so
far been purely theoretical, varying from reduction of the degrees of divergence arising from various
loop integrals in standard field theory (by at least two orders), to a heavy reduction in the number
of dimensions (from 26 to 10) needed for self-consistency in a string theoretic formulation. The
Articles by Mohapatra [20] and Sakai [21] between them provide quite a complementary description
of the SUSY formalism in QFT, together with a glimpse of the recent developments. And apart
from its applications in particle physics, this formalism also serves as a background to the vast
field of supersymmetric string theory.

5.2 CFT 

An independent insight into the origin of String Theory comes from the role of Conformal Field
Theory (CFT), viz., conformally invariant QFT in 2D(imensions), not only as a vital ingredient
of its anatomy, but also with a firm hold on other disciplines like condensed matter physics. The
CFT route to the evolution of String Theory is sketched in this Book as part of a bigger (historical)
survey by Werner Nahm [22], tracing a whole sequence of developments in QFT right from its
(Dirac) beginning, and encompassing in the process several other areas of physics on which CFT
has had a decisive impact. In this saga, the interplay of physical intuition and mathematical
rigour has brought together the practitioners of these respective disciplines, though not necessarily
working in tandem. On the one hand, the beauty and transparency of CFT have made for a rich
variety of intellectual exercises in abstract mathematics (with new emerging areas like automorphic
groups, Kahler-Einstein metrics, etc), and on the other, facilitated the study of intensely practical
physical systems such as continuous phase transitions in condensed matter physics.

The impact of CFT on string theory has had its origin in several theoretical developments: the
Thirring model in 2D; Skyrme's idea of the equivalence between Fermions and Bosons; Coleman's
equivalence theorem on the Thirring Model versus the Sine-Gordon equation (despite their apparent
dissimilarity); and the role of conformal invariance in the structure of Wilson's Renormalization
Group equations. To recall the essentials of Conformal invariance, this symmetry is satisfied in the
absence of any 'scale' dimension. Examples are Maxwell's Equations in free space and the Dirac
equation for massless fermions, both of which satisfy conformal invariance. The 2D Thirring model,
which may be regarded as a basic ingredient of string theory, also has this property due to absence
of a scale dimension. Using this mathematical picture, the string may be regarded as a 1D object in space
spanning a world sheet (a Riemann surface) embedded in 2D space-time, where a point on the
string is represented by $X^\mu(\sigma,\tau)$; $\sigma$, $\tau$ being the 2 world sheet coordinates.

The impact of CFT has been no less impressive in the domain of condensed matter physics
(CMP) where there exists a rich class of QFT's exhibiting the structure of conformally invariant
fields, such as in 2D surface coatings. Thus at a critical temperature ($T_c$), the long range
fluctuations of arbitrary scales make irrelevant the details of molecular structure, and the theory
approaches a continuum limit, with no visible scale dimension to keep track of. Indeed in this
limit, the correlation functions behave like the Euclidean n-point functions of standard QFT with
conformal invariance properties. Nahm [22] discusses an interesting correspondence between the
Ising model in CMP and Thirring model in QFT. The equation satisfied by the spin waves of the
Ising model is formally identical to the 2D Dirac equation for massless fermions. Indeed condensed
matter physics provides a more stable and economical background for testing these ideas than the
expensive HEP laboratories!

5.3 String Theory Via Duality 

Perhaps the most startling "revolution" in Physics to date which had its origin in QFT, has
been the String Theory, and its successive "Avatars" (incarnations), aimed at unifying all the
forces of Nature (from orthodox gauge theories of strong, e.m. and weak interactions, all the
way to gravity). An orthodox route to its evolution may be attributed to the strong interaction
problem in QFT, which has had wide ramifications from vastly different angles, each providing an
independent insight into its mysteries. A very promising approach to strong interactions came from
the Duality Principle which has had a long history (perhaps traceable to the Bootstrap hypothesis),
based on the equivalence of the direct channel (resonances) and crossed channel Regge poles with
a universal slope $\alpha' \approx 1\,\mathrm{GeV}^{-2}$. An explicit realization of this idea was achieved via the Veneziano
representation for 4-point amplitudes satisfying the requirements of duality and crossing symmetry,
which was soon generalized to N-point amplitudes satisfying the same properties. Through a path
integral representation of such amplitudes, Nambu, Nielsen and Susskind recognized that these
amplitudes describe a 1D (string-like) object moving in space, with the inverse of the universal
Regge slope identified as the "string" tension T. The "string" interpretation was further reinforced
by a subsequent representation due to Virasoro, with very similar properties. And its promise of
relevance to particle physics (despite stiff competition from QCD!) got a boost from the Scherk-Schwarz
(1974) observation that such a "string theory" could serve as a candidate for incorporating
gravity in its ambit, on the ground that the massless spin-2 particle appears naturally in the closed
string spectrum. To that end the string tension T needed to be increased by 19 orders of magnitude
(up to the Planck scale!) to qualify for a viable theory of gravity. The conceptual gap was finally
bridged by the seminal work of Green-Schwarz (1984) who succeeded in constructing a consistent
10D super Yang-Mills theory coupled to supergravity which is free from anomalies only for certain
gauge groups ($SO(32)$ or $E_8 \times E_8$). This work, perhaps for the first time, showed real prospects
for unifying the fundamental forces.
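The duality statement above has a compact formula behind it. As a hedged sketch in standard textbook notation (the intercept $\alpha(0)$, the Mandelstam variables $s$, $t$ and the normalization of $T$ are conventions assumed here, not taken from this Summary), the linear Regge trajectory, the Veneziano four-point amplitude and the slope-tension relation read:

```latex
% Linear Regge trajectory with universal slope \alpha'
\begin{equation}
  \alpha(s) = \alpha(0) + \alpha' s , \qquad \alpha' \approx 1\ \mathrm{GeV}^{-2}
\end{equation}
% Veneziano four-point amplitude: manifestly symmetric under s <-> t
\begin{equation}
  A(s,t) = \frac{\Gamma\big(-\alpha(s)\big)\,\Gamma\big(-\alpha(t)\big)}
                {\Gamma\big(-\alpha(s)-\alpha(t)\big)}
\end{equation}
% String tension in units \hbar = c = 1 (a common open-string convention)
\begin{equation}
  T = \frac{1}{2\pi\alpha'}
\end{equation}
```

The poles of $\Gamma(-\alpha(s))$ at $\alpha(s) = 0, 1, 2, \dots$ are the direct channel resonances, and the same function can equally be expanded in the poles of $\Gamma(-\alpha(t))$, which is the content of duality.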

String Theory has grown by leaps and bounds during the past decade, and its vast ramifications
have produced such a formidable literature that even minimal justice
to it would itself require several volumes of review. Nevertheless, after a short overview of the
subject by the Master, John Schwarz [23], a panoramic account of the major developments in
this exciting field (together with an exhaustive set of references) is given in a perspective Article
by Jnanadeva Maharana [24]. Schwarz [23] views the different superstring theories (and an
extension called M-theory) as different facets of a unique underlying theory going beyond ordinary
QFT's. However, recent duality conjectures suggest that a more complete definition of these
theories may come from the large N limits of suitably chosen $U(N)$ gauge theories (see L Bonora
[25] below). The Maharana Article [24] leads the interested reader through several stages of its
development, from i) perturbative aspects of ST; successively through ii) Duality Symmetries as
a characteristic of String Theories (ST); iii) M-theory as a unified view of the five perturbatively
consistent STs; iv) microscopic understanding of Black Holes, and so on, all the way to the
frontiers of the field.

Attempting to cover the later stages of development in this rapidly growing field, Loriano
Bonora [25] reviews some advances in the study of the relation between Yang-Mills (YM) theory
and strings, based on the classical YM-theory solutions (Riemannian instantons) which are
2D solutions describing Riemann surfaces in the strong coupling limit. Strictly, such relations
historically date back all the way to 't Hooft ('74) through his famous $1/N_c$ expansion for large
$N_c$, wherein the dominant Feynman amplitudes correspond to the 2D Riemann surfaces. This
'natural connection' with strings was subsequently upgraded to a concrete shape via studies of 2D
QCD (for string-like properties), which was further generalized to a connection between conformal
super-YM and super-string theory of type IIB, in the large $N_c$ limit. The Bonora Article reveals,
among other things, a direct link between String Theory and non-abelian YM theory, through the
emergence in the latter of classical solutions modelled over Riemann surfaces, leading to a "string"
interpretation. Historically, this came about only after the proposal of the Matrix Theory, which
in the large $N_c$ limit converges to the (non-perturbative) M-Theory.
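The link between the $1/N_c$ expansion and Riemann surfaces can be stated in one line. The following is a schematic sketch in standard notation; the 't Hooft coupling $\lambda$, the genus $h$ and the functions $f_h$ are symbols introduced here for illustration only.

```latex
% 't Hooft limit: N_c -> infinity with \lambda = g^2 N_c held fixed.
% A vacuum diagram that can be drawn on a closed surface of genus h scales as
\begin{equation}
  \mathcal{A}_h \;\propto\; N_c^{\,2-2h}\, f_h(\lambda) ,
  \qquad \lambda \equiv g^2 N_c ,
\end{equation}
% so the free energy is organized as a sum over topologies
\begin{equation}
  F = \sum_{h=0}^{\infty} N_c^{\,2-2h}\, f_h(\lambda) ,
\end{equation}
% with planar diagrams (h = 0, the sphere) dominating at large N_c.
```

The exponent $2-2h$ is the Euler characteristic of the surface, so $1/N_c$ plays the role that the string coupling plays in a perturbative string expansion.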


6 CS Field Theory And Condensed Matter Physics 

While the dominant concern in Field Theory has been in the traditional domain of particle physics,
its powerful language and techniques have found profitable employment over a much wider domain,
which comprises topics in Condensed Matter Physics, and newly emerging fields like Quantum Hall
Effect, fractional statistics and Anyons. These phenomena lend themselves to QFT treatment in
(2+1) dimensions, where the celebrated "Chern-Simons" (CS) term plays a key role (see also Part
B on Topological Field Theories).

What are the special features of QFT in (2+1)D, and what specific role does the CS term
play in this reduced space-time continuum? Perhaps the most striking feature is the appearance
of fractional statistics! For, whereas in 3 (or higher) space dimensions, all particles must either
be bosons (integral spin) or fermions (half-integral spin), in 2 space dimensions, the particles
can have any fractional spin/statistics with impunity! Such particles are called Anyons. Now
since the usual spin-statistics relation follows from the premises of the standard 4D relativistic
QFT, it is natural to ask if Anyons can be understood from the corresponding 3D QFT. The
question goes beyond mere academic interest since lower dimensions can be effectively realized in
the physical world through the "freezing" out of certain degrees of freedom (e.g., in a strongly
confined potential, or at low enough temperatures), so that these 'quasi-particles' may well exhibit
anyon-like properties. And indeed experiments on Quantum Hall Effect (QHE) have revealed the
existence of fractionally charged excitations (thus implying anyons).

A critical discussion on the question of anyons and fractional statistics in (2+1) dimensions, with
particular reference to the role of the Chern-Simons (CS) term in 3D QFT, is given by Avinash
Khare [26] in a perspective Article on the subject in this Book. To that end, Khare clarifies
the definition of "quantum statistics" which relates to the "phase" picked up by a wave function
when two identical particles are adiabatically exchanged, as distinct from the usual definition of
permutation symmetry for two identical particles. [While both definitions coincide for 3 and higher
dimensions, they differ in 2 dimensions]. He then discusses in detail the main properties of the CS
term, especially its role as a gauge field mass term, in whose presence anyons can appear in one of
two different ways: i) as a soliton of the corresponding QFT; or ii) as fundamental quanta carrying
fractional statistics. So far, the state of the art is based on non-relativistic QFT, wherein the CS
term provides an effective cushion against a non-local formulation of anyon fields, thus facilitating
a 'local' formulation. However a full-fledged relativistic QFT formulation is not yet feasible.
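The "phase" definition of quantum statistics described above can be written down explicitly. This is a schematic sketch in standard notation; the statistics angle $\theta$ and the two-particle wave function $\psi$ are symbols assumed here for illustration.

```latex
% Adiabatic exchange of two identical particles in 2 space dimensions
\begin{equation}
  \psi(\mathbf{r}_2,\mathbf{r}_1) = e^{i\theta}\,\psi(\mathbf{r}_1,\mathbf{r}_2) ,
  \qquad 0 \le \theta < 2\pi ,
\end{equation}
% \theta = 0       : bosons
% \theta = \pi     : fermions
% any other \theta : anyons (fractional statistics)
% In 3 or more space dimensions a double exchange is a contractible loop, so
\begin{equation}
  e^{2i\theta} = 1 \;\Longrightarrow\; \theta \in \{0,\pi\} .
\end{equation}
```

In 2 space dimensions the double exchange winds one particle around the other and cannot be shrunk to a point, which is why the constraint on $\theta$ disappears there.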

Perhaps the most tangible success from CS fields so far is a natural understanding of the
Quantum Hall (QH) Effect. A state-of-the-art review by R Rajaraman [27] puts this topical
subject in perspective. We summarize some essential features of a QH system, from his own
account. A QH system is defined as "quasi 2D layers of electrons trapped in the interface
of semi-conductors, at very high magnetic fields and very low temperatures", and has revealed many
remarkable features. Particularly interesting is the presence of certain states characterized by
the so-called "filling fractions" ($\nu$) which are either integers, or certain odd-denominator fractions;
$\nu = hc\rho/eB$, where $\rho$ is the mean electron density, and $B$ the applied field. The special states
corresponding to these $\nu$-values show extremely flat plateaus in Hall conductivity which (in units
of $e^2/h$) are exactly equal to these values to within an accuracy of 1 in 10^ ! These features are very
universal inasmuch as the details of the material seem irrelevant. It was earlier recognized that the
electrons in these QH states form an incompressible fluid, described by "Laughlin wave functions"
(which are reminiscent of Jastrow-type correlations in nuclear wave functions). A more analytical
study of these empirical functions suggested a Landau-Ginzburg type scenario for the QHE in
terms of an order parameter field (subsequently to be identified with a Chern-Simons field), thus
formally bringing this subject within a 3D QFT network.
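To get a feel for the numbers behind the filling fraction, here is a minimal Python sketch. It is an illustration under stated assumptions: it works in SI units, where the Gaussian-unit formula $\nu = hc\rho/eB$ becomes $\nu = nh/(eB)$, and the sample density and field are hypothetical values chosen so that $\nu$ comes out close to 1. The function names are introduced here and are not from the reviews cited.

```python
# Filling fraction of a 2D electron gas in SI units: nu = n * h / (e * B).
# (In SI the factor c of the Gaussian-unit form nu = h c rho / (e B) is absent.)

PLANCK_H = 6.62607015e-34      # Planck constant in J s (exact SI value)
ELEM_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def filling_fraction(density_per_m2: float, field_tesla: float) -> float:
    """Return nu for areal electron density n (m^-2) and magnetic field B (T)."""
    return density_per_m2 * PLANCK_H / (ELEM_CHARGE * field_tesla)


def hall_conductance_siemens(nu: float) -> float:
    """Return the plateau value sigma_xy = nu * e^2 / h in siemens."""
    return nu * ELEM_CHARGE**2 / PLANCK_H


# Hypothetical sample: density chosen so that nu is close to 1 at 10 T.
n_example = 2.418e15   # electrons per m^2 (illustrative assumption)
b_example = 10.0       # tesla (illustrative assumption)

nu = filling_fraction(n_example, b_example)
print(f"nu at {b_example} T: {nu:.4f}")
print(f"nu at {3 * b_example} T: {filling_fraction(n_example, 3 * b_example):.4f}")
print(f"e^2/h = {hall_conductance_siemens(1.0):.6e} S")
```

Tripling the field at fixed density moves the same sample from $\nu \approx 1$ to $\nu \approx 1/3$, one of the odd-denominator fractions mentioned in the text.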

The analogy of the order parameter field in QHE to that obtaining in the Landau-Ginzburg
description of superconductivity is of course not a literal one since there are no bosonic Cooper pairs
in QHE. Indeed in this 3D QFT scenario, the "anyons" (Chern-Simons fields) have an intermediate
status between bosons and fermions. However for the special case when the anyon angle is an odd
multiple of $\pi$, a composite of the electron with an odd number of flux tubes effectively amounts to
constructing a "bosonic" analogue of Cooper pairs from out of "fermions" which now provides the
desired order-parameter (CS) field operating in the plateau of the QH system. Rajaraman reviews
a formal QFT procedure for constructing such CS gauge fields, as well as the formulation of their
dynamics at the 3D QFT level. As to the connection of the CS gauge fields at the first quantized
level, these are of course expressible in terms of the "phase angles" involved in the exchange of
electrons in an N-electron wave function in 2D (see also Khare [26] in this Book).


7 QCD-Motivated Strategies For Strong Interactions 

Turning now to the strong interaction problem in the standard field theoretic picture, its prime
candidate, QCD, has since its birth been beset with problems of reliable calculational techniques to
deliver results. An introductory overview of several approaches [symmetries, effective Lagrangians
and Wilson expansions] to deduce hadron properties from QCD is sketched in the Article by
Olivier Pene [28], aimed at establishing a link between perturbative and non-perturbative QCD
via lattice methods. We now go into more specific details of a few principal QCD-based methods.

7.1 QCD Sum Rules 

To recall, the main signatures of the prime candidate, QCD, which it shares with any non-abelian
gauge theory, are expressed by a two-fold pattern: i) decreasing coupling strength at
shorter distances (Asymptotic Freedom); and ii) increasing coupling strength at longer distances
(confinement). The former is fairly well understood, and provides a perturbative basis for calculating
QCD effects in high energy processes. In particular, the powerful method of "QCD
Sum Rules", based on Wilson's Operator Product Expansion (OPE), was developed by Shifman-Vainshtein-Zakharov
for the study of non-perturbative QCD in a large variety of applications, from
hadronic masses (with two-point functions) through coupling constants and form factors (with three-point
functions) to reactions (four-point functions). The basic philosophy is one of a duality between
two ways of representing a correlator: i) OPE with various "twist" terms (vacuum condensates,
treated as free parameters of the theory) representing successive non-perturbative corrections to an
otherwise perturbative expansion; ii) a dispersion formula saturated by certain low-lying hadron
resonances. Equating the two amounts to evaluating hadronic parameters in terms of the quark-level
condensates. Despite certain conceptual problems of "microscopic causality" encountered
in the "matching" of two sides of the equation, this method (QCD-SR) has proved very popular
among a wide class of high energy phenomenologists, and has been continually refined over the
years. A leisurely review of the state of the QCD-SR art on the quark structure of hadrons, as
well as its working on the problem of hadrons in nuclear matter (at finite temperature) is given by
Leonard Kisslinger [29] in this Book.
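The first half of the two-fold pattern, the weakening of the coupling at short distances, can be illustrated with the textbook one-loop running coupling. The sketch below is a rough illustration only: the scale `lambda_qcd_gev = 0.2` and the fixed flavour number `n_flavours = 5` are assumed values, flavour thresholds and higher loops are ignored, and the function name is introduced here.

```python
import math


def alpha_s_one_loop(q_gev: float, lambda_qcd_gev: float = 0.2, n_flavours: int = 5) -> float:
    """One-loop QCD coupling: alpha_s(Q^2) = 12 pi / ((33 - 2 n_f) ln(Q^2 / Lambda^2)).

    lambda_qcd_gev and n_flavours are illustrative assumptions, not fitted values.
    """
    if q_gev <= lambda_qcd_gev:
        # Below Lambda the logarithm is non-positive and the formula breaks down,
        # which signals the onset of the non-perturbative (confining) regime.
        raise ValueError("one-loop formula is only meaningful for Q > Lambda")
    b = 33 - 2 * n_flavours
    return 12.0 * math.pi / (b * math.log(q_gev**2 / lambda_qcd_gev**2))


# The coupling decreases monotonically as the momentum scale Q grows.
for q in (2.0, 10.0, 91.2, 1000.0):
    print(f"Q = {q:7.1f} GeV   alpha_s ~ {alpha_s_one_loop(q):.3f}")
```

The same formula blows up as Q approaches the assumed scale from above, which is one way of seeing why the long distance regime needs non-perturbative tools such as the sum rules described above.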


7.2 Non-Perturbative Methods With QCD Features 

The state of the art in this field is so diffuse that a more organized exposition is needed for such 
methods. To that end the attempts at addressing the strong interaction problem in QCD may 
be divided into two broad categories: i) soluble models designed to shed light on its general fea- 
tures through exact calculations; and ii) effective Lagrangian methods for 4-fermion interactions, 
somewhat reminiscent of the Bethe "Second Principle" Theory of effective nucleon-nucleon interactions
of the Fifties. Srivastava [17], as well as Kulshreshtha [18], in Part C of this Book, have
already provided a flavour of the results to be expected from type (i) theories, using the method 
of LF-QFT. 

Type (ii) which deals with more realistic situations, albeit at the cost of some phenomenology, 
has a much wider literature to choose from. To do a semblance of justice to this field, this Book 
includes two articles of this type, reviewing the methodology and working of such QFT-based 
approaches. The first one, by Vladimir Karmanov [30], gives an in-depth review of covariant 
light-front (LF) dynamics, with applications to field theory and relativistic wave functions. The 
formalism is effectively 3D in content, which can be obtained by projecting the (4D) Bethe-Salpeter 
amplitudes on the light-front plane, and although a reversal of steps is not possible to reconstruct 
the 4D BS amplitude, the LF formalism still represents a powerful alternative for solving QFT 
problems. Karmanov [30] also discusses some typical applications. 

7.2.1 Markov-Yukawa Transversality Principle

The second article by Asoke Mitra [31] offers a comparative view of the state of the art in several
QFT approaches based on effective 4-fermion interactions (including QCD features), of both 3D and
4D types (Tamm-Dancoff, Bethe-Salpeter, Salpeter, quasi potentials, light-front). In this context,
attention is focussed on an important but somewhat less known principle called "Markov-Yukawa
Transversality" (MYTP) which decrees that the interaction between the two (quark) constituents
be transverse to the composite (hadron) 4-momentum, by virtue of which the BSE kernel has an
effective (albeit covariant) 3D support. As a result of this "Covariant Instantaneity" the starting
4D BSE is exactly reducible to a 3D form, and conversely the steps can be reversed so as to
allow an exact reconstruction of the original 4D BSE in terms of 3D ingredients! Thus MYTP
allows an exact interlinkage between the 3D and 4D BSE forms, so that both forms can be used
interchangeably, unlike most other approaches in the literature which employ either a 4D or a 3D
form of the BS dynamics, but not both simultaneously.

It might be of some historical interest to note that the Salpeter equation has a 3D structure
stemming from its (instantaneous) kernel with a 3D support, and therefore its original 4D form
can be recovered a la MYTP by reversing the steps, but this possibility had never been explored.
This gap is now filled by MYTP which provides a formally covariant basis for the instantaneous
approximation. The same principle (MYTP) can also be generalized from covariant instantaneity
to the covariant light-front.

A fall-out of the 3D-4D interlinkage provided by MYTP is that it gives a two-tier description:
the 3D form for the hadron spectra which are O(3)-like; and the 4D form to address the transition
amplitudes as 4D loop integrals using standard (4D) Feynman rules. This Principle can be easily
incorporated in the usual framework of coupled Bethe-Salpeter and Schwinger-Dyson equations
(BSE-SDE) stemming from a (chirally invariant) 4-fermion Lagrangian with current quarks interacting
via the full gluon propagator, so that the quark mass is acquired via the NJL-mechanism.
And the generalization from covariant instantaneity to the covariant light-front helps remove certain
problems of Lorentz mismatch of vertex functions that arise in a 4D loop integral under the
covariant instantaneity ansatz. These and other details are reviewed in the article by Mitra [31]
which also stresses a parallelism of treatment of $q\bar{q}$ and $qqq$ systems.

7.3 The Harmonic Oscillator: A Powerful Bridge In QFT 

No amount of literature on the impact of QFT in Physics would be complete without an exposure
of the role of the Harmonic Oscillator (HO) in shaping Quantum Theory, as an integral part
of this Book theme. It was therefore a matter of great satisfaction when Marcos Moshinsky
[32], who may be regarded as the "Father of the Harmonic Oscillator in Physics", agreed to
contribute a perspective article on the HO theme. The only obstacle against a regular format
for his Article was that he had only recently written a comprehensive book on the subject [M.
Moshinsky and Yu. F. Smirnov, The Harmonic Oscillator in Modern Physics (Harwood Academic
Press, the Netherlands, 1996)]. Nevertheless in his Article, he has provided a comprehensive list
of contents of his HO-book, which already offers a glimpse of the depth and range of physical
problems (from the simplest quantum mechanical ones to the n-body Relativistic Oscillator) that
are amenable to the amazing powers of HO techniques in tandem with the standard methods of
Group Theory. In addition he has reviewed some recent work of his on relativistic particles of
arbitrary spin in a confining HO potential, with applications to Spectroscopy.

8 Conclusion: Foundations Of Quantum Theory 

The Book concludes with an Article by Dipankar Home [33] on the foundations of quantum
mechanics (the predecessor of QFT). It is well known that the Founding Fathers of quantum
theory (Planck, Einstein, and Schroedinger) held deep reservations about the adequacy of quantum
theory as a complete description of Nature. Since Quantum Field Theory is the application of the
same quantum principles to systems with continuous degrees of freedom, it is open to the same
questions. Indeed the very premises of quantum theory are now increasingly being scrutinized by
relating them to precise experimental studies. Home's Article [33] concentrates on two main issues:
i) the measurement problem in quantum theory; and ii) quantum non-locality, both being areas of
active research which are yielding new and unexpected results. He concludes with a quotation from
John Bell: "It seems to me possible that the continuing anxiety about what quantum mechanics
means or entails will lead to still more tricky experiments which will eventually find some soft
spot." Translated to the QFT level, this looks like an appropriate conclusion for this Book as well.


References 

[1] D.V.Shirkov: Evolution Of The Bogoliubov Renormalization Group 

[2] S.Szpigel and R.J.Perry: The Similarity Renormalization Group

[3] V.Novikov: Quantum Field Theory And The Standard Model - Bird’s Eye View 

[4] P.K.Kabir: Broken Reflection Symmetries 

[5] D.Boyanovsky and H.J.de Vega: Dynamics Of Symmetry Breaking Out Of Equilibrium - From
Condensed Matter To QCD And The Early Universe

[6] V.N.Gribov (Orsay Lectures): (I) hep-ph/9403218; (II) hep-ph/9404332; (III) hep-ph/9905285

[7] K.Nishijima and M.Chaichian: An Essay On Color Confinement

[8] R.Kaul: Topological Quantum Field Theories - A Meeting Ground For Physicists And Math- 
ematicians 

[9] E. Witten: Quantum Field Theory And The Jones Polynomial 

[10] H.Banerjee: Chiral Anomalies In Field Theories 

[11] Wei-Min Zhang: Coherent States In Field Theory 

[12] N.Mukunda: Pancharatnam, Bargmann And Berry Phases - A Retrospective 

[13] J.Schechter and H. Weigel: The Skyrme Model For Baryons 

[14] R.Ramanathan: Euclidean Methods In Quantum Field Theory 

[15] Ashoke Das: Topics In Finite Temperature Field Theory 

[16] B.M.Sodermark: Integrable Models And The Toda Lattice Hierarchy 

[17] P.P.Srivastava: Perspectives Of Light-Front Quantized Field Theory - Some New Results

[18] D.S.Kulshreshtha: Gauge Symmetry In Chiral Electrodynamics 

[19] L.Lusanna: Towards A Unified Description Of The Four Interactions In Terms Of Dirac- 
Bergmann Observables 

[20] R.N.Mohapatra: Supersymmetry And Particle Physics 

[21] N.Sakai: Supersymmetry In Field Theory

[22] W.Nahm: Conformal Field Theory: A Bridge Over Troubled Waters 

[23] J.H. Schwarz: Superstring Theory - An Overview 

[24] J.Maharana: Recent Developments In String Theory 

[25] L.Bonora: Yang-Mills Theory And Matrix String Theory 

[26] Avinash Khare: Fractional Statistics And Chern-Simons Field Theory In 2+1 Dimensions

[27] R.Rajaraman: Chern Simons Field And Composite Bosons In The Quantum Hall System 

[28] O.Pene: Hadrons From QCD - Achievements And Prospects 

[29] L.S.Kisslinger: QCD Sum Rules In Hadronic And Nuclear Physics 

[30] V. A. Karmanov: Light-Front Dynamics 

[31] A.N.Mitra: 3D-4D Interlinkage Of Bethe-Salpeter Amplitudes - A Unified View Of QQ And 
QQQ Dynamics 

[32] M.Moshinsky: The Harmonic Oscillator In Quantum Theory - A Powerful Bridge In Physics 

[33] D.Home: Modern Perspectives On Foundations Of Quantum Mechanics 




Part A : Basic Structure Of QFT 

1. Evolution Of The Bogoliubov Renormalization Group by D.V.Shirkov 

2. The Similarity Renormalization Group by S.Szpigel and R.J. Perry 

3. Quantum Field Theory And The Standard Model - Bird’s Eye View by V.Novikov 

4. Broken Reflection Symmetries by P.K.Kabir 

5. Dynamics Of Symmetry Breaking Out Of Equilibrium - From Condensed Matter To QCD And
The Early Universe by D.Boyanovsky and H.J.de Vega

6. Orsay Lectures On Confinement by the Late Vladimir N. Gribov: (I) hep-ph/9403218; (II)
hep-ph/9404332; (III) hep-ph/9905285 (Courtesy of Y. Dokshitzer, C. Ewerz and J.Nyiri)

7. An Essay On Color Confinement by K.Nishijima and M.Chaichian 




1. Evolution of the Bogoliubov Renormalization Group *

D.V. Shirkov †

N.N.Bogoliubov Laboratory, JINR, Dubna, Russia 


Abstract 

We start with a simple introduction to the renormalization group (RG) in quantum field
theory and give an overview of the renormalization group method. The third section is devoted
to essential topics of the renorm-group use in the QFT. Here, some fresh results are included.
Then we turn to the remarkable proliferation of the RG ideas into various fields of physics.
The last section summarizes the impressive recent progress of the “QFT renormalization group”
application in mathematical physics.

Contents 

1 Renormalization group primer 
1.1 Mathematical preliminaries 
1.1.1 Renorm-group folklore 
1.1.2 Group Functional Equation 
1.1.3 Abstract Formulation 
1.2 Definition of the Renorm-Group 
1.2.1 The RG transformation 
1.2.2 Simple generalizations 
1.3 Early history and the RG method 
1.3.1 Renormalization and renormalization invariance 
1.3.2 The discovery of the renormalization group 
1.3.3 Creation of the RG method 
1.4 RG in QED 
1.4.1 Effective Electron Charge 
1.4.2 RG transformations 

2 Renormalization group method 
2.1 Basic idea 
2.2 Differential formulation 
2.3 General solution 
2.4 RGM algorithm 
2.4.1 Technology of RG Method 
2.4.2 RGM usage in QFT 

* Dedicated to the memory of Nicolaj N. Bogoliubov on the occasion of his 90th birthday. 
† E-mail: shirkovd@thsun1.jinr.ru 





3 RG in QFT 
3.1 UV analysis in general 
3.1.1 One-coupling case 
3.1.2 Multi-coupling case 
3.2 Perturbative approach to the UV asymptote 
3.2.1 Structure of RG results 
3.2.2 The ghost-pole trouble 
3.2.3 Scalar quartic model 
3.3 Mass-dependent analytic solution 
3.4 Some important results 

4 RG expansion 
4.1 Critical phenomena 
4.1.1 Spin lattices 
4.1.2 Polymer theory 
4.1.3 Turbulence 
4.2 Paths of RG expansion 
4.3 Two faces of RG in QFT 

5 RG symmetry in mathematical physics 
5.1 Functional self-similarity 
5.2 Recent application to boundary value problem 

Bibliography 


1 Renormalization group primer 

1.1 Mathematical preliminaries 

1.1.1 Renorm-group folklore 

Let us start with some simple statements which can be supposed to be widely known by particle 
theorists. In quantum field theory (QFT) the renormalization group (RG) is usually associated 
with the possibility of presenting any physical quantity $F(Q^2, g)$, calculated under a definite 
renormalization prescription, in the form $F(Q^2/\mu^2, g_\mu)$ (for simplicity, in a massless case) with the 
definition of the renormalized coupling constant $g_\mu$ attached to some renormalization point (or 
reference momentum scale) $Q = \mu$. The differential RG equation is usually said to be derived from 
the condition that $F$ does not depend on the choice of $\mu$, 

$$ \frac{dF}{d\mu} = 0 . \qquad (1) $$

The dependence of the coupling constant $g_\mu$ on $\mu$ is described by a specific function $\bar g$ known as 
an effective coupling (sometimes: effective coupling constant), $g_\mu = \bar g(\mu^2)$. 

Eq.(1) can be written down in the form of a partial linear differential equation (DE) 

$$ \left[ x \frac{\partial}{\partial x} - \beta(g) \frac{\partial}{\partial g} \right] F(x,g) = 0 \qquad (2) $$

where $x$ stands for $Q^2/\mu^2$ and $\beta(g)$, the group generator, usually referred to as the beta 
function, is defined by 

$$ \beta(g_\mu) = \frac{\partial \bar g(z)}{\partial \ln z} \quad \text{at} \quad z = \mu^2 . $$



The effective coupling $\bar g$ should be considered as a function of two arguments, $x = Q^2/\mu^2$ and 
$g_\mu$, with the boundary condition $\bar g(1, g) = g$. Besides 

$$ \left[ x \frac{\partial}{\partial x} - \beta(g) \frac{\partial}{\partial g} \right] \bar g(x,g) = 0 \qquad (3) $$

it satisfies the nonlinear DE 

$$ x \frac{\partial \bar g(x,g)}{\partial x} = \beta(\bar g(x,g)) \qquad (4) $$

which is nothing else but a characteristic equation for (2). To employ this formalism, one has to 
give $\beta(g)$. Usually, for this one uses renormalized perturbation theory. 

The foregoing can be considered as “RG folklore”. For brevity, we gave it in the simplest 
massless version, which corresponds to the UV case, for the QFT model with one coupling constant. 


1.1.2 Group Functional Equation 

Less popular are the RG Functional Equations (FEs). The FE for $\bar g$ in the UV case has the 
form 

$$ \bar g(x, g) = \bar g\left( \frac{x}{t}, \bar g(t, g) \right) . \qquad (5) $$

This equation, which follows (see, e.g., the Chapter Renormalization Group in Ref.[1]) from finite 
Dyson renormalization transformations, forms the basis of the differential RG formulation. 
The popular DE (4) can be directly obtained from it by differentiating with respect to $x$ and then 
putting $t = x$. On the other hand, by differentiating (5) with respect to $t$ at $t = 1$ we get the partial DE (3). 

The FE (5) as well as similar FEs for propagators and vertex functions (see, below, eq.(19)) 
must be considered as the most compact and general formulation of the RG symmetry in QFT. 
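
As a minimal numerical sketch of these statements, one may take the one-loop form $\bar g = g/(1 - \beta g \ln x)$, which corresponds to the assumption $\beta(g) = \beta g^2$, and check that it obeys both the FE (5) and the DE (4). The value of the coefficient and all function names below are illustrative choices and are not taken from the text.

```python
import math

BETA = 0.1  # hypothetical one-loop coefficient, so that beta(g) = BETA * g**2


def g_bar(x, g, beta=BETA):
    """One-loop effective coupling g / (1 - beta*g*ln x), used below its pole."""
    return g / (1.0 - beta * g * math.log(x))


def fe_residual(x, t, g):
    """Left-hand side minus right-hand side of the functional equation (5)."""
    return g_bar(x, g) - g_bar(x / t, g_bar(t, g))


def de_residual(x, g, h=1e-6):
    """x * d(g_bar)/dx - beta(g_bar), with a central difference in ln x."""
    lx = math.log(x)
    slope = (g_bar(math.exp(lx + h), g) - g_bar(math.exp(lx - h), g)) / (2.0 * h)
    return slope - BETA * g_bar(x, g) ** 2


if __name__ == "__main__":
    print("boundary condition g_bar(1, g) - g:", g_bar(1.0, 0.3) - 0.3)
    print("FE (5) residual:", fe_residual(50.0, 7.0, 0.3))
    print("DE (4) residual:", de_residual(50.0, 0.3))
```

Both residuals vanish up to rounding and finite-difference error for any admissible $x$, $t$ and $g$, which illustrates that the differential and the functional formulations carry the same information.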

However, in reality, group FEs like (5) (and DEs (4) and (3) as well) do not contain any physics 
at all, being just the reflection of the group composition law! Here, we mean the continuous group 
(that is, the Lie group of transformations) of operations changing the reference point $\mu$ involved 
in the definition of the coupling constant $g_\mu$. Namely, we can regard the change of a reference coupling 
$g_\mu \to g_{\mu'}$ as an operation of the group element $T_t$ 

$$ T_t \, g_\mu = g_{\mu'} \equiv \bar g(t, g_\mu) $$

with a real continuous positive numerical parameter $t$ $(= \mu'^2/\mu^2)$. 

If we set $x = \tau t$, then the l.h.s. of (5) can be obtained from $g$ by the operation $T_{\tau t}$, while the r.h.s. 
may be identified as $T_\tau T_t \, g$. The content of eq.(5) is just the group composition law, 

$$ T_{\tau t} = T_\tau \cdot T_t . $$

Thus, the essence of the basic RG functional equation (5) is the necessary condition for the 
transformations $T_t$ to form a group. 

At the same time, it demonstrates that the function $\bar g$ is invariant with respect to the simultaneous 
transformation 

$$ R_t : \{ x' = x/t , \; g' = \bar g(t, g) \} . \qquad (6) $$

The invariance condition for an observable can now be written down as 

$$ F(x, g) = F\left( \frac{x}{t}, \bar g(t, g) \right) . $$

Usually, of interest are also functions $\phi(x,g)$ (like, e.g., propagator amplitudes in QFT) 
transforming as a linear representation of the RG 

$$ \phi(x,g) \to R_t \phi = \phi(x', g') = z(t,g) \, \phi(x,g) . \qquad (7) $$

Note also that the group FE for an observable, like a matrix element, is of the form 

$$ M(\{x\}, y; g) = M\left( \left\{ \frac{x}{t} \right\}, \frac{y}{t}; \bar g(t, y; g) \right) ; \quad \{x\} = x_1, x_2, \ldots, x_k \qquad (8) $$

which reflects the existence of several $Q^2$-type arguments and implements its independence of 
renormalization details corresponding to Eq.(1). 





1.1.3 Abstract Formulation 

To make this point clearer, let us show that FE (5) can formally be obtained directly from the 
group composition law. Generally, the mathematical formulation of the RG transformation can be 
presented as a functional realization of the mentioned Lie group. 

Consider the transformation $T(l)$ of a certain abstract set of elements $M_i$ into $M_i'$, 
depending on a continuous real parameter $l$ $(-\infty < l < \infty)$, such that for each element $M$ we 
have 

$$ T(l) \, M_i = M_i' . $$

Suppose that $M$ can be projected onto the real axis, i.e., to every $M_i$ there corresponds a real 
number $g_i$. Then, this transformation can be written in the analytic form 

$$ T(l) \, g = g' = G(l, g) , $$

$G$ being a continuous function of two arguments satisfying the normalization condition $G(0, g) = g$ 
which corresponds to the identity transformation $T(0) = E$. 

Transformations $T(l)$ form a group if they satisfy the composition law 

$$ T(l) \times T(\lambda) = T(l + \lambda) \qquad (9) $$

to which there corresponds the functional equation for $G$: 

$$ G(l, G(\lambda, g)) = G(l + \lambda, g) . \qquad (10) $$

As follows from the basics of Lie group theory, it is sufficient to deal with the infinitesimal 
transformation at $\lambda \ll 1$, i.e., with the DE 

$$ \frac{\partial G(l,g)}{\partial l} = \beta\{ G(l,g) \} . \qquad (11) $$

Here the group generator is defined as 

$$ \beta(g) = \frac{\partial G(\epsilon, g)}{\partial \epsilon} \quad \text{at} \quad \epsilon = 0 . $$

Performing a logarithmic change of variables 

$$ l = \ln x , \quad \lambda = \ln t , \quad G(l,g) = \bar g(x,g) , \quad T(\ln t) = T_t \qquad (12) $$

we obtain (5) and (4) instead of (10) and (11). 


1.2 Definition of the Renorm-Group 

1.2.1 The RG transformation 

Generally, the RG can be defined as a continuous one-parameter group of specific transformations of 
a partial solution (or the solution characteristic) of a problem, a solution that is fixed by a boundary 
condition. The RG transformation involves boundary condition parameters and corresponds to 
some change in the way of imposing this condition. 

For illustration, imagine a one-argument solution characteristic $f(x)$ that has to be specified 
by the boundary condition $f(x_0) = f_0$. Formally, represent the given characteristic of a partial 
solution as a function of boundary parameters as well: $f(x) = f(x, x_0, f_0)$. (This step can be 
considered as an embedding operation). The RG transformation then corresponds to a changeover 
of the way of parameterization, say from $\{x_0, f_0\}$ to $\{x_1, f_1\}$ for the same solution. In other words, 
the $x$ argument value at which the boundary condition is given does not need to be $x_0$, but we 
may choose another point $x_1$. Our solution $f$ can be written in the form of a two-argument function 

¹ This condition is not essential and can be modified: see, below, eqs. (16) and (17). 





$F(x/x_0, f_0)$ with the property $F(1, \gamma) = \gamma$. The equality $F(x/x_0, f_0) = F(x/x_1, f_1)$ reflects the 
fact that under such a change of a boundary condition the form of the function $F$ itself is not modified 
(as, e.g., in the case of $F(x, \gamma) = \Phi(\ln x + \gamma)$). Noting that $f_1 = F(x_1/x_0, f_0)$, we obtain 

$$ F(\xi, f_0) = F(\xi/t, F(t, f_0)) ; \quad \xi = x/x_0 , \quad t = x_1/x_0 . $$

The group transformation here is $\{ \xi \to \xi/t , \; f_0 \to f_1 = F(t, f_0) \}$. 

The renorm-group transformation for a given solution of some physical problem in the simplest 
case can be defined as 

a simultaneous one-parameter transformation of two variables, say $x$ and $g$, by 

$$ R_t : \{ x \to x' = x/t , \; g \to g' = \bar g(t, g) \} , \qquad (6) $$

the first being a scaling of a coordinate $x$ and the second a more complicated functional 
transformation of the solution characteristics. Eq.(5) for the transformation function $\bar g$ provides the group 
property $T_{\tau t} = T_\tau T_t$ of the transformation (6). Performing the logarithmic change of variables 
and an appropriate redefinition of a transformation function (12), with $q = \ln x$, we obtain eqs.(10), (11) and 

$$ R(l) : \{ q \to q' = q - l , \; g \to g' = G(l, g) \} , \qquad (13) $$

instead of (5), (4) and (6). One can refer to the latter equations as the multiplicative version (and 
to the equations of the abstract formulation as the additive one). They are just the RG equations 
and transformation for a massless QFT model with one coupling constant. In that case $x = Q^2/\mu^2$ 
is the ratio of a 4-momentum $Q$ squared to a “normalization” momentum $\mu$ squared and $g$ is the 
coupling constant. 

Several generalizations of (6) and (13) will be considered below. 

1.2.2 Simple generalizations 

“Massive” case 

For example, in QFT, if we do not neglect the mass $m$ of a particle, we have to insert an 
additional dimensionless argument into the invariant coupling $\bar g$, which now has to be considered 
as a function of three variables: $x = Q^2/\mu^2$, $y = m^2/\mu^2$, and $g$. The presence of a new “mass” 
argument $y$ modifies the group transformation 

$$ R_t : \left\{ x' = \frac{x}{t} , \; y' = \frac{y}{t} , \; g' = \bar g(t, y; g) \right\} \qquad (14) $$

and the functional equation 

$$ \bar g(x, y; g) = \bar g\left( \frac{x}{t}, \frac{y}{t}; \bar g(t, y; g) \right) . \qquad (15) $$

Here, it is important that the new parameter $y$ (which by its physical nature must be close to the $x$ 
variable, as it scales similarly) also enters into the transformation law of $g$. 

If the considered QFT model, like QCD, contains several masses, there will be several mass 
arguments 

$$ y \to \{y\} = y_1, y_2, \ldots, y_n . $$

Multi-coupling case 

A more complicated generalization corresponds to the transition to the case with several coupling 
constants: $g \to \{g\} = g_1, \ldots, g_k$. Here, one has to introduce the “family” of effective couplings 

$$ \bar g \to \{\bar g\} , \quad \bar g_i = \bar g_i(x, y; \{g\}) , \quad i = 1, 2, \ldots, k , $$





satisfying the system of coupled functional equations 

$$ \bar g_i(x, y; \{g\}) = \bar g_i\left( \frac{x}{t}, \frac{y}{t}; \{ \bar g(t, y; \{g\}) \} \right) . \qquad (16) $$

In the abstract formulation this system is a generalization of (5) and (10) for the case when 
every element $M_i$ of $M$ can be described by $k$ numerical parameters, i.e., by a point $\{g\}$ in the 
$k$-dimensional real parameter space. The RG transformation now is 

$$ R_t : \left\{ x \to \frac{x}{t} , \; y \to \frac{y}{t} , \; \{g\} \to \{\bar g(t)\} \right\} , \quad \bar g_i(t) = \bar g_i(t, y; \{g\}) . \qquad (17) $$

1.3 Early history and the RG method 

1.3.1 Renormalization and renormalization invariance 

As is known, the regular formalism for eliminating the UV divergences in QFT was developed 
on the basis of covariant perturbation theory for the scattering S matrix in the late 40s. This 
breakthrough is connected with the names of Tomonaga, Feynman, Schwinger and some others. In 
particular, Dyson and Abdus Salam carried out the general analysis of the structure of divergences 
in arbitrarily high orders of perturbation theory. Nevertheless, a number of subtle questions 
concerning overlapping divergences remained unclear. 

An important contribution in this direction, based on a thorough analysis of the mathematical 
nature of UV divergences, was made by Bogoliubov. This was achieved on the basis of a branch of 
mathematics which was new at that time, namely, the Sobolev-Schwartz theory of distributions. 
The point is that propagators in local QFT are distributions (similar to the Dirac delta function) 
and propagator products appearing in the coefficients of the S-matrix expansion require a 
supplementary definition in the case when their arguments coincide and lie on the light cone. 

In the mid 50s, on the basis of this approach, Bogoliubov and his disciples developed a technique 
of supplementing the definition of products of singular Stückelberg-Feynman propagators [2] and 
proved a theorem [3] on the finiteness and uniqueness (for renormalizable theories) of the S-matrix 
elements in any order of perturbation theory. The prescriptive part of this theorem, the 
Bogoliubov R-operation (see, e.g., chapter “Removal of divergencies from the S matrix” in the 
monograph [1]), still remains a practical means of obtaining finite and unique results in higher 
order perturbation calculations. 

The Bogoliubov algorithm works, essentially, as follows: 

- To remove the UV divergences of a one-loop diagram, instead of introducing some regularization, 
e.g., the momentum cutoff, and handling (quasi) infinite counterterms, it suffices to complete 
the definition of a divergent Feynman integral by subtracting from it a certain polynomial in the 
external momenta which in the simplest case is reduced to the first few terms of the Taylor series. 

- For a multi-loop diagram (including one with overlapping divergences) one should first subtract 
all divergent subdiagrams and finish with subtracting the diagram as a whole, in a hierarchical order 
regulated by the R-operator. 

An attractive feature of this approach is that it is free from any auxiliary nonphysical 
attributes such as bare masses, bare coupling constants, and regularization parameters, which are 
not involved in the computation within Bogoliubov’s algorithm. The latter can be regarded as 
renormalization without regularization and counterterms. 

The uniqueness of computational results for the observable S-matrix elements is ensured by 
special conditions imposed on them. These conditions contain some degree of freedom (related to 
different renormalization schemes and momentum scales) that can be used to establish finite relations 
between the Lagrangian parameters (masses, coupling constants) and corresponding renormalized 
quantities. The fact that physical predictions are independent of the arbitrariness in the 
renormalization conditions, that is, they are renorm-invariant, constitutes the conceptual foundation of the 
renormalization group. 





1.3.2 The discovery of the renormalization group 

In 1952-1953 Stückelberg and Petermann [5] discovered² a group of infinitesimal 
transformations related to the finite arbitrariness arising in the S-matrix elements upon elimination of the UV 
divergences. These authors introduced a normalization group generated by Lie operators connected 
with renormalization of the coupling constant $e$. 

In the following year, on the basis of (infinite) Dyson’s renormalization transformations 
formulated in the regularized form, Gell-Mann and Low [6] derived functional equations for the QED 
propagators in the UV limit. The appendix to this article contains the general solution (obtained by 
T.D. Lee) of this functional equation for the renormalized transverse photon propagator amplitude 
$d(x, e^2)$, written in two equivalent forms: 

$$ e^2 d(x, e^2) = F\left( x F^{-1}(e^2) \right) \quad \text{and} \quad \ln x = \int_{e^2}^{e^2 d} \frac{da}{\psi(a)} , \quad \psi(e^2) = \frac{\partial (e^2 d)}{\partial \ln x} \quad \text{at} \quad x = 1 . \qquad (18) $$

A qualitative analysis of the behaviour of the quantum electromagnetic interaction at small dis- 
tances was carried out with the aid of (18). Two possibilities, namely, infinite and finite charge 
renormalizations were pointed out. 

However, paper [6] paid no attention to the group character of the analysis and the results 
obtained there. The authors missed a chance to establish a connection between their results and 
the standard perturbation theory and did not discuss the possibility that a ghost pole solution 
might exist. 

The final step was taken by Bogoliubov and the present author [7, 8, 9]; see also 
the survey [10] published in English in 1956. Using the group properties of finite Dyson 
transformations for the coupling constant, fields and Green functions, these authors derived functional 
group equations for the propagators and vertices in QED in the general case (that is, with the 
electron mass taken into account). For example, the equations for the transverse amplitude of the 
photon propagator and for the electron propagator amplitude were obtained in the form 

$$ d(x, y; e^2) = d(t, y; e^2) \, d\left( \frac{x}{t}, \frac{y}{t}; e^2 d(t, y; e^2) \right) , $$

$$ s(x, y; e^2) = s(t, y; e^2) \, s\left( \frac{x}{t}, \frac{y}{t}; e^2 d(t, y; e^2) \right) , \qquad (19) $$

in which the dependence on the mass variable $y = m^2/\mu^2$ was present. 

As can be seen, the product $e^2 d$ of the electron charge squared and the photon propagator amplitude 
enters in both the FEs. This product is invariant with respect to Dyson’s transformation. We 
called this function the invariant charge and introduced the term renormalization group. 

In the modern notation, the first equation is that for the invariant charge (now widely known 
as an effective or running coupling) $\bar\alpha = \alpha \, d(x, y; \alpha = e^2)$: 

$$ \bar\alpha(x, y; \alpha) = \bar\alpha\left( \frac{x}{t}, \frac{y}{t}; \bar\alpha(t, y; \alpha) \right) . \qquad (20) $$


Let us emphasize that, unlike the approach of Ref. [6], in the latter case there is no relation with 
UV divergences and no simplification due to the massless nature of the UV asymptotics. Here, the 
homogeneity of the transfer momentum scale is violated explicitly by the mass $m$. Nevertheless, the 
symmetry (even though a somewhat more complex one) underlying the RG can, as before, be stated as 
an exact symmetry of the solutions of the QFT problem. This is what we mean when using the 
term Bogoliubov’s renormalization group, or renorm-group for short. 

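The differential group equations for $\bar\alpha$ and for the electron propagator: 

$$ \frac{\partial \bar\alpha(x, y; \alpha)}{\partial \ln x} = \beta\left( \frac{y}{x}, \bar\alpha(x, y; \alpha) \right) , \quad \frac{\partial s(x, y; \alpha)}{\partial \ln x} = \gamma\left( \frac{y}{x}, \bar\alpha(x, y; \alpha) \right) s(x, y; \alpha) , \qquad (21) $$

² For a more detailed exposition of the RG early history see our review [4]. 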





with 

$$ \beta(y, \alpha) = \frac{\partial \bar\alpha(\xi, y; \alpha)}{\partial \xi} , \quad \gamma(y, \alpha) = \frac{\partial s(\xi, y; \alpha)}{\partial \xi} \quad \text{at} \quad \xi = 1 \qquad (22) $$

were first derived in [7] by differentiating the FEs. In this way, an explicit realization of the group 
DEs mentioned in the paper [5] was obtained. These results established a conceptual link with the 
Stückelberg-Petermann and Gell-Mann-Low results. 


1.3.3 Creation of the RG method 

Another important achievement of paper [7] consisted in formulating a simple algorithm for 
improving an approximate perturbative solution by combining it with the Lie equations (for detail, 
see below, Section 2). 

In our adjacent publication [8] this algorithm was effectively used to analyse the UV and infrared 
(IR) behaviour in QED. The one-loop and two-loop UV asymptotics 

$$ \bar\alpha^{(1)}_{RG}(x; \alpha) = \frac{\alpha}{1 - \frac{\alpha}{3\pi} \ln x} , \qquad (23) $$

$$ \bar\alpha^{(2)}_{RG}(x; \alpha) = \frac{\alpha}{1 - \frac{\alpha}{3\pi} \ln x + \frac{3\alpha}{4\pi} \ln\left( 1 - \frac{\alpha}{3\pi} \ln x \right)} \qquad (24) $$

of the photon propagator as well as the IR behavior 

$$ s(x, y; \alpha) \approx (x/y - 1)^{-3\alpha/2\pi} = (p^2/m^2 - 1)^{-3\alpha/2\pi} \qquad (25) $$

of the electron propagator in the transverse gauge were obtained. At that time, these expressions 
were already known only at the one-loop level. It should be noted that in the mid 50s the problem 
of the UV behaviour in local QFT was quite urgent. At that time, substantial progress in the 
analysis of QED at small distances was made by Landau and his collaborators [11]. However, 
Landau’s approach did not provide a prescription for constructing subsequent approximations. 

The simple technique for obtaining higher approximations was found only within the new 
renorm-group method. The one-loop UV asymptotics of QED propagators obtained in our 
paper [8], eqs. (23) and (25), agreed precisely with the results of Landau’s group. 

Within the RG approach those results can be obtained in just a few lines of argumentation. 
To this end, e.g., the massless one-loop perturbation approximation should be substituted into the 
r.h.s. of the first equation in (22) to compute the generator $\beta(0, \alpha) = \psi(\alpha) = \alpha^2/3\pi$, followed by 
elementary integration of the first of eqs.(21). 
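
The “elementary integration” can be mimicked numerically. The sketch below integrates the first of eqs.(21) in the massless limit with the generator $\psi(\alpha) = \alpha^2/3\pi$ by a standard fourth-order Runge-Kutta scheme and compares the result with the closed form (23). The reference value $\alpha = 1/137$, the step count and the function names are assumptions made only for this illustration.

```python
import math

ALPHA = 1.0 / 137.0  # assumed reference value of the coupling at x = 1


def psi(a):
    """One-loop massless generator quoted in the text: alpha^2 / (3*pi)."""
    return a * a / (3.0 * math.pi)


def alpha_one_loop(ln_x, alpha=ALPHA):
    """Closed form of eq. (23)."""
    return alpha / (1.0 - alpha * ln_x / (3.0 * math.pi))


def integrate_rg(ln_x_end, alpha=ALPHA, steps=1000):
    """RK4 integration of d(alpha_bar)/d(ln x) = psi(alpha_bar) from ln x = 0."""
    h = ln_x_end / steps
    a = alpha
    for _ in range(steps):
        k1 = psi(a)
        k2 = psi(a + 0.5 * h * k1)
        k3 = psi(a + 0.5 * h * k2)
        k4 = psi(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


if __name__ == "__main__":
    ln_x = 100.0
    print("numerical :", integrate_rg(ln_x))
    print("eq. (23)  :", alpha_one_loop(ln_x))
```

The two numbers agree to many digits, as they must: eq.(23) is the exact solution of the differential equation with this one-loop generator.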

Moreover, starting from the next order perturbation expression $\bar\alpha^{(2)}_{pt}(x; \alpha)$ containing the 
$\alpha^3 \ln x$ term, we arrived at the second renorm-group approximation (24) by performing the summation 
of the $\alpha^2 (\alpha \ln)^n$ terms. This two-loop solution for the invariant coupling, first obtained in [8], 
contains the nontrivial log-of-log dependence which is now widely known from the “next-to-leading 
logs” approximation for the running coupling in quantum chromodynamics (QCD); see, below, 
eq.(47). 

Comparing solution (24) with (23), one can conclude that the two-loop correction is extremely 
essential just in the vicinity of the ghost pole singularity at $x_1 = \exp(3\pi/\alpha)$. This demonstrates 
that the RG method is a regular procedure, within which it is quite easy to estimate the range of 
applicability of the results. 
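
A rough numerical illustration of this statement follows. It compares the denominators of the one-loop and two-loop expressions for the invariant coupling far from and close to the one-loop ghost pole. The two-loop denominator is taken in the form $1 - \frac{\alpha}{3\pi}\ln x + \frac{3\alpha}{4\pi}\ln(1 - \frac{\alpha}{3\pi}\ln x)$, which is assumed here as the content of eq.(24); the value $\alpha = 1/137$ and the function names are likewise illustrative.

```python
import math

ALPHA = 1.0 / 137.0  # assumed reference value of the coupling at x = 1

# One-loop ghost pole position, ln x_1 = 3*pi/alpha.
LN_X_POLE = 3.0 * math.pi / ALPHA


def one_loop_denominator(ln_x, alpha=ALPHA):
    """alpha / alpha_bar in the one-loop approximation (23)."""
    return 1.0 - alpha * ln_x / (3.0 * math.pi)


def two_loop_denominator(ln_x, alpha=ALPHA):
    """alpha / alpha_bar in the assumed two-loop form of eq. (24)."""
    u = one_loop_denominator(ln_x, alpha)
    return u + 3.0 * alpha / (4.0 * math.pi) * math.log(u)


def relative_two_loop_shift(ln_x, alpha=ALPHA):
    """Two-loop correction to the denominator, relative to its one-loop value."""
    u = one_loop_denominator(ln_x, alpha)
    return (two_loop_denominator(ln_x, alpha) - u) / u


if __name__ == "__main__":
    print("ln x_1 =", LN_X_POLE)
    for fraction in (0.1, 0.9, 0.999):
        print(fraction, relative_two_loop_shift(fraction * LN_X_POLE))
```

Far from the pole the relative shift is a small fraction of a percent, while within a fraction of a percent of $\ln x_1$ it exceeds unity, so the one-loop expression can no longer be trusted there.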

Quite soon, this approach was formulated [9] for the case of QFT with two coupling constants, 
say, $g$ and $h$, namely, for a model of pion-nucleon interactions with the pion self-interaction. To 
the system of functional equations for two invariant couplings 

$$ \bar g^2(x, y; g^2, h) = \bar g^2\left( \frac{x}{t}, \frac{y}{t}; \bar g^2(t, y; g^2, h), \bar h(t, y; g^2, h) \right) , $$

$$ \bar h(x, y; g^2, h) = \bar h\left( \frac{x}{t}, \frac{y}{t}; \bar g^2(t, y; g^2, h), \bar h(t, y; g^2, h) \right) $$





there corresponds a coupled system of nonlinear DEs (see, below, eqs.(52)). It was analysed [12] 
in the one-loop approximation to carry out the UV analysis of the renormalizable model of the 
pion-nucleon interaction. 

In a more general case of arbitrary covariant gauge the RG analysis in QED was carried out 
in [13]. Here, the point was that the charge renormalization is connected only with the transverse 
part of the photon propagator. Therefore, under nontransverse covariant (e.g., Feynman) gauge 
the Dyson transformation has a more complex form. This issue has been resolved by considering 
the gauge parameter as another coupling constant. 

In Refs. [7, 8, 9, 13] and [12] the RG was thus directly connected with practical computations 
of the UV and IR asymptotics. Since then this technique, known as the renormalization group 
method (RGM) and being summarized in the first edition of monograph [1], has become the sole 
means of asymptotic analysis in local QFT. 

1.4 RG in QED 

1.4.1 Effective Electron Charge 

An essential feature of quantum theory is the presence of virtual states and transitions. In QED, 
e.g., the process of virtual dissociation of a photon into an electron-positron pair and vice versa, 
$\gamma \leftrightarrow e^+ + e^-$, can take place. The sequence of two such virtual transitions represents the simplest 
contribution to the effect of vacuum polarization. 

The vacuum polarization processes lead to several specific phenomena and particularly to the 
notion of effective electron charge. To explain this, let us start with a classical analogy. 

Take a polarizable medium consisting of molecules that can be imagined as electric dipoles. 
Insert into it an external electric charge $Q$. Due to the attraction of opposite charges, the dipoles 
change their position so that the charge $Q$ turns out to be partially screened. As a result, at a 
distance $r$ from $Q$ the electric potential will be smaller than the vacuum Coulomb law $Q/r$ and 
can be presented in the form $Q(r)/r$ where, generally, $Q(r) < Q$. The introduced quantity $Q(r)$ is 
known as an effective charge. As $r$ decreases, $Q(r)$ increases, and as $r \to 0$, $Q(r)$ tends to $Q$. 

In QFT the vacuum, i.e., the interparticle space itself, stands for the “polarizable medium”. 
The quantum-field vacuum is not physically empty. It is filled with vacuum fluctuations, i.e., with 
virtual particles. These “zero fluctuations” are a well-known effect of a ground state in the quantum 
world. In QED, zero oscillations consist mainly of short-lived virtual $(e^+, e^-)$ pairs which play the 
role of tiny electric dipoles. 

Consider the process of measuring the electron charge with the help of some external 
electromagnetic field. In the quantum case the probing photon can virtually dissociate into the $(e^+, e^-)$ 
pair. This pair can be treated as a virtual dipole that produces partial screening of the measured 
charge. The simplest process involves two elementary electromagnetic interactions, its 
contribution to an effective charge being proportional to the small number $e^2 = \alpha \simeq 1/137$; and this 
contribution depends on the distance $r$! In the region of $r$ values much smaller than the Compton 
length of the electron, $r_e = \hbar/mc \simeq 3.9 \cdot 10^{-11}$ cm, it depends on $r$ logarithmically 

$$ e \to e(r) = e \left\{ 1 - \frac{\alpha}{3\pi} \ln\frac{r}{r_e} + \ldots \right\} \qquad (26) $$

as was first discussed by Dirac [14] in the middle of the 30s. The $e(r)$ value decreases as $r$ grows. 
So, qualitatively, the QED effective charge behavior corresponds to a classical picture of screening. 
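
To get a feeling for the size of the effect, the leading logarithm of eq.(26) can be evaluated directly. The short sketch below assumes the form $e(r)/e = 1 - \frac{\alpha}{3\pi}\ln(r/r_e)$ with $\alpha = 1/137$ and drops the omitted higher terms; the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.0  # fine structure constant, as quoted in the text


def charge_ratio(r_over_re, alpha=ALPHA):
    """e(r)/e from the leading logarithm of eq. (26); meaningful for r << r_e."""
    return 1.0 - alpha / (3.0 * math.pi) * math.log(r_over_re)


if __name__ == "__main__":
    for ratio in (1e-1, 1e-2, 1e-3):
        print("r/r_e =", ratio, " e(r)/e =", charge_ratio(ratio))
```

Even at a distance a thousand times smaller than $r_e$ the effective charge exceeds its long-distance value by only about half a percent, which shows how slowly the logarithm acts.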

This dependence can be presented by a set of curves $e(r)$. Each curve represents a possible 
behavior of the effective charge $e(r)$ as obtained from the theory and considered without any 
reference to experiment ($\alpha = e^2$ being unspecified numerically). 

The point is that in the classical analogue the value of an external charge $Q$ inserted into the 
polarizable medium is known from the very beginning. In quantum physics this is not the case, and 
a charge value can be measured at not very small distances. The result of measurement generally 
has to be specified by two quantities: the “distance of measurement” $r_i$ and the measured charge 
value $e_i$. Hence, to make the choice from the mentioned set of curves, one has to fix the point on 





the plane with the coordinates $r = r_i$, $e(r) = e_i$. Thus, for the chosen “physical” curve $e(r_i) = e_i$. 
Note that the usual definition of the electron charge by a classical macroscopic (like Millikan) 
experiment corresponds here to very large distances $r \gg r_e$, i.e., $e = e(r \gg r_e) = 1/\sqrt{137}$. 

As is well known, in relativistic microphysics one usually uses the momentum rather than the 
coordinate representation. Correspondingly, instead of $e(r)$ one deals with the quantity $\bar\alpha(Q^2)$, 
the Fourier transform of $e(r)$ squared. It is a monotonically increasing function of its argument 
$Q^2$, the 4-momentum transfer squared. Here and below, the bar denotes a function (distinct from 
$\alpha$, $\alpha_\mu$, $\alpha_i$, its numerical values at some given value of the $Q^2$ argument). The correspondence 
condition with classical electrodynamics now takes the form $\bar\alpha(0) = 1/137$, as in our scale the 
external long-range field corresponds to a photon with vanishing 4-momentum. However, as before, 
to fix one of the possible curves on the plane $(Q, \alpha)$ one has to give a point $Q = \sqrt{Q^2} = \mu$, $\alpha = \alpha_\mu$, 
and hence, for the selected curve, $\bar\alpha(\mu^2) = \alpha_\mu$. 

The parameter $\mu$ is sometimes referred to as a scale parameter. As is clear, it is just the 
momentum magnitude for a photon used for the charge measurement. The effective coupling function 
$\bar\alpha(Q^2)$ describes the dependence of the electron charge value on the measurement conditions. In 
our days the logarithmic corrections to the Millikan value become essential and are measured at 
big accelerators. 

The parameter $\mu$ has no analogue in the QED Lagrangian that reproduces the classical 
electrodynamical one. The phenomenon of its arising in QFT was exaggerated by the term “dimensional 
transmutation”. As was shown, its appearance is very natural and is connected with the 
measurement procedure. 

This is a good place to recall the ideas of Niels Bohr formulated in the middle of the 30s [15] and 
related to the complementarity principle. The point is that to specify a quantum system, it is 
necessary to fix its “macroscopic surrounding”, i.e., to give the properties of the macroscopic devices 
used in the measurement process. Just these devices are described by additional parameters, like 
$\mu$. However, this is not the end of the Bohr (i.e. scale) parameter story. As can be shown, 
in QFT the existence of this parameter leads to a new symmetry lying in the foundation of the 
renormalization group. 

1.4.2 RG transformations 

To do this, consider again the function $\bar\alpha(Q^2)$, having in mind that the physical solution has been 
chosen by the condition $\bar\alpha(Q^2 = \mu^2) = \alpha_\mu$. Assume also for simplicity that we deal with a massless 
QED, more precisely, with the approximation $|Q| \gg m$. This corresponds to the GeV-energy 
region or to distances $r \ll r_e$. Here the effective charge function $\bar\alpha$ can be represented as a 
function of two dimensionless arguments $Q^2/\mu^2$ and $\alpha_\mu$, i.e., $\bar\alpha(Q^2) = \bar\alpha(Q^2/\mu^2, \alpha_\mu)$. 

Now, take into account that the couple of parameters used to identify the physical solution 
may, generally, correspond to any scale $\mu$. Take two scales “1” and “2” with coordinates $\mu_1, \alpha_1$ 
and $\mu_2, \alpha_2$, respectively. It is evident that $\bar\alpha$ can be parameterized by any pair $\mu_i, \alpha_i$; $i = 1, 2, \ldots$ 
so that for arbitrary $Q^2$ values the identity 

$$ \bar\alpha(Q^2/\mu_1^2, \alpha_1) = \bar\alpha(Q^2/\mu_2^2, \alpha_2) $$

should hold. 

At the same time the second argument in the r.h.s., which by definition is equal to $\bar\alpha$ at 
$Q^2 = \mu_2^2$, can be expressed in terms of $\bar\alpha$ parameterized with the help of the point “1” coordinates, i.e., 
$\alpha_2 = \bar\alpha(Q^2 = \mu_2^2) = \bar\alpha(\mu_2^2/\mu_1^2, \alpha_1)$. Combining the last two relations and introducing the notation 
$Q^2/\mu_1^2 = x$, $\alpha_1 = \alpha$, $\mu_2^2/\mu_1^2 = t$, we arrive at the FE 

$$ \bar\alpha(x, \alpha) = \bar\alpha\left( \frac{x}{t}, \bar\alpha(t, \alpha) \right) , \qquad (27) $$

identical with eq.(5). 

Note that the corresponding continuous one-parameter transformation is just the change 
(“1” → “2”) of the parameterization point 

$$ R_t : \{ \mu_1^2 \to \mu_2^2 = t \mu_1^2 , \; \alpha_1 \to \alpha_2 = \bar\alpha(t, \alpha_1) \} . \qquad (28) $$





Thus, we have shown that in the renormalized QED there exists invariance with respect to 
continuous transformations of the group type which involve two quantities and contain a functional 
dependence. This precisely corresponds to definition (6). 

As can be shown, in QED the effective coupling $\bar\alpha$ is equal to the product of $\alpha$ and the dimensionless 
function $d(x, \alpha)$, the transverse photon propagator amplitude with due regard for vacuum 
polarization effects. Generally, in QFT models with one coupling constant the invariant coupling 
$\bar g(x, g)$ can be expressed as a product of $g$, the corresponding vertex function and the square root of 
the propagator amplitudes of the fields participating in the interaction. Usually, this can be done on 
the basis of Dyson finite renormalization transformations. 

Thus, the RG invariance is nothing else but the invariance of a solution with respect to the 
way of its parameterization. For instance, in real QED, instead of using the “Millikan’s value” 
$\bar\alpha(0) = 1/137$ one may take the “CERN value” $\bar\alpha(M_Z^2) \simeq 1/128.9$. 


2 Renormalization group method 

2.1 Basic idea 

An approximate solution of a physical problem with RG symmetry usually does not obey this 
symmetry, which is lost in the course of approximation. This is essential when the solution under 
consideration possesses a singularity, as the singularity structure is commonly destroyed by 
an approximation. 

In QFT, e.g., the usual way of calculation is based on the perturbation method, i.e., on a power 
expansion in $g$. It is not difficult to see that finite sums of this expansion do not satisfy the 
functional group equations. As the simplest illustration, consider the effective coupling $\bar g$ in the 
UV region where the one-loop contribution has a logarithmic form 

$$ \bar g^{(1)}_{pt}(x, g) = g + g^2 \beta \ln x \qquad (29) $$

with $\beta$ a numerical coefficient. By substituting this expression into FE (5), after simple 
manipulation one has 

$$ \mathrm{Discr}[\bar g^{(1)}_{pt}] = \bar g^{(1)}_{pt}(x, g) - \bar g^{(1)}_{pt}\left( \frac{x}{t}, \bar g^{(1)}_{pt}(t, g) \right) $$

$$ = [g + g^2 \beta \ln x] - [g + g^2 \beta \ln x + 2 g^3 \beta^2 \ln t \, \ln(x/t) + \ldots] \neq 0 , $$

an error of the $g^3$ order. This discrepancy can be removed by the addition of a particular next 
order term to the r.h.s. of (29) 

$$ \bar g^{(2)}_{pt} = g + g^2 \beta \ln x + g^3 \beta^2 \ln^2 x . $$

This “improved” expression would yield a discrepancy of the $g^4$ order, which in its turn can 
be abolished by adding the $g^4 \ln^3 x$ term to (29), and so on. 

Thus, we see, on the one hand, that the finite polynomials cannot satisfy the condition of 
renormalization invariance. On the other hand, we can conclude that the functional RG equation 
represents a tool for iterative reconstruction of renorm-invariant expression that has the form of 
infinite series. 
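
The statement above can be checked numerically. The following is only a sketch: it assumes that FE (5) has the form $\bar g(x, g) = \bar g(x/t, \bar g(t, g))$, uses a hypothetical value for the coefficient $\beta$ of (29), and takes the geometric sum $g/(1 - g\beta\ln x)$ as the closed form of the series built by the iteration. The names `BETA`, `g_one_loop`, `g_improved`, `g_summed` and `discrepancy` are illustrative.

```python
import math

BETA = 0.5  # hypothetical value of the numerical coefficient beta in (29)

def g_one_loop(x, g):
    """One-loop truncation (29): g + g^2 beta ln x."""
    return g + g**2 * BETA * math.log(x)

def g_improved(x, g):
    """Truncation with the extra g^3 beta^2 ln^2 x term added."""
    log_x = math.log(x)
    return g + g**2 * BETA * log_x + g**3 * BETA**2 * log_x**2

def g_summed(x, g):
    """Geometric sum of all terms g (g beta ln x)^n (assumed closed form)."""
    return g / (1.0 - g * BETA * math.log(x))

def discrepancy(g_bar, x, t, g):
    """g_bar(x, g) - g_bar(x/t, g_bar(t, g)); vanishes if the group FE holds."""
    return g_bar(x, g) - g_bar(x / t, g_bar(t, g))

if __name__ == "__main__":
    x, t = 20.0, 3.0
    for g in (0.02, 0.01):
        print(g,
              discrepancy(g_one_loop, x, t, g),
              discrepancy(g_improved, x, t, g),
              discrepancy(g_summed, x, t, g))
```

Halving $g$ reduces the first discrepancy roughly eightfold (order $g^3$) and the second roughly sixteenfold (order $g^4$), while the summed expression satisfies the functional equation up to rounding.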

This example illustrates a rather general situation. As a rule, approximate solutions do not
satisfy a group symmetry. Here, this happens in the UV limit as $\ln x \to \infty$, where the observed
discrepancy becomes important.

Another illustration is provided by the one-dimensional transfer problem (for detail, see our
reviews in Refs. [16]). Take a half-space ($l > 0$) filled with a homogeneous medium. Let some given
amount of particles (or radiation) fall on the surface (at $l = 0$) from the empty half-space.
The particle density $n(l, \mathbf{v})$ is a function of the coordinate and of the particle velocity
$\mathbf{v}$. It satisfies an integro-differential kinetic Boltzmann equation. In some cases one can
neglect the energy dependence of cross-sections. Here, the solution can be treated as a function of
coordinate and





direction of particle velocity $\Omega = \mathbf{v}/v$. In this "one-velocity" case, the simple symmetry
of the RG type was found not for the density, but for $n$ integrated over directions in the forward
hemisphere

$$G(l) = \int n(l, \Omega)\, d\Omega .$$

The function $G$ relates to the amount of all particles moving into the medium. A particular solution
of this problem will depend on the boundary condition at $l = 0$. Correspondingly, the solution
characteristic $G$ will be a function of two arguments $G(l, g)$: the coordinate $l$ (distance from the
boundary) and the total amount of ingoing particles $g = G(l = 0) = G(0, g)$. Just this function $G(l, g)$
satisfies [17] the group FE (10) in the additive form.

It is rather simple to get an approximate behavior of this density $G(l, g)$ at small $l$

$$G(l, g) = g + l\, G'(0, g) , \quad l \ll 1 , \tag{30}$$

which, being considered for large $l$ values, also does not obey the mentioned symmetry.

On this basis one can set the task of "renormalization-invariant improvement" of perturbative
results. The key idea is to combine an approximate solution with the group equations. The simplest
and most convenient way for this "marriage" is the use of Lie equations, i.e., group differential
equations. The renormalization group method (RGM) as it was first formulated in Refs. [7, 8, 10]
is essentially based on these group equations.


2.2 Differential formulation 

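The differential equations can be obtained from the functional ones in two different ways.
Differentiating eq. (15) with respect to $x$ and then putting $t = x$, one obtains (compare with (4)):

$$x \frac{\partial \bar g(x, y, g)}{\partial x} = \beta\left(\frac{y}{x}, \bar g(x, y, g)\right) , \qquad \beta(y, g) = \left.\frac{\partial \bar g(\xi, y, g)}{\partial \xi}\right|_{\xi = 1} . \tag{22a}$$

The nonlinear equation (22a) can be considered as a "massive" generalization of eq. (4). On the other
hand, one can differentiate eq. (15) with respect to $t$ at the point $t = 1$, which yields

$$\left\{ x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y} - \beta(y, g) \frac{\partial}{\partial g} \right\} \bar g(x, y, g) = 0 , \tag{31}$$

a linear partial differential equation (PDE).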

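Analogous operations applied to the second of eqs. (19) lead to

$$\frac{\partial \ln s(x, y, g)}{\partial \ln x} = \gamma\left(\frac{y}{x}, \bar g(x, y, g)\right) \tag{22b}$$

and

$$\left\{ x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y} - \beta(y, g) \frac{\partial}{\partial g} - \gamma(y, g) \right\} s(x, y, g) = 0 , \tag{32}$$

where

$$\gamma(y, g) = \left.\frac{\partial s(t, y, g)}{\partial t}\right|_{t = 1} \tag{33}$$

is the so-called anomalous dimension of $s$. For a group invariant, like, e.g., a matrix element $M$
satisfying FE (8), this dimension is equal to zero. The corresponding PDE looks like

$$\left\{ x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y} - \beta(y, g) \frac{\partial}{\partial g} \right\} M(x, y, g) = 0 . \tag{34}$$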

Equations (31), (32), and (34) express the independence of the r.h.s. of the related functional
group equations from the parameter $t$, i.e., a mutual compensation of $t$ dependences via three





(or more) arguments. These DEs can be called compensational equations to distinguish them from the
nonlinear eqs. (21), which can be referred to as evolutional group equations.

We stress that the compensational as well as the evolutional DEs, taken together with normalization
(i.e., boundary) conditions like $\bar g(1, g) = g$, $s(1, g) = 1$, are equivalent to the functional
equations and to each other. At the same time, evolutional Lie equations turn out to be more convenient
for practical construction of the solution, the generators $\beta$, $\gamma$ being given.

Let us comment also that the UV limit of compensational DEs like, e.g.,

$$\left\{ x \frac{\partial}{\partial x} - \beta(g) \frac{\partial}{\partial g} - \gamma(g) \right\} s(x, g) = 0$$

coincides with the UV limit of the specific nonclosed equation

$$\left\{ x \frac{\partial}{\partial x} - \beta(g) \frac{\partial}{\partial g} - \gamma(g) \right\} s(x, g) = \Delta S \tag{35}$$

obtained in the early 70s by Callan and Symanzik. The r.h.s. of this equation contains the result
of mass counter-term insertion into all internal lines of all diagrams for the function $s$ under
consideration. For this reason, in the current literature compensational equations are often referred
to as the Callan-Symanzik equations. However, these equations just in the form (31) and (32) were
first obtained by Lev Ovsyannikov in 1956 while solving [18] functional RG equations. Therefore,
we consider it justifiable to associate the compensational DEs with Ovsyannikov's name rather than
with any other.

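It is not difficult to formulate group DEs for a multi-coupling case by proper differentiation of
FEs (16). For instance, the system of evolutional DEs looks like

$$x \frac{\partial \bar g_i(x, y, \{g\})}{\partial x} = \beta_i\left(\frac{y}{x}, \{\bar g(x, y, \{g\})\}\right) . \tag{36}$$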


2.3 General solution 

The general solution of the group FEs was obtained in the paper [18] by applying the theory of PDEs
to the compensational eqs. (31) and (32). Details of the derivation can be found in Section 48.3
of the third edition of the monograph [1]. The results obtained can be formulated as follows:

To every solution of DE (31), there corresponds some function of two arguments $F(y, g)$,
reversible with respect to its second argument and connected to $\bar g$ by the relation

$$F(y, g) = F\left(\frac{y}{x}, \bar g(x, y, g)\right) . \tag{37}$$

The explicit form of $\bar g$ can now be obtained by reversing the r.h.s.:

$$\bar g(x, y, g) = F^{(-1)}\left\{\frac{y}{x}, F(y, g)\right\} .$$

To determine $F$ it is sufficient to specify the generator $\beta(y, g)$.

Note also that to get from the Ovsyannikov result (37) the solution in the UV limit, i.e., in the
massless case at $y = 0$, one has to assume for $F$ a specific limiting form

$$F(y, g) = y \exp[f(g)] \quad \text{or} \quad F(y, g) = \ln y + f(g) \quad \text{as } y \to 0 ,$$

which yields

$$f\{\bar g(x, g)\} - f(g) = \ln x ; \qquad \bar g = f^{-1}\{\ln x + f(g)\} . \tag{38}$$

Here, $f'(g) = 1/\beta(g)$. This is equivalent to the Gell-Mann–Low–Lee solution (18).
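
As a short consistency check (a sketch added for illustration, assuming that the massless functional equation has the form $\bar g(x, g) = \bar g(x/t, \bar g(t, g))$), one can verify that any $\bar g$ of the form (38) has the group property and obeys the evolutional equation with $\beta = 1/f'$:

```latex
\begin{align*}
f\{\bar g(x/t, \bar g(t, g))\}
  &= f\{\bar g(t, g)\} + \ln(x/t)
   = f(g) + \ln t + \ln x - \ln t \\
  &= f(g) + \ln x = f\{\bar g(x, g)\}, \\[4pt]
\frac{\partial}{\partial \ln x}\, f\{\bar g(x, g)\} = 1
  \;&\Longrightarrow\;
  x \frac{\partial \bar g(x, g)}{\partial x}
   = \frac{1}{f'(\bar g)} = \beta(\bar g(x, g)).
\end{align*}
```

Since $f$ is reversible, the first chain of equalities implies $\bar g(x/t, \bar g(t, g)) = \bar g(x, g)$.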

To every solution of eq. (19) for a function $s$, there corresponds some function $S(y, g)$ related
to $s$ by

$$s(x, y, g) = \frac{S(y, g)}{S\left(y/x, \bar g(x, y, g)\right)} . \tag{39}$$





Let us give also the general solution of the same type for the system (16) for the $k$-couplings case.
It can be written down in terms of $k$ arbitrary functions $F_i$, reversible simultaneously with respect
to their last arguments, and defined from the system of $k$ functional relations

$$F_i(y, \{g\}) = F_i\left(\frac{y}{x}, \{\bar g(x, y, \{g\})\}\right) , \qquad \{g\} = g_1, \ldots, g_k , \quad i, j = 1, \ldots, k . \tag{40}$$

All solutions (37)-(40) satisfy the usual normalization conditions.

The transition to the massless limit in expressions (39)-(40) can be performed by a trick
analogous to the one given above. Then, e.g.,

$$s(x, g) = \frac{S(g)}{S(\bar g(x, g))} .$$

Let us also formulate the solution for the 2-coupling case $g_1 = g$, $g_2 = h$ in the massless limit in
the form analogous to (38):

$$f_i(\bar g, \bar h) = f_i(g, h) + \ln x , \quad i = 1, 2 . \tag{41}$$

From the solutions presented it follows that imposing group symmetry one reduces by unity
the number of independent arguments.


2.4 RGM algorithm 

2.4.1 Technology of RG Method 

The idea of the marriage of an approximate solution [7, 8] with group symmetry can be realised with
the help of group DEs. If we define the group generators $\beta$, $\gamma$ from some approximate
solutions and then solve the evolutional DEs, we obtain RG-improved solutions that obey the group
symmetry and correspond to the approximate solutions used as an input.

Now we can formulate an algorithm for improving an approximate solution. The procedure is
given by the following recipe, which we illustrate by the massless one-coupling case (4) and (5):
Assume some approximate solution $\bar g_{\rm appr}$ known.

1. On the basis of eq. (22a) define the beta-function

$$\beta(g) = \left.\frac{\partial \bar g_{\rm appr}(\xi, g)}{\partial \xi}\right|_{\xi = 1} . \tag{42}$$

2. Integrate eq. (4), i.e., construct the function

$$f(g) = \int^{g} \frac{d\gamma}{\beta(\gamma)} . \tag{43}$$

3. Resolve eq. (38)

$$\bar g_{\rm rg}(x, g) = f^{-1}\{f(g) + \ln x\} . \tag{44}$$

4. Then, the solution $\bar g_{\rm rg}$ precisely satisfies the RG symmetry, i.e., it is an exact
solution of eq. (5) and corresponds to $\bar g_{\rm appr}$.

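For illustration, take as $\bar g_{\rm appr}$ the simplest perturbative expression (29) for the invariant
coupling, written with $\beta = -\beta_1$. Here, the $\beta$-function is $\beta(g) = -\beta_1 g^2$, and
the integration yields

$$\int_g^{\bar g} \frac{d\gamma}{\beta(\gamma)} = \frac{1}{\beta_1}\left(\frac{1}{\bar g} - \frac{1}{g}\right) .$$

The solution obtained

$$\bar g(x, g) = \frac{g}{1 + g \beta_1 \ln x} , \tag{45}$$

on the one hand, exactly satisfies the RG symmetry and, on the other, being expanded in powers of
$g$, correlates with the input (29).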
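
The recipe can also be carried out purely numerically. The sketch below is an illustration under stated assumptions, not the procedure of Refs. [7, 8]: the one-loop input is taken as $g - g^2\beta_1\ln x$ with a hypothetical value of $\beta_1$, the derivative in step 1 is a central difference, the integral in step 2 is done by Simpson's rule, and step 3 uses bisection in a bracket that is valid only for $x > 1$ and $1 + g\beta_1\ln x \le 100$. All function names are illustrative.

```python
import math

BETA1 = 0.7  # hypothetical one-loop coefficient beta_1

def g_appr(x, g):
    """Approximate input: one-loop perturbative form g - g^2 beta_1 ln x."""
    return g - g**2 * BETA1 * math.log(x)

def beta(g, h=1e-6):
    """Step 1: beta(g) = d g_appr(xi, g)/d xi at xi = 1 (central difference)."""
    return (g_appr(1.0 + h, g) - g_appr(1.0 - h, g)) / (2.0 * h)

def f_difference(g_from, g_to, n=2000):
    """Step 2: f(g_to) - f(g_from), the integral of 1/beta by Simpson's rule."""
    step = (g_to - g_from) / n
    total = 1.0 / beta(g_from) + 1.0 / beta(g_to)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight / beta(g_from + k * step)
    return total * step / 3.0

def g_rg(x, g, iterations=50):
    """Step 3: solve f(g_rg) - f(g) = ln x by bisection (assumes x > 1)."""
    target = math.log(x)
    low, high = g / 100.0, g   # bracket: beta < 0 here, so g_rg < g for x > 1
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if f_difference(g, mid) > target:
            low = mid          # mid lies below the root
        else:
            high = mid
    return 0.5 * (low + high)

if __name__ == "__main__":
    x, g = 100.0, 0.3
    print(g_rg(x, g), g / (1.0 + g * BETA1 * math.log(x)))
```

For this input the numerical result agrees with the closed form $g/(1 + g\beta_1\ln x)$; the same three steps apply to any other approximate input for which the bracket assumption holds.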





2.4.2 RGM usage in QFT 

As explained above in Section 2.1, the QFT perturbation expression of finite order does not obey
the RG symmetry. On the other hand, in Section 2.4.1 it was shown that the one-loop UV approximation
for $\bar g$, used as an input in eq. (22) for the construction of a group generator $\beta(g)$,
yields expression (45) that obeys the group symmetry and exactly satisfies FE (5).

Now, using the geometric progression (45) as a hint, let us represent the 2-loop perturbative
approximation for $\bar g$ in the form

$$\bar g_{\rm pt}^{(2)} = g - g^2 \beta_1 \ln x + g^3 \left[\beta_1^2 \ln^2 x - \beta_2 \ln x\right] + O(g^4) ,$$

where $\beta_1$ and $\beta_2$ are the $\beta$-function coefficients at the one-loop and two-loop level,
respectively. If we substitute this expression into eq. (5), we obtain a discrepancy of the $g^4$ order,

$$\bar g_{\rm pt}^{(2)}(x, g) - \bar g_{\rm pt}^{(2)}\left(x/t, \bar g_{\rm pt}^{(2)}(t, g)\right) \sim g^4 \ln(x/t) \ln^2 t .$$

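Meanwhile, we can use $\bar g_{\rm pt}^{(2)}$ as an input in eq. (42). Now step 1 yields

$$\beta^{(2)}(g) = -\beta_1 g^2 - \beta_2 g^3 ,$$

and then (step 2)

$$\beta_1 f^{(2)}(z) = -\int^{z} \frac{d\gamma}{\gamma^2 + b \gamma^3} = \frac{1}{z} + b \ln\frac{z}{1 + bz} , \qquad b = \frac{\beta_2}{\beta_1} . \tag{46}$$

To make the last step, we have to start with the equation

$$f^{(2)}\left[\bar g_{\rm rg}^{(2)}(x, g)\right] = f^{(2)}(g) + \ln x ,$$

which is a transcendental one and has no simple explicit solution⁶. Due to this, one has to resolve
this relation approximately. Take into account that the second, logarithmic, contribution to
$f^{(2)}(z)$ in (46) is a small correction to the first one at $bz \ll 1$. Under this reservation we can
substitute the one-loop RG expression (45) instead of $\bar g_{\rm rg}^{(2)}$ into this correction and
obtain the explicit expression

$$\bar g_{\rm rg}^{(2)}(x, g) = \frac{g}{1 + g \beta_1 l + g (\beta_2/\beta_1) \ln[1 + g \beta_1 l]} , \qquad l = \ln x . \tag{47}$$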

This result (first obtained [8] in the mid-50s) is interesting in several aspects.

First, being expanded in powers of $g$ and $gl$, it produces an infinite series containing "leading",
i.e. $\sim g(gl)^n$, and "next-to-leading", $\sim g^2(gl)^n$, UV logarithmic contributions. Second, it
contains a nontrivial analytic dependence

$$\ln(1 + g \beta_1 l) \sim \ln(\ln Q^2) ,$$

which is absent in the perturbation input. Third, being compared with eq. (45), it demonstrates
an algorithm of successive improvement of accuracy, i.e., the regularity of the RGM.
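
The quality of the iterative two-loop expression can be probed numerically. The following sketch assumes the two-loop generator $\beta(g) = -\beta_1 g^2 - \beta_2 g^3$ with hypothetical coefficient values, integrates the evolutional equation $d\bar g/dl = \beta(\bar g)$ by a classical Runge-Kutta scheme, and compares the result with the one-loop form $g/(1 + g\beta_1 l)$ and with the iterative form $g/(1 + g\beta_1 l + g(\beta_2/\beta_1)\ln[1 + g\beta_1 l])$. The function names are illustrative.

```python
import math

BETA1, BETA2 = 0.7, 0.4  # hypothetical one- and two-loop coefficients

def beta_two_loop(g):
    """Two-loop generator -beta_1 g^2 - beta_2 g^3."""
    return -BETA1 * g**2 - BETA2 * g**3

def g_numeric(l, g, steps=2000):
    """Integrate d g_bar/d l = beta(g_bar) from 0 to l with classical RK4."""
    h = l / steps
    y = g
    for _ in range(steps):
        k1 = beta_two_loop(y)
        k2 = beta_two_loop(y + 0.5 * h * k1)
        k3 = beta_two_loop(y + 0.5 * h * k2)
        k4 = beta_two_loop(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

def g_rg_one_loop(l, g):
    """One-loop RG expression g / (1 + g beta_1 l)."""
    return g / (1.0 + g * BETA1 * l)

def g_rg_two_loop(l, g):
    """Iterative two-loop RG expression with the extra logarithm."""
    d = 1.0 + g * BETA1 * l
    return g / (d + g * (BETA2 / BETA1) * math.log(d))

if __name__ == "__main__":
    l, g = 10.0, 0.1
    exact = g_numeric(l, g)
    print(exact, g_rg_one_loop(l, g) - exact, g_rg_two_loop(l, g) - exact)
```

With these inputs the iterative two-loop form deviates from the numerical solution far less than the one-loop form does, which is the successive improvement referred to above.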

Now we can summarize the RGM properties. The RGM is a regular procedure of combining
dynamical information (taken from an approximate solution) with the RG symmetry. The essence
of the RGM is the following:

1) The mathematical tool used in the RGM is Lie differential equations.

2) The key element of the RGM is the possibility of an (approximate) determination of group
generators from dynamics.

3) The RGM works effectively in the case when a solution has a singular behaviour. It restores
the structure of singularity compatible with RG symmetry.


⁶ It can be expressed in terms of a special function, the Lambert $W$-function, defined by $W(z) e^{W(z)} = z$; see, e.g., [19].





3 RG in QFT 

This section is devoted, mainly, to general topics of RG applications to the short-distance
asymptotic behavior in QFT. We discuss the specific features of the UV analysis connected with the
use of perturbation theory, in particular, the reliability of results.


3.1 UV analysis in general 

3.1.1 One-coupling case

General analysis of the UV asymptotic behavior for the one-coupling QFT model can be performed
rather simply on the basis of the solution

$$\int_g^{\bar g(x, g)} \frac{d\gamma}{\beta(\gamma)} = \ln x = \ln\frac{Q^2}{\mu^2} \tag{48}$$

of the massless RG equation (4) for an effective coupling, with $g = \bar g(1, g)$ and $\mu$ a reference
point. As follows from it, the asymptotics at $\ln x \to \infty$ corresponds to the divergence of the
l.h.s. integral at the upper limit. Depending on the features of the $\beta$ function, the resultant UV
behaviour of the invariant coupling $\bar g$ differs very much.

Suppose that at very small $g$ values the beta-function is positive. Then, several cases are possible.

a) Consider first the situation with

$$\int_g^{g_\infty} \frac{d\gamma}{\beta(\gamma)} = \infty , \qquad g_\infty < \infty , \tag{52a}$$

corresponding to the case when the beta function hits a zero at some finite point $g = g_\infty$.
Here, the UV asymptotic value of the effective coupling is finite, $\bar g(\infty, g) = g_\infty < \infty$,
which relates to a finite renormalization of the coupling constant.

Using the terminology of the qualitative theory of DEs, one can say that at $g = g_\infty$ we have a UV
fixed point.

If at $g_\infty$ there is a first order zero, $\beta(g) \simeq \rho\,(g_\infty - g)$, then eq. (52a) gives

$$\bar g(x, g) \to g_\infty - C e^{-\rho \ln x} = g_\infty - C x^{-\rho} , \tag{50}$$

i.e., in the vicinity of a fixed point we have an asymptotic power regime.

b) If $\beta$ is (monotonically) increasing as $g \to \infty$, but gently enough that

$$\int_g^{\infty} \frac{d\gamma}{\beta(\gamma)} = \infty , \tag{52b}$$

then the effective coupling tends to infinity,

$$\lim_{x \to \infty} \bar g(x, g) = \infty ,$$

which corresponds to infinite coupling constant renormalization. Formally, this is equivalent to
$g_\infty = \infty$.


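c) If

$$\int_g^{\infty} \frac{d\gamma}{\beta(\gamma)} = L = \ln x_\infty < \infty , \tag{52c}$$

which happens if $\beta$ grows fast enough at large $g$, e.g.,

$$\lim_{g \to \infty} \beta(g)/g^2 > \mathrm{const} ,$$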





the theory has an inner contradiction, since

$$\bar g(x_\infty, g) = \infty \quad \text{at } x_\infty < \infty$$

and the momentum region $x > x_\infty$ cannot be described by the theory. We encounter here the ghost
trouble, as explained below in this Section.

Up to now we have assumed that the generator $\beta(g)$ is positive. In the opposite case

d) $\beta(g) = -b(g) < 0$, one has to deal with the equation

$$\int_{\bar g(x, g)}^{g} \frac{d\gamma}{b(\gamma)} = \ln x \tag{52d}$$

and study a possible divergence of the integral involved at the lower limit.

If this occurs at some finite value $g = g_\infty$, the situation is quite analogous to the case a). The
only difference is that now the effective coupling tends to its limiting value $g_\infty$ from above.

As the most important case, we consider the possibility when the singularity lies at the origin,
$\beta(0) = 0$, which happens in QCD. Then $\bar g$ vanishes, $\bar g(\infty, g) = 0$, in the UV limit,
which corresponds to the asymptotic freedom phenomenon. E.g., if we assume here that
$\beta(g) = -\beta_1 g^2$ at $g \to 0$, then

$$\bar g(x, g) \to \frac{1}{\beta_1 \ln x} \quad \text{as } x \to \infty . \tag{51}$$
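
The qualitatively different UV regimes of this subsection can be illustrated by integrating $d\bar g/dl = \beta(\bar g)$, $l = \ln x$, for a few model generators. The three generators below are hypothetical toy choices made only for this sketch (they are not taken from any particular field model), and `evolve` is an illustrative helper.

```python
import math

def evolve(beta, g0, l_max, h, g_stop=float("inf")):
    """RK4 integration of d g_bar/d l = beta(g_bar); stops early if g_bar > g_stop.

    Returns the pair (l reached, g_bar at that l).
    """
    l, y = 0.0, g0
    while l < l_max and y <= g_stop:
        k1 = beta(y)
        k2 = beta(y + 0.5 * h * k1)
        k3 = beta(y + 0.5 * h * k2)
        k4 = beta(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        l += h
    return l, y

# Three hypothetical generators, chosen only to illustrate the cases above.
fixed_point = lambda g: g * (1.0 - g)   # zero at g_inf = 1: case a)
ghost_pole = lambda g: g**2             # integral converges at infinity: case c)
asym_free = lambda g: -g**2             # negative generator, zero at the origin

_, g_a = evolve(fixed_point, 0.2, 20.0, 0.01)
l_c, _ = evolve(ghost_pole, 0.5, 3.0, 1e-4, g_stop=1e3)
l_d, g_d = evolve(asym_free, 0.5, 200.0, 0.01)

if __name__ == "__main__":
    print("case a): g_bar tends to the fixed point:", g_a)
    print("case c): g_bar blows up near l =", l_c)
    print("asymptotic freedom: g_bar * l =", g_d * l_d)
```

In the first case the coupling saturates at the zero of $\beta$, in the second it blows up at a finite $l$ (here close to $1/g = 2$), and in the third the product $\bar g\, l$ approaches $1/\beta_1 = 1$ in agreement with (51).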


3.1.2 Multi-coupling case

For the quantum field model with several coupling constants one has to consider the system of
coupled functional (16) or differential equations. The latter can be analyzed by the well-known
methods of the qualitative theory of differential equations.

Take the case with two coupling constants $g$ and $h$. The system of evolutional differential
equations is

$$\dot{\bar g} = \beta_g(\bar g, \bar h) , \quad \dot{\bar h} = \beta_h(\bar g, \bar h) ; \qquad \dot f \equiv df/dl , \quad l = \ln x . \tag{52}$$

According to (41), the general solution to this system is of the form

$$F(\bar g, \bar h) = F(g, h) + l , \quad \Phi(\bar g, \bar h) = \Phi(g, h) + l ,$$

where $F$ and $\Phi$ are two arbitrary reversible functions.

Since the argument $l = \ln x$ does not enter explicitly into the generators $\beta_g$ and $\beta_h$, it
can formally be excluded by dividing one of the equations (52) by the other:

$$\frac{d\bar g}{d\bar h} = \frac{\beta_g(\bar g, \bar h)}{\beta_h(\bar g, \bar h)} . \tag{53}$$

This equation can be analyzed on the two-dimensional phase plane $(g, h)$.

The first explicit example of such a phase portrait was obtained in the mid-fifties by I. Ginzburg
[12]; see also Section 51.4 in the third edition of the monograph [1]. The essential features are
now singular points and singular solutions. Singular points correspond to $\beta_i = 0$ (or $= \infty$).
They can be of different types: a stable fixed point, known as an attractor, an unstable fixed point
and a saddle-type point. In the vicinity of the UV attractor one can have a power scaling behavior
as in eq. (50). A singular solution, or separatrix, joins singular points and can also be stable or
unstable. Generally, the unstable ones separate the parts of the phase plane with different UV
asymptotes that correspond to UV stable separatrices.





3.2 Perturbative approach to the UV asymptote 
3.2.1 Structure of RG results

Consider a general situation with the RG approach to the UV asymptotic behavior based on
perturbation calculation input. In the one-coupling QFT case, the group generators entering into the
DEs can be written as

$$\beta(g) = \beta_1 g^2 + \beta_2 g^3 + \ldots , \qquad \gamma(g) = \psi_1 g + \psi_2 g^2 + \ldots \tag{54}$$

Generally, the expansion coefficients depend on the mass variable

$$\beta(y, g) = \sum_{i \ge 1} \beta_i(y)\, g^{i+1} . \tag{55}$$

Note that if $g$ is just the $S$-matrix expansion parameter (that can be equal to the coupling
constant or to its square), then usually the first term in the expansion for $\beta$ is quadratic and
for $\gamma$ linear, as is explicitly indicated above.

Substituting (54) into (48) and re-expanding the ratio, we obtain after integration

$$\frac{1}{g} - \frac{1}{\bar g} - \frac{\beta_2}{\beta_1} \ln\frac{\bar g}{g} - b_3 (\bar g - g) + O(\bar g^2, g^2) = \beta_1 \ln x , \tag{56}$$

$$\ln s(x, g) = \frac{\psi_1}{\beta_1} \ln\frac{\bar g(x, g)}{g} + c_2 (\bar g - g) + O(g^2) , \tag{57}$$

$$b_3 = \frac{\beta_1 \beta_3 - \beta_2^2}{\beta_1^2} , \qquad c_2 = \frac{\beta_1 \psi_2 - \beta_2 \psi_1}{\beta_1^2} .$$

As follows, the solutions $\bar g$ and $s$ depend on two arguments, $g$ and $g \ln x$. By expanding them
in powers of $g$ we get

$$\bar g(x, g) = g f_1(g \ln x) + g^2 f_2(g \ln x) + \ldots , \qquad \ln s(x, g) = \varphi_1(g \ln x) + g \varphi_2(g \ln x) + \ldots ,$$

where $f_i$ and $\varphi_i$ have a simple form. For example, $f_1(z) = (1 - \beta_1 z)^{-1}$,
$f_2(z) \sim f_1^2(z) \ln f_1(z)$. Comparing the expressions obtained with the usual perturbative expansions

$$\bar g_{\rm pt}(x, g) = g + g^2 \beta_1 \ln x + g^3 \left[\beta_1^2 \ln^2 x + \beta_2 \ln x\right] + O(g^4) ,$$

$$s_{\rm pt}(x, g) = 1 + g \psi_1 \ln x + g^2 \left[\tfrac{1}{2} \psi_1 (\psi_1 + \beta_1) \ln^2 x + \psi_2 \ln x\right] + O(g^3) ,$$

used as an input to obtain our starting generators, one can see the qualitative effect of using the
RGM. In the case considered, it changes the region of applicability of the perturbation method,
limited by the condition $g \ln x \ll 1$, to a larger region defined by two relations

$$g \ll 1 , \qquad \bar g(x, g) \ll 1 , \tag{59}$$

the second of which is defining.

3.2.2 The ghost-pole trouble 

Turn now to the one-loop RG approximation for the effective coupling $\bar g$, considered in the UV,
i.e., massless limit. According to (45) and (23), it has the form

$$\bar g_{(1)}(x, g) = \frac{g}{1 - \beta_1 g \ln x} .$$

Let the numerical coefficient $\beta_1$ be positive. Such is the case in QED, where $\beta_1 = 1/3\pi$
and $g$ stands for the expansion parameter $\alpha = e^2$. This expression obviously has a pole
singularity at

$$x = x_* = \exp(1/\beta_1 g) = \exp(3\pi/\alpha) .$$
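
To get a feeling for the scale involved, the position of this pole can be evaluated directly. The short sketch below uses $\alpha = 1/137$ and $\beta_1 = 1/3\pi$ as quoted in the text; the names `l_star`, `log10_x_star` and `alpha_one_loop` are illustrative.

```python
import math

ALPHA = 1.0 / 137.0            # low-energy QED coupling quoted in the text
BETA1 = 1.0 / (3.0 * math.pi)  # one-loop QED coefficient

def alpha_one_loop(l):
    """One-loop RG coupling alpha / (1 - beta_1 alpha l), with l = ln x."""
    return ALPHA / (1.0 - BETA1 * ALPHA * l)

l_star = 1.0 / (BETA1 * ALPHA)          # ln x_* = 3 pi / alpha
log10_x_star = l_star / math.log(10.0)  # decimal exponent of x_*

if __name__ == "__main__":
    print("ln x_* =", l_star)
    print("log10 x_* =", log10_x_star)
    print("sign change across the pole:",
          alpha_one_loop(0.99 * l_star), alpha_one_loop(1.01 * l_star))
```

The pole sits at $\ln x_* \approx 1291$, i.e. $x_* \sim 10^{560}$, and the one-loop coupling changes sign when the pole is crossed.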





Since the QED effective coupling is proportional to the (transverse part of the) photon propagator,
this pole, generally, describes some bound state of a system with the photon quantum
numbers. However, a pole related to a physical bound state must have a positive residue, while the
l.h.s. of eq. (23) has a negative one.

This means that it corresponds not to a physical but rather to some unphysical, so-called
ghost, state. The presence of a ghost singularity can be treated as a signal of inconsistency of
a theory. Such claims were made [20] in the mid-50s, when the ghost-pole trouble was first
discovered [21], just before the birth of the RGM.

The RG method proved to be very effective for a general discussion of the ghost-pole issue. The
first question that must be answered here is whether the indication of the ghost-pole existence
is stable with respect to multi-loop corrections.

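Note that in a perturbation calculation, the $\beta$-function depends on the adopted renormalization
procedure; in the massless case, starting with the 3-loop level the coefficients of the perturbation
series (54) depend on the renormalization scheme (RS) used. In QED, the 3-loop $\beta$ function in
the MOM (i.e., momentum subtraction) scheme is

$$\beta^{\rm MOM}(\alpha) = \frac{\alpha^2}{3\pi} + \frac{\alpha^3}{4\pi^2} + \frac{\alpha^4}{8\pi^3} \left(\frac{8}{3}\zeta(3) - \frac{101}{36}\right) . \tag{60}$$

The numerical value of the last parenthesis is about 0.4. Neglecting it for the moment, we
start our discussion with the two-loop approximation for the $\beta$ function. According to eq. (24), the
2-loop iterative RG solution is

$$\bar\alpha_{(2)}(x, \alpha) = \frac{\alpha}{1 - \dfrac{\alpha}{3\pi}\, l + \dfrac{3\alpha}{4\pi} \ln\left(1 - \dfrac{\alpha}{3\pi}\, l\right)} , \qquad l = \ln x .$$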

This solution, accurate up to next-to-leading logarithms, is interesting from several points of view.
First, as has been mentioned before, its $\alpha$ expansion contains, besides the leading logs, an
infinite number of next-to-leading terms $\alpha^2(\alpha l)^n$, the first of which has been used as an
input for the construction of the $\beta$ function. Second, in the vicinity of the ghost pole of the
one-loop RG solution at $l_* = 3\pi/\alpha$, the two-loop solution $\bar\alpha_{(2)}$ differs from
$\bar\alpha_{(1)}$ considerably. Hence, an infinite sum of the next-to-leading logarithmic contributions
becomes important in the region $\alpha l \sim 1$.

This is not trivial, because for each order of the perturbation input the next-to-leading term
is negligible compared with the leading one of the previous order (the ratio being of the order of
$\alpha\beta_2/\beta_1 = 3\alpha/4\pi \approx 2 \cdot 10^{-3}$). It can be seen with the help of (56) that
allowing for the last, 3-loop, term in (60) also becomes essential for the $\bar\alpha \sim 1$ case.
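
A quick numerical comparison makes this point concrete. The sketch below assumes the one-loop form $\alpha/(1 - \alpha l/3\pi)$ and a two-loop iterative form obtained from (47) by switching to the sign convention of this subsection, with $\beta_1 = 1/3\pi$ and $\beta_2/\beta_1 = 3/4\pi$; the function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.0
L_STAR = 3.0 * math.pi / ALPHA   # position of the one-loop ghost pole in l = ln x

def alpha_1(l):
    """One-loop RG coupling."""
    return ALPHA / (1.0 - ALPHA * l / (3.0 * math.pi))

def alpha_2(l):
    """Two-loop iterative RG coupling (assumed form, see the lead-in)."""
    d = 1.0 - ALPHA * l / (3.0 * math.pi)
    return ALPHA / (d + (3.0 * ALPHA / (4.0 * math.pi)) * math.log(d))

small_ratio = 3.0 * ALPHA / (4.0 * math.pi)   # alpha * beta_2 / beta_1
ratio_far = alpha_2(0.5 * L_STAR) / alpha_1(0.5 * L_STAR)
ratio_near = alpha_2(0.99 * L_STAR) / alpha_1(0.99 * L_STAR)

if __name__ == "__main__":
    print(small_ratio, ratio_far, ratio_near)
```

Although the expansion parameter $3\alpha/4\pi$ is about $1.7 \cdot 10^{-3}$, the two expressions agree to a fraction of a percent halfway to the pole and differ by a factor of several at $l = 0.99\, l_*$.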

This means that the problem of the existence of the ghost pole in QED cannot be solved by taking
into account next-to-leading and higher logs. Moreover, one can argue [22] on general RG
grounds that it is impossible to make any qualitative statement about the UV asymptote for the
$\beta(g) > 0$ case on the basis of RG-improved perturbation calculations. Our next example
illustrates this thesis.


3.2.3 Scalar quartic model 

For the nonlinear scalar field with a quartic (self)interaction Lagrangian, important progress was
achieved in the 80s in the calculation of higher perturbation orders. The $\beta$
function was calculated in the MS-scheme up to the 5-loop level [23]

$$\beta^{\rm MS}(g) = \frac{3}{2} g^2 - \frac{17}{6} g^3 + 16.27 g^4 - 135.8 g^5 + 1437 g^6 .$$

We see from this expression that, in contrast with the QED case, due to its alternate-sign
structure, there is no stability here even on a qualitative level. The odd-order approximations have a




ghost-pole type behavior, whereas the even ones yield the fixed point (finite charge renormalization)
case. Note also that, as can be shown [24], the upper boundary of the 10% confidence region
corresponds to $g$ values close to 0.1.

To comprehend the "loop dependence" of this boundary it is useful to represent the expression
in a slightly different form

$$\beta^{\rm MS}(g) = \frac{3}{2} g^2 \left[ 1 - \frac{g}{0.529} + \left(\frac{g}{0.303}\right)^2 - \left(\frac{g}{0.222}\right)^3 + \left(\frac{g}{0.180}\right)^4 \right] . \tag{61}$$

It is clear now that the boundary of the confidence region diminishes as the order of the loop
approximation rises. The expression (61) looks like the beginning of a power asymptotic series of
the Poincaré type. Indeed, if we represent the $\beta$-function in the series expansion form (54), then,
as can be shown, the coefficients $\beta_n$ at $n \gg 1$ behave like $\sim n!$.
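
The "loop dependence" described above is easy to reproduce. The sketch below takes the MS-scheme coefficients as they are read from the text (3/2, 17/6, 16.27, 135.8, 1437; this reading of the printed numbers is an assumption) and looks for a positive zero of each truncated series on the interval $0 < g \le 1$. The helper names are illustrative.

```python
COEFFS = [1.5, -17.0 / 6.0, 16.27, -135.8, 1437.0]  # g^2 ... g^6, as read from the text

def beta_truncated(g, loops):
    """Sum of the first `loops` terms of the MS-scheme series."""
    return sum(c * g ** (k + 2) for k, c in enumerate(COEFFS[:loops]))

def first_positive_zero(loops, g_max=1.0, n=20000):
    """Scan (0, g_max] for a sign change of the truncated beta; refine by bisection."""
    g_prev = g_max / n
    for i in range(2, n + 1):
        g_cur = g_max * i / n
        if beta_truncated(g_prev, loops) > 0.0 >= beta_truncated(g_cur, loops):
            low, high = g_prev, g_cur
            for _ in range(60):
                mid = 0.5 * (low + high)
                if beta_truncated(mid, loops) > 0.0:
                    low = mid
                else:
                    high = mid
            return 0.5 * (low + high)
        g_prev = g_cur
    return None  # no zero found: the truncated beta stays positive

zeros = {loops: first_positive_zero(loops) for loops in range(1, 6)}

if __name__ == "__main__":
    for loops, zero in zeros.items():
        print(loops, "loop(s):", "no zero" if zero is None else zero)
```

With these coefficients the 1-, 3- and 5-loop truncations stay positive on the scanned interval (ghost-pole type behaviour), while the 2- and 4-loop ones vanish at a finite $g$ (fixed-point type behaviour), the 2-loop zero being $g = 9/17 \approx 0.529$.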

The method of determining asymptotic estimates of the perturbation expansion coefficients of
the Green functions uses a representation in the form of a functional (i.e., path) integral. This
integral, written down for the mentioned expansion coefficient, can be calculated by the steepest
descent method in the function space. To the saddle point there corresponds an "instanton"-type
Euclidean classical solution with a finite action.

In this manner, an asymptotic expression was obtained [25] for the coefficients of the $\beta$ function
expansion. It has the form

/3n - ^^2 ^ Oin '^)) (n 4 m) . (ii2) 


The factorial growth of coefficients indicates that this is a power asymptotic series with zero radius
of convergence that cannot be summed in the usual manner. We can obtain information about
the singularity structure at the origin ($g = 0$) by using some special procedures. One of them is
the Borel summation method.
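
The idea of Borel summation can be illustrated on a toy series, which is not the $\beta$-function series itself. For $\sum_n (-1)^n n!\, g^n$ the Borel transform $\sum_n (-t)^n = 1/(1 + t)$ has a finite radius of convergence and continues analytically to the whole positive axis, so the sum can be defined as $\int_0^\infty e^{-t}/(1 + gt)\, dt$. The sketch below evaluates this integral by Simpson's rule (the cut-off `t_max` and the function names are illustrative choices) and compares it with partial sums.

```python
import math

def partial_sum(g, n_terms):
    """Sum of the first n_terms terms of the series (-1)^n n! g^n."""
    return sum((-1) ** n * math.factorial(n) * g ** n for n in range(n_terms))

def borel_sum(g, t_max=60.0, n=60000):
    """Simpson's rule for the integral of exp(-t) / (1 + g t) over [0, t_max].

    The neglected tail beyond t_max is smaller than exp(-t_max).
    """
    h = t_max / n
    f = lambda t: math.exp(-t) / (1.0 + g * t)
    total = f(0.0) + f(t_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return total * h / 3.0

if __name__ == "__main__":
    g = 0.1
    print("Borel sum:", borel_sum(g))
    for n_terms in (5, 10, 11, 20, 41):
        print(n_terms, partial_sum(g, n_terms))
```

The partial sums first approach the Borel value, bracket it near the smallest term (around $n \approx 1/g$), and then diverge factorially, which is the typical behaviour of an alternating asymptotic series.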

Here, we give a short exposition of the results of the attempt at summation of a series of the
(61) type made in [24] (see also the review [26]).

The authors of Ref. [24] used as an input the 4-loop expression for the $\beta$-function in the
symmetric MOM-scheme

$$\beta^{\rm MOM}(g) = \frac{3}{2} g^2 - \frac{17}{6} g^3 + 19.3 g^4 - 146 g^5 . \tag{63}$$

This alternate-sign asymptotic series can be summed by the Borel method. The idea is to represent
the sum in the form of a Laplace transform integral. It is not difficult to see that the transition
to the Laplace image just "kills" the factorial factor $n!$. For the modified Borel transformation


«S) = ^ ;((x) 

perturbation series can be written down as 


(64) 


Blx] = E 

n 


Pn , 

rdn® * 


It has a nonzero circle of convergence and can be summed within the circle. However, as the
integration domain in (64) goes outside the convergence region, we must make an analytic continuation
for the function $B(x)$. It can be done by a conformal transformation of the $x$-plane into
the $w$-plane that maps the domain of integration $[0, \infty]$ into the interior of the unit disk and
the cut $[-\infty, -1]$ onto the boundary of the disk. One can choose this transformation in such a way
that it correctly reproduces the singularity on the cut. The result of the conformal transformation
$x \to w(x) = (\sqrt{1 + x} - 1) \cdot (\sqrt{1 + x} + 1)^{-1}$ "looks quite well":


B{x) = 128 " 0.127m^ -(- 0.084iir’) 
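
The mapping properties of the conformal transformation $w(x) = (\sqrt{1 + x} - 1)/(\sqrt{1 + x} + 1)$ quoted above can be checked directly. The sketch below uses the principal branch of the complex square root; the function name `w` follows the text, everything else is illustrative.

```python
import cmath

def w(x):
    """Conformal map w(x) = (sqrt(1 + x) - 1) / (sqrt(1 + x) + 1), principal branch."""
    s = cmath.sqrt(1 + x)
    return (s - 1) / (s + 1)

if __name__ == "__main__":
    for x in (0.0, 3.0, 1e12):       # integration domain: 0 <= w < 1
        print(x, w(x))
    for x in (-1.5, -5.0, -100.0):   # points on the cut: |w| = 1
        print(x, abs(w(x)))
```

Points of the integration domain $x \ge 0$ land on the real segment $0 \le w < 1$ inside the unit disk, while points on the cut $x \le -1$ are mapped onto the unit circle.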





Then, by the transformation reverse to (64) one can reconstruct the beta function $\beta(g)$, which is
nonanalytic in the $g$ variable with an essential singularity at $g = 0$. Graphs of the function
$\beta(g)$ obtained by the Borel summation with allowance for the 1-, 2-, 3- and 4-loop approximations
look very similar to one another.

They all lie now in a narrow parabolic ray slightly below the original one-loop parabola and,
within the limit of 10% accuracy, make it possible to advance into the region $g \sim 50$. This means
that the summation procedure adopted enlarges the confidence interval several hundred times! Besides
this, it gives the qualitative stability of results. They are now all in favour of a ghost-type UV
asymptote.

Nevertheless, these results can be considered only as a support but not the proof of the
$\varphi^4$-model inconsistency. The weak point here is that starting with (64) we have assumed
definite analyticity properties of $\beta(g)$ in the whole complex $g$-plane.

3.3 Mass-dependent analytic solution 

A general method of an approximate solution to the massive (i.e., mass-dependent) RG equations 
was developed in Ref. [27]. Analytic expressions of a high level of accuracy for an effective coupling 
and a one-argument function were obtained up to 3- and 4-loop order [28]. 

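For example, the two-loop massive RG-solution for the invariant coupling

$$\bar\alpha_s(Q^2, m^2)^{(2)}_{\rm rg} = \frac{\alpha_s}{1 + \alpha_s A_1(Q^2, m^2) + \alpha_s (A_2/A_1) \ln(1 + \alpha_s A_1(\ldots))} \tag{65}$$

at small $\alpha_s$ values corresponds to the perturbation expansion

$$\bar\alpha_s(Q^2, m^2)_{\rm pt} = \alpha_s - \alpha_s^2 A_1(Q^2, m^2) + \alpha_s^3 \left[A_1^2 - A_2(Q^2, m^2)\right] + \ldots .$$

At the same time, it smoothly interpolates between two massless limits (with
$A_i \simeq \beta_i \ln Q^2 + c_i$) at $Q^2 \ll m^2$ and $Q^2 \gg m^2$ described by an equation analogous
to (47). In the latter case it can be represented in the form usual for the QCD practice: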

ri; ‘ ^ l^i i 

Solution (65) demonstrates, in particular, that the threshold crossing generally changes the
subtraction scheme [29].

The investigation [27, 28] was prompted by the problem of explicitly taking into account
heavy quark masses in QCD. However, the results obtained are important from a more general
point of view for a discussion of the scheme dependence problem in QFT. The method used
could also be of interest for RG applications in other fields in situations with disturbed
homogeneity, such as, e.g., intermediate asymptotics in hydrodynamics, finite-size scaling in critical
phenomena and the excluded volume problem in polymer theory.

In paper [30], this method was applied to the evolution of effective gauge couplings in the Standard
Model (SM). Here, a new analytic solution of a coupled system of three mass-dependent two-loop
RG equations for three SM gauge couplings was obtained.

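For this goal, one has to start with a perturbative input for the SM couplings

$$\bar\alpha_i(Q^2, m^2) = \alpha_i - \alpha_i^2 A_i(Q, m, \mu) + \alpha_i^3 A_i^2(Q) - \alpha_i^2 \sum_j \alpha_j A_{ij}(Q) + \ldots , \quad i = 1, 2, 3 , \tag{66}$$

where $A_i$ and $A_{ij}$ are one- and two-loop mass- and $\mu$-dependent contributions of appropriate
Feynman diagrams. In the framework of a massive renorm-group formalism [7, 8, 10] the corresponding
Lie equations look like

$$\dot{\bar\alpha}_i(Q) = -\bar\alpha_i^2(Q) \left[ \dot A_i(Q) + \sum_j \bar\alpha_j(Q) \dot A_{ij}(Q) \right] \tag{67}$$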





with $\dot A \equiv \partial A/\partial l$, $l = \ln(Q^2/\Lambda^2)$. Note that in the UV limit,
$A_i(Q) \to \beta_i l$, $A_{ij}(Q) \to \beta_{ij} l$, we arrive at the system

$$\dot{\bar\alpha}_i(l) = -\beta_i \bar\alpha_i^2(l) - \sum_j \beta_{ij}\, \bar\alpha_j(l)\, \bar\alpha_i^2(l)$$

that is commonly used (see, e.g., Refs. [31]) for the discussion of data extrapolation across the
gauge desert and the possibility of Grand Unification.

The latter system can be solved iteratively in the form

$$\frac{1}{\bar\alpha_i(l)} = \frac{1}{\alpha_i} + \beta_i l + \sum_j \frac{\beta_{ij}}{\beta_j} \ln[1 + \alpha_j \beta_j l] ; \qquad \alpha_i \equiv \bar\alpha_i(\mu) . \tag{68}$$

Here, we present a generalization of this solution for the massive case that is convenient
for taking threshold effects into account and for discussing, in particular, the issue of the Grand
Unification consistency check.

Using the method of the paper [32], one can obtain an explicit iterative solution to the system
(67). Here, as in the massless case, one first solves the one-loop approximation to (67) to get⁷

$$\bar\alpha_i^{(1)}(Q) = \left[1/\alpha_i + A_i(Q)\right]^{-1} .$$

Inserting then this explicit expression into the second factor in the r.h.s. of (67) and performing an
approximate integration (for detail see paper [32]), we arrive at the expression

$$\frac{1}{\bar\alpha_i(Q^2, m^2)} = \frac{1}{\alpha_i} + A_i(Q) + \sum_j \frac{A_{ij}(Q)}{A_j(Q)} \ln[1 + \alpha_j A_j(Q)] , \tag{69}$$

quite analogous to (65).

The remarkable feature of this solution is that it depends explicitly only on the mass-dependent
perturbation coefficients $A_i(Q)$, $A_{ij}(Q)$ and, being expanded in powers of the coupling constants,
exactly corresponds to the perturbative input (66). On the other hand, in the massless limit it
goes to solution (68).

The accuracy of the last approximate expression can be estimated by the method used in paper
[34]. Generally, it corresponds to the accuracy of the three-loop expression ($\sim \alpha^4 \ln$) for the
effective coupling in the one-coupling case, which is quite sufficient for current practice.


3.4 Some important results 

In the early 70s, S. Weinberg [35] proposed the notion of a running mass of a fermion. If considered
from the viewpoint of paper [13], this idea can be formulated as follows:

any parameter of the Lagrangian can be treated as a (generalized) coupling constant, and its
effective counterpart should be included into the renorm-group formalism.

New possibilities for applying the RG method were discovered when the technique of operator
expansion at short distances (on the light cone) appeared [36]. The plausibility of this approach
stems from the fact that the RG transformation, regarded as a Dyson transformation of the
renormalized vertex function, involves the simultaneous scaling of all its invariant arguments, normally
the squares of the momenta. Meanwhile, for the physical amplitude, some of them are fixed on
a mass shell. The expansion on the light cone, so to say, "separates the arguments", as a result
of which it becomes possible to study the physical UV asymptotic behaviour by means of the
expansion coefficients (with some momenta being fixed on the mass shell). As an important example,
we can mention the evolution equations for moments of QCD structure functions [37].

The revealing of the asymptotic freedom phenomenon can be considered as the most important
result obtained in particle physics by the RG technique.

⁷ This exact solution of a one-loop massive RG equation was first obtained in Ref. [33].

⁸ See eqs. (13) and (16) in Ref. [34].





Historically, this discovery was made [38] in the framework of the SU(3) non-Abelian Yang-Mills
model in the early 70s. Since that time this model for the eight-component 4-vector field
$B^a_\mu(x)$ has been adopted as a basic ingredient for the QFT description of matter on the parton level.

The key point is that the self-interaction of this non-Abelian gluonic quantum field, due to the
dominance of its unphysical components, gives a negative contribution to the beta function perturbation
expansion. For the two-loop (scheme-independent) case

$$\beta_{QCD}(\alpha_s) = -\beta_1 \alpha_s^2 - \beta_2 \alpha_s^3$$

with positive $\beta_{1,2}$ for a number $n_f$ of quark flavours small enough.
Correspondingly, the one-loop renorm-group expression

$$\bar\alpha_s^{(1)}(x; \alpha_s) = \frac{\alpha_s}{1 + \alpha_s \beta_1 \ln x}$$

for the QCD effective coupling exhibits a remarkable UV asymptotic behaviour thanks to $\beta_1$ being
positive. This expression implies, in contrast to eq.(23), that the effective QCD coupling decreases
as $x \sim Q^2$ increases and tends to zero in the UV limit. This feature, discovered in the early 70s,
precisely corresponded to the parton physical picture of the hadronic structure.
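
As a rough numerical illustration of the one-loop expression for the effective coupling, the following Python sketch evaluates it for growing x. The normalization of the one-loop coefficient, the number of flavours and the reference value of the coupling are assumptions made for this example only and are not taken from the text; the function name is hypothetical.

```python
import math

def one_loop_coupling(x, alpha_s, beta1):
    """One-loop effective coupling alpha_s / (1 + alpha_s * beta1 * ln x)."""
    return alpha_s / (1.0 + alpha_s * beta1 * math.log(x))

# Assumed inputs (not from the text): a commonly used one-loop normalization
# with n_f = 5 flavours and a reference coupling of 0.2 at x = 1.
N_F = 5
BETA1 = (11.0 - 2.0 * N_F / 3.0) / (4.0 * math.pi)
ALPHA_REF = 0.2

scales = [1.0, 1e1, 1e2, 1e4, 1e8]
values = [one_loop_coupling(x, ALPHA_REF, BETA1) for x in scales]
for x, a in zip(scales, values):
    print(f"x = {x:8.0e}   effective coupling = {a:.4f}")
```

With a positive one-loop coefficient the printed values fall monotonically as x grows, which is the behaviour described above.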

One more interesting application of the RG method in the multicoupling case, dating back to the
50s [12], refers to special solutions, the so-called separatrices in a phase space of several invariant
couplings. These solutions relate effective couplings and represent scale-invariant trajectories, like,
e.g., $g_i = g_i(g_1)$ in the phase space, which are straight lines in the one-loop case.

Some of them that are “attractive” (or stable) in the UV limit are related to symmetries that
reveal themselves in the high-energy domain. It was conjectured that these trajectories may be
related to hidden symmetries of a Lagrangian and could even serve as a tool to find them. On
this basis a method was developed [39] for finding these symmetries. It was shown that in
the phase space of invariant couplings the internal symmetry corresponds to a singular solution
that remains a straight line when higher order corrections are taken into account. Such solutions
corresponding to supersymmetry were derived for some combinations of gauge, Yukawa and quartic
interactions.

Generally, these singular solutions obey the relations

$$\frac{dg_i}{dg_1} = \frac{\beta_i(g)}{\beta_1(g)}$$

which have been known since Zimmermann’s paper [40] as the reduction equations. In the 80s they were
used [41] (see also review paper [42] and references therein) in the UV analysis of asymptotically
free models. Just for these cases the one-loop reduction relations are adequate to physics.
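
To make the idea of a straight-line solution of the reduction equations concrete, here is a minimal Python sketch for a toy system of two couplings with one-loop-like beta functions. The beta functions, their coefficients `B`, `C`, `D` and all function names are invented for illustration and do not correspond to any model discussed in the text. For this toy system a line y = k g is an invariant trajectory when k = (D - B)/C; the sketch checks this numerically and shows that a trajectory started off the line does not keep a constant ratio.

```python
# Toy beta functions (assumed, illustrative only):
#   dg/dt = -B g^2,   dy/dt = y (C y - D g)
B, C, D = 1.0, 2.0, 3.0

def rhs(g, y):
    return -B * g * g, y * (C * y - D * g)

# Inserting y = k*g into dy/dg = beta_y / beta_g gives k = (D - B) / C.
K_LINE = (D - B) / C

def flow(g, y, t_end=5.0, h=0.01):
    """Integrate the toy RG flow with the classical fourth-order Runge-Kutta method."""
    for _ in range(int(round(t_end / h))):
        k1 = rhs(g, y)
        k2 = rhs(g + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(g + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(g + h * k3[0], y + h * k3[1])
        g += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return g, y

g_on, y_on = flow(0.3, K_LINE * 0.3)           # start on the straight line
g_off, y_off = flow(0.3, 1.1 * K_LINE * 0.3)   # start slightly off the line
print("ratio on the line :", y_on / g_on)
print("ratio off the line:", y_off / g_off)
```

Both couplings run along the first trajectory, but their ratio stays fixed, which is the sense in which such a solution relates the effective couplings.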

Quite recently another application of this technique was obtained in supersymmetric
generalizations of the Grand Unification scenario in the Standard Model. It was shown [43, 44] that it is
possible to achieve complete UV finiteness of a theory if Yukawa couplings are related to the gauge
ones in a way corresponding to these special solutions, that is, to reduction relations.

The mass-dependent technique described in Section 3.3 was successfully used for the
development of the Dhar-Gupta approach [45, 46] that led to finite perturbative predictions for a physical
quantity which is free of renormalization scheme ambiguities. In paper [47], this approach was
reformulated for the mass-dependent case with several coupling constants.

One more recent QFT development relevant to the renorm-group is the “Analytic approach”
to perturbative QCD (pQCD). It is based upon the procedure of Invariant Analyticization [48, 49]
dating back to the end of the 50s.

The approach consists in combining two ideas: the RG summation of UV logs and
analyticity in the $Q^2$ variable, imposed by a spectral representation of the Källén-Lehmann type which
implements general properties of the local QFT including the Bogoliubov condition of microscopic
causality. This combination was first devised [50] to get rid of the ghost pole in QED about forty
years ago.





Here, the pQCD invariant coupling $\bar\alpha_s(Q^2)$ is transformed into an “analytic coupling”
$A(x)$ which, by construction, is free of ghost singularities due to incorporating some nonperturbative
structures.

This analytic coupling $A(x)$ has no unphysical singularities in the complex $Q^2$-plane; its
conventional perturbative expansion precisely coincides with the usual perturbation one for $\bar\alpha_s(Q^2)$;
it has no extra parameters; it obeys a universal IR limiting value $A(0)$ independent of
the scale parameter $\Lambda$; it turns out to be remarkably stable [49] in the IR domain with respect to
higher-loop corrections and, in turn, to the scheme dependence.

Meanwhile, the “analyticized” perturbation expansion [51] for an observable $F$, in contrast to
the usual case, may contain specific functions instead of powers $(A(x))^n$. In other words,
the perturbation series for $F(x)$, due to the analyticity imperative, can change its form in the IR region
[19], turning into an asymptotic expansion à la Erdélyi over a nonpower set.

4 RG expansion 

In the 70s and 80s the RGM was applied (besides QFT) to critical phenomena: polymers, turbulence,
non-coherent radiation transfer, dynamical chaos, and so on. A simpler and less sophisticated motivation
in critical phenomena (than in QFT) made this “explosion” of RG applications possible.

4.1 Critical phenomena 

4.1.1 Spin lattices

The so-called renormalization group in critical phenomena is based on the Kadanoff-Wilson
procedure [52, 53] referred to as “decimation” or “blocking”. Initially, it emerged from the problem of
a spin lattice. Imagine a regular (two- or three-dimensional) lattice consisting of $N^d$, $d = 2, 3$, sites
with an ‘elementary step’ $a$ between them. Suppose that at every site a spin vector $\sigma$ is sitting.
The Hamiltonian describing the spin interaction of nearest neighbours

$$H = k \sum_i \sigma_i \cdot \sigma_{i \pm 1} \qquad (70)$$

contains $k$, the coupling constant. The statistical sum is obtained from the partition function

$$S = \langle \exp(-H/\theta) \rangle_{\rm aver}.$$

To realize the blocking or decimation, one has to perform the “spin averaging” over a block
consisting of $n^d$ elementary sites. This is a very essential step as it diminishes the number of
degrees of freedom (from $N^d$ to $(N/n)^d$). It destroys the small-range properties of the system
under consideration, some information being lost in the course of averaging. However, the long-range
physics (like the correlation length essential for phase transition) is not affected by it, and we gain
a simplification of our problem.

After this procedure, new effective spins $\Sigma$ arise in sites of a new effective lattice with a step
$na$. We obtain also a new effective Hamiltonian, with a new effective coupling $K_n$ that has to be
defined in the averaging process as a function of $k$ and $n$,

$$H_{\rm eff} = K_n \sum_I \Sigma_I \cdot \Sigma_{I \pm 1} + \Delta H,$$

where $\Delta H$ contains quartic and higher terms; $\Delta H = \Sigma \cdot \Sigma \, \Sigma \cdot \Sigma + \ldots$.

For the IR (long-distance) properties, $\Delta H$ is unessential. Hence, we can conclude that the spin
averaging leads to an approximate transformation,

$$k \to K_n,$$

or, taking into account the “elementary step” change, to

$$KW_n : \{a \to na,\ k \to K_n\}. \qquad (71)$$





The latter is the Kadanoff-Wilson transformation.

In general, the “new” coupling constant $K_n$ is a function of the “old” one $k$ and of the
decimation index $n$. It is convenient to write it down in the form $K_n = K(1/n, k)$. Then, the KW
transformation can be formulated as follows:

$$KW(n) : \{a \to na,\ k \to K_n = K(1/n, k)\}. \qquad (72)$$

These transformations obey the group composition law

$$KW(n) \cdot KW(m) = KW(nm)$$

if

$$K(x, k) = K(x/t, K(t, k)). \qquad (73)$$

This is just the RG symmetry.

We observe the following points:

• The RG symmetry is approximate (due to the neglect of $\Delta H$).

• The transformations $KW(n)$ are discrete.

• There exists no inverse transformation to $KW(n)$.

Hence, the ‘Kadanoff-Wilson renormalization group’ is an approximate and discrete semi-group.
For long-distance (IR limit) physics, however, $\Delta H$ is irrelevant, $1/n$ is close to a continuum
and it is possible to use differential Lie equations.
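
The composition law for the decimation transformations can be checked explicitly in a case that is simpler than the lattices considered above. The sketch below assumes a one-dimensional Ising chain with Boltzmann weight exp(k s s') per bond, which is not the model of the text; for that chain, keeping every n-th spin gives tanh K_n = (tanh k)^n with no correction term. The function names are hypothetical.

```python
import math

def decimate(n, k):
    """Effective bond coupling of a 1D Ising chain after keeping every n-th spin.

    Plays the role of K(1/n, k) in the text (assumed 1D model, where
    tanh K_n = tanh(k)**n holds without a correction term).
    """
    return math.atanh(math.tanh(k) ** n)

def decimate_pair_bruteforce(k):
    """n = 2 case obtained by summing explicitly over the intermediate spin."""
    same = sum(math.exp(k * s + k * s) for s in (-1, 1))       # outer spins parallel
    opposite = sum(math.exp(k * s - k * s) for s in (-1, 1))   # outer spins antiparallel
    return 0.5 * math.log(same / opposite)

k0 = 0.7
k2 = decimate(2, k0)
k2_brute = decimate_pair_bruteforce(k0)
composed = decimate(2, decimate(3, k0))   # KW(2) applied after KW(3)
direct = decimate(6, k0)                  # KW(6)
print(k2, k2_brute, composed, direct)
```

Only integer n makes sense as a blocking step in this sketch, so the transformations compose but cannot be undone by another blocking step, in line with the semi-group remark above.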

In the application of these transformations to critical phenomena the notion of a fixed point is
important. As was explained in Section 3.1, it is usually associated with power-type asymptotic
behavior. Note here that, contrary to the QFT case considered in Section 3.1, in phase transition
physics we deal with the IR stable point.


4.1.2 Polymer theory

In the polymer physics one considers statistical properties of polymer macromolecules which can 
be imagined as a very long chain of identical elements. The number of elements N could be as big 
as 10^, the macromolecular size reaching several hundred Angstroms. 

Such a big molecular chain forms a specific pattern resembling the pattern of a random walk. 
The central problem of the polymer theory is very close to that of a random walk and can be 
formulated as follows. 

For a very long chain of $N$ “steps” (the size of each step = $a$) one has to find the “chain size”
$R_N$ as the distance between the “start” and the “finish” points, the distribution function $f(\varphi)$ of angles
$\varphi_i$ between neighboring elements being given.

The function $f(\varphi)$ is defined by the forces between adjacent elements depending on some
external factors like temperature $T$. The essential feature of a polymer chain is the impossibility of
a self-intersection. This is known as an excluded volume effect in the random walk problem. In
reality, polymer molecules are swimming in a solvent and form globules.

For large $N$ values the molecular size follows the power Flory law $R_N \sim N^{\nu}$ with $\nu$, the
Flory index. When $N$ is given, $R_N$ is a functional of $f(\varphi)$ which depends on external conditions
(e.g., temperature $T$, properties of solvent, etc.). If $T$ increases, $R_N$ increases and at some moment
globules touch one another. This is the polymerization process, very similar to a phase transition
phenomenon.
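
The effect of the excluded volume on the power law for the chain size can be seen in a small exact enumeration. The sketch below counts all self-avoiding walks on a square lattice up to N = 10 steps and estimates an effective index from the mean squared end-to-end distance. The square lattice, the short chain lengths and the function names are assumptions of this example; an unrestricted random walk would give the index 1/2.

```python
import math

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def enumerate_saws(n_max):
    """Count self-avoiding walks on the square lattice and sum R^2 for N = 1..n_max."""
    counts = [0] * (n_max + 1)
    sum_r2 = [0] * (n_max + 1)

    def extend(x, y, visited, n):
        if n > 0:
            counts[n] += 1
            sum_r2[n] += x * x + y * y
        if n == n_max:
            return
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in visited:          # excluded volume: no site is visited twice
                visited.add(nxt)
                extend(nxt[0], nxt[1], visited, n + 1)
                visited.remove(nxt)

    extend(0, 0, {(0, 0)}, 0)
    return counts, sum_r2

counts, sum_r2 = enumerate_saws(10)
mean_r2 = [s / c if c else 0.0 for s, c in zip(sum_r2, counts)]

# Effective index from <R_N^2> ~ N^(2 nu), comparing N = 8 and N = 10.
nu_eff = 0.5 * math.log(mean_r2[10] / mean_r2[8]) / math.log(10.0 / 8.0)
print("walk counts for N = 1..4:", counts[1:5])
print("effective index:", nu_eff)
```

Such short chains give only a crude estimate, but the effective index already lies above the random-walk value, which is the qualitative content of the excluded volume effect.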

The Kadanoff-Wilson RG (KWRG) blocking ideology has been used in polymer physics by De
Gennes [54]. The key idea is a grouping of $n$ subsequent chain elements into a new “elementary
block”. This grouping operation is very close to Kadanoff’s blocking. It leads to the transformation

$$\{1 \to n,\ a \to A_n\}$$

which is analogous to the one for spin lattice decimation. This transformation must be specified by
a direct calculation which gives the explicit form of $A_n$. Here we have a discrete semi-group.
Then, by using the KWRG technique, one finds the fixed point, obtains the Flory power
law and can calculate its index $\nu$.

Generally, the excluded volume effect yields some complications. However, inside the QFT RG
framework it can be treated rather simply [55] by introducing an additional argument similar to
the finite length $L$ in the transfer problem and the particle mass $m$ in QFT.

Besides polymers, the KWRG approach has been used in some other fields of physics, like percolation,
noncoherent radiation transfer [56], dynamical chaos [57] and some others.

Meanwhile, the original QFT RG approach proliferated into the theory of turbulence.

4.1.3 Turbulence

To formulate the turbulence problem in the “RG language” one has to perform the following steps
[59, 60, 61]:

1. Introduce the generating functional for correlation functions.

2. Write the path integral representation for this functional.

3. By changing the functional integration variable, find the equivalence of the classical statistical
system to some quantum field theory model.

4. Construct the system of Schwinger-Dyson equations for this equivalent QFT.

5. Perform the finite renormalization procedure.

6. Derive the RG equations.


4.2 Paths of RG expansion 

RG has expanded into diverse fields of physics in two different ways:

• by direct analogy with the Kadanoff-Wilson construction (averaging over some set of degrees
of freedom) in polymers, non-coherent transfer and percolation, i.e., constructing a set of
models for a given physical problem;

• by a search for an exact RG symmetry via proof of the equivalence with a QFT model, e.g., in
turbulence (Refs. [59, 61]), plasma turbulence [62] and some others.

To the question “Are there different renormalization groups?” the answer is positive:

1. In QFT and some simple macroscopic examples (like the one-dimensional transfer problem),
RG symmetry is an exact symmetry of the solution formulated in its natural variables.

2. In turbulence, continuous spin-field models and some others, it is a symmetry of an equivalent
QFT model.

3. In polymers, percolation, etc. (with KW blocking), the RG transformation is a
transformation between different auxiliary models (specially constructed for this purpose) of a given
system.

As we have shown, there is no essential difference in the mathematical formalism. There exists,
however, a profound difference in physics:

- In cases 1 and 2 (as well as in some macroscopic examples), the RG is an exact symmetry
of a solution.

- In the Kadanoff-Wilson-type problem (spin lattice, polymers, etc.), one has to construct a
set $\mathcal{M}$ of models $M_i$. The KWRG transformation

$$R(n) M_i = M_{ni}, \quad \text{with integer } n, \qquad (74)$$

is acting inside a set of models.





4.3 Two faces of RG in QFT 

As was explained in Section 1.4, the vacuum, i.e., the interparticle space, contains vacuum
fluctuations. Due to them, the charge of a particle is screened. In accordance with Dirac eq.(26),
in momentum space the $Q^2$ dependence of an electron charge can be presented as

$$e(Q^2) = e \left\{ 1 + \frac{\alpha}{6\pi} \ln(Q^2 r_e^2) + \ldots \right\}; \quad \alpha \simeq 1/137 \qquad (75)$$

in terms of the classical electron charge and of the electron Compton length.

The first idea of an additional symmetry in this problem was put forward by Stueckelberg and
Petermann [5]. In their pioneering investigation the very existence of a group transformation was
discovered within the renormalization procedure, the result of which contains finite arbitrariness. Just
this degree of freedom in finite renormalized expressions was used by Bogoliubov and Shirkov in
Refs. [7]-[10]. Roughly speaking, this corresponds to the change $1/r_e \to \mu$.

The basic idea was that, instead of $1/r_e$, one can use some other reference point $\mu$. This is
equivalent to introducing a new degree of freedom associated with the reference point scale.
Instead of (75) we have

$$e(Q^2/\mu^2) = e_\mu \left\{ 1 + \frac{\alpha_\mu}{6\pi} \ln\frac{Q^2}{\mu^2} + \ldots \right\}. \qquad (76)$$

Here, the effective charge is considered after the subtraction of infinities and is given by a “finite
representation” (76). The RG symmetry is formulated in terms of the scale, and $\mu$ represents
the reference point.

Another approach was used by Gell-Mann and Low [6]. Their paper was devoted to the short
distance behaviour in a nonlocal QED with a cutoff $\Lambda$, and the “$\Lambda$ degree of freedom” was used to
analyze the UV behaviour. Instead of renormalization, there is a regularization and the charge is
given by the “singular representation”

$$e(\Lambda) = e \left\{ 1 + \frac{\alpha}{6\pi} \ln(\Lambda^2 r_e^2) + \ldots \right\} \qquad (77)$$

which is singular in the limit $\Lambda \to \infty$.

We can draw a transparent picture (as was commented later by Wilson in his Nobel lecture)
of the last approach. Imagine an electron of a finite size, smeared over a small volume with
the radius $R_i = \hbar/c\Lambda_i$, $\ln(\Lambda_i^2/m_e^2) \gg 1$. The electric charge $e_i$ of such a non-local electron
is considered as depending on the cut-off momentum $\Lambda_i$ so that this dependence accumulates the
vacuum polarization effects which, in reality, take place at distances from the point electron smaller
than $R_i$. We deal with a set of models of the non-local electron corresponding to different values
of the cut-off $\Lambda_i$. Here, $e_i$ depends on $R_i$ and the vacuum polarization effects in the excluded
volume $R_i^3$ should be subtracted. In this language, the RG transformation is the transition from
one value of the smearing radius to another, $R_i \to R_j$, simultaneously with a corresponding change
of the effective electron charge, $e_i \to e_j$. In other words, the RG symmetry here is that related
to operations in the space of models of non-local QED constructed in such a way that at large
distances every model is equivalent to the real local one.


5 RG symmetry in mathematical physics 

5.1 Functional self-similarity 

The RG transformations discussed above have a close connection with the concept of self-similarity (SS).
The SS transformations for problems formulated by nonlinear partial DEs have been well known since the
last century, mainly in the dynamics of liquids and gases. They are one-parameter ($\lambda$) transformations
defined as simultaneous power scaling of independent variables $z = \{x, t, \ldots\}$, solutions $f_k(z)$
and other functions $V_i(z)$

$$S_\lambda : \{\, f_k(z) \to f'_k(z') = \lambda^{\alpha_k} f_k(z'), \quad V_i(z) \to V'_i(z') = \lambda^{\nu_i} V_i(z') \,\}$$





entering into the equations.

To emphasize their power structure, we use the term power self-similarity (PSS). According to
Zel’dovich and Barenblatt [63, 64], the PSS can be classified as:

a/ PSS of the 1st kind
with all indices $a, \ldots$ being (half)integers (Integer PSS) that are usually found from the
theory of dimensions;

b/ PSS of the 2nd kind
with irrational indices (Fractal PSS) which should be defined from dynamics.
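
A familiar instance of power self-similarity with integer indices can be checked numerically. The sketch below uses the fundamental solution of the linear heat equation u_t = u_xx, which is an example chosen here for illustration and is not discussed in the text. Under the scaling x -> λx, t -> λ²t, u -> λu the solution is mapped onto itself, and the exponents follow from dimensional counting. The function names are hypothetical.

```python
import math

def heat_kernel(x, t):
    """Fundamental solution of u_t = u_xx on the line (t > 0)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def rescaled(x, t, lam):
    """Apply x -> lam*x, t -> lam**2 * t, u -> lam*u to the kernel."""
    return lam * heat_kernel(lam * x, lam * lam * t)

points = [(0.0, 1.0), (0.7, 0.3), (-2.0, 5.0)]
max_rel_err = max(
    abs(rescaled(x, t, lam) - heat_kernel(x, t)) / heat_kernel(x, t)
    for x, t in points
    for lam in (0.5, 2.0, 3.0)
)
print("largest relative deviation:", max_rel_err)
```

The deviation is at the level of rounding error, so the scaled function coincides with the original one for every value of the scaling parameter tried.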

To relate RG with PSS, let us turn to the solution of the renorm-group FE:

$$\bar g(x, g) = \bar g(x/t, \bar g(t, g)).$$

Its general solution is known; it depends on an arbitrary function of one argument.
However, at the moment we are interested in a special solution linear in the second argument:
$\bar g(x, g) = g f(x)$. The function $f(x)$ should satisfy the equation $f(xt) = f(x) f(t)$ with the solution
$f(x) = x^{\nu}$. Hence, $\bar g(x, g) = g x^{\nu}$.

This means that in our special case, linear in $g$, the RG transformation (6) is reduced to the PSS
transformation,

$$R_t \to \{x' = x t^{-1},\ g' = g t^{\nu}\} \equiv S_t. \qquad (78)$$

Generally, in RG, instead of a power law, we have arbitrary functional dependence. Thus, one
can consider transformations (6), (15) and (17) as functional generalizations of usual (i.e., power)
self-similarity transformations. Hence, it is natural to refer to them as the transformations of
functional scaling or functional (self)similarity (FS) rather than as RG-transformations. In short,

$$RG \equiv FS,$$

with FS standing for Functional Similarity.
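
The statement that the power law is only a special case of a more general functional dependence can be illustrated by testing candidate functions against the functional equation for the invariant coupling, written here as ḡ(x, g) = ḡ(x/t, ḡ(t, g)). In the sketch below the numerical values of the index and of the one-loop coefficient are arbitrary assumptions, and the third function is a deliberately chosen counter-example; all names are hypothetical.

```python
import math

def power_solution(x, g, nu=0.5):
    """Special solution linear in g: g * x**nu (power self-similarity)."""
    return g * x ** nu

def log_solution(x, g, beta1=0.6):
    """One-loop type solution g / (1 + beta1 * g * ln x), not a power law."""
    return g / (1.0 + beta1 * g * math.log(x))

def not_a_solution(x, g):
    """A function of (x, g) that does not obey the functional equation."""
    return g * (1.0 + math.log(x))

def fe_residual(gbar, x, t, g):
    """gbar(x, g) - gbar(x/t, gbar(t, g)); zero when the functional equation holds."""
    return gbar(x, g) - gbar(x / t, gbar(t, g))

x, t, g = 7.0, 3.0, 0.2
res_power = fe_residual(power_solution, x, t, g)
res_log = fe_residual(log_solution, x, t, g)
res_bad = fe_residual(not_a_solution, x, t, g)
print(res_power, res_log, res_bad)
```

The first two residuals vanish to rounding accuracy while the third does not, so both the power solution and a non-power solution respect the same composition law.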

We can now answer the question concerning the physical meaning of the symmetry underlying
FS and Bogoliubov’s renorm-group. As we have mentioned, it is not a symmetry of the
physical system or the equations of the problem at hand, but a symmetry of a solution considered
as a function of the relevant physical variables and suitable boundary conditions. A symmetry
like that can be related, in particular, to the invariance of a physical quantity described by this
solution with respect to the way in which the boundary conditions are imposed. The changing of
this way constitutes a group operation in the sense that the group composition law is related to
the transitivity property of such changes.

Homogeneity is an important feature of the physical systems under consideration. However,
homogeneity can be violated in a discrete manner. Imagine that such a discrete inhomogeneity
is connected with a certain value of $x$, say, $x = y$. In this case the RG transformation with the
canonical parameter $t$ will have the form (14) with the group composition law (15).

The symmetry connected with FS is a very simple and frequently encountered property of
physical phenomena. It can easily be “discovered” in numerous problems of theoretical physics
like classical mechanics, transfer theory, classical hydrodynamics, and so on [65, 17, 16].


5.2 Recent application to boundary value problem 

Recently, some interesting attempts have been made to use the RG concept in classical
mathematical physics, in particular, to study strong nonlinear regimes and to investigate the asymptotic behavior
of physical systems described by nonlinear PDEs.

About a decade ago, the RG ideas were applied by the late Veniamin Pustovalov with co-authors
[67] to analyze a problem of generating higher harmonics in plasma. This problem, after some
simplification, was reduced to a couple of partial DEs with the boundary parameter “solution
characteristic” explicitly included. It was proved that corresponding solutions admitted an exact
symmetry group that takes into account transformations of this boundary parameter, which is





related to the amplitude of the magnetic field at a critical density point. The solution symmetry 
obtained was then used to evaluate the efficiency of harmonics generation in cold and hot plasma. 
The advantageous use of the RG-approach in solving the above particular problem gave promise 
that it may work in other cases and this was illustrated in [68] by a series of examples for various 
boundary value problems. 

Moreover, in Refs. [65, 68] the possibility of devising a regular method for finding a special class 
of symmetries of solution to the boundary value problem (BVP) in mathematical physics, namely, 
RG-type symmetries, was discussed. The latter are defined as solution symmetries with respect 
to transformations involving parameters that enter through the equations as well as through the 
boundary conditions in addition to (or even rather than) the natural variables of the equations. 

As is well known, the aim of modern group analysis [69, 70], which goes back to works
by S. Lie [71], is to find symmetries of DEs. This approach does not include a similar problem of
studying the symmetries of solutions of these equations. Outside the main direction of both the
classical and modern analysis, there also remains the study of solution symmetries with respect
to transformations involving not only the variables present in the equations, but also parameters
entering into the solutions from boundary conditions.

From the aforesaid it is clear that the symmetries which attracted attention in the 50s in
connection with the discovery of the RG in QFT were those involving the parameters of the
system in the group transformations. It is natural to refer to these symmetries as FS (or
RG-type) symmetries.

It should be noted that the procedure of revealing the FS symmetry (FSS), or some group
feature similar to the FS regularity, in any particular case (QFT, spin lattice, polymers, turbulence
and so on) is up to now not a regular one. In practice, it needs some imagination and atypical
manipulation “invented” for every particular case (see the discussion in [72]). For this reason, the
possibility of finding a regular approach to constructing FSS is of principal interest.

Recently, a possible scheme of this kind was presented as applied to a mathematical model that 
is described by a BVP. The leading idea [65, 68, 73] in this case is based on the fact that solution 
symmetry for this system can be found in a regular manner by using the well-developed methods 
of modern group analysis. 

The scheme that describes devising of FSS and its application is then formulated [74, 76] as 
follows. Firstly, a specific RG-manifold should be constructed. Secondly, some auxiliary symmetry, 
i.e., the most general symmetry group admitted by this manifold is to be found. Thirdly, this 
symmetry should be restricted on a particular solution to get the FSS. Fourthly, the FSS allows 
one to improve an approximate solution or, in some cases, to get an exact solution. 

Depending on both the mathematical model and the boundary conditions, the first step of this
procedure can be realized in different ways. In some cases, the desired FS-manifold is obtained
by including parameters, entering into a solution via an equation(s) and a boundary condition,
in the list of independent variables. The extension of the space of variables involved in group
transformations, e.g., by taking into account the dependence of coordinates of the renorm-group
operator upon differential and/or non-local variables (which leads to the Lie-Bäcklund and
non-local transformation groups [70]), can also be used for constructing the FS-manifold. The use of the
Ambartsumian invariant embedding method [77] and of differential constraints sometimes allows
reformulation of a boundary condition in the form of additional DE(s) and enables one to construct
the FS-manifold as a combination of original and embedding equations (or differential constraints)
which are compatible with these equations. Finally, of particular interest is the perturbation
method of constructing the FS-manifold, which is based on the presence of a small parameter.

The second step, the calculation of the most general group $\mathcal{G}$ admitted by the FS-manifold, is
a standard procedure in group analysis and has been described in detail in many texts and
monographs (see, for example, [69, 78]).

The symmetry group $\mathcal{G}$ thus constructed cannot as yet be referred to as a renorm-group. In
order to obtain this, the next, third step should be done, which consists in restricting $\mathcal{G}$ on a solution
of a boundary value problem. This procedure utilizes the invariance condition and mathematically



appears as a “combining” of different coordinates of group generators admitted by the FS-manifold.


The final step, i.e., constructing an analytic expression for the solution of the boundary value
problem on the basis of the FS, usually presents no specific problems.

A review of the results which were obtained on the basis of the formulated scheme can be
found, for example, in [76, 79, 80].

We mention briefly the FS analysis result for a particular problem of nonlinear optics, the
problem of laser beam self-focusing in a nonlinear medium. Here, one has a BVP for a
system of two nonlinear PDEs with the boundary condition given in the form of two one-argument
functions. With the help of the RG=FS approach, one new exact analytic and one new approximate
analytic solution (for the practically important Gaussian initial transverse profile) have been found
[81].

The important qualitative features of this example are:

- the two-dimensional singularity structure has been analysed,

- the algebraic structure of the FSS operators is different from that of “usual RG of the 1
type”. Here, we meet with a Lie algebra comprising several infinitesimal operators.


Up to now the outlined regular method is feasible for systems that can be described by DEs
and is based on the formalism of modern group analysis. However, it also seems possible to extend
this approach to boundary value problems that are not described just by differential equations. A
chance of such an extension is based on recent advances in group analysis of systems of
integro-differential equations [82] which allow transformations of both dynamical variables and functionals
of a solution to be formulated [83]. More intriguing is the issue of a possibility of constructing a
regular approach for more complicated systems, in particular for those having an infinite number
of degrees of freedom. The latter can be represented in a compact form by functional (or path)
integrals.


Acknowledgments 

The author is grateful to Professor Asoke Mitra for the invitation to participate in this book. He
is indebted to D.V. Kazakov, V.F. Kovalev, and I.L. Solovtsov for useful discussion and comments.
This work was partially supported by grants of the Russian Foundation for Fundamental Research
(RFFR projects Nos 96-15-96030 and 99-01-00091) and by INTAS grant No 96-0842.


References 

[1] N.N. Bogoliubov and D.V. Shirkov, Introduction to the Theory of Quantized Fields, Wiley-Interscience,
N.Y., 1959 and 1980.

[2] N.N. Bogoliubov and D.V. Shirkov, “Problems of quantum field theory. I”, Uspekhi Fiz. Nauk
55 (1955) 149-214; ibid. 57 (1955) 3-92 - in Russian; Fortschr. der Physik 3 (1955), pp
439-495; 4 (1956), pp 438-517 - in German.

[3] N.N. Bogoliubov and O.S. Parasyuk, Doklady Akad. Nauk SSSR, 100 (1955) 25-28, 429-432
- in Russian; also Acta Mathematica, 97 (1957), 227-260.

[4] Laurie M. Brown, editor, Renormalization, Springer-Verlag, N.Y., 1993.

[5] E.C.G. Stückelberg and A. Petermann, “La normalisation des constantes dans la théorie des
quanta”, Helv. Phys. Acta, 26 (1953) 499-520.

[6] M. Gell-Mann and F. Low, “Quantum Electrodynamics at Small Distances”, Phys. Rev. 95
(1954) 1300-1312.

[7] N.N. Bogoliubov and D.V. Shirkov, “On the renormalization group in quantum
electrodynamics”, Doklady AN SSSR, 103 (1955) 203-206 - in Russian.





[8] N.N. Bogoliubov and D.V. Shirkov, “Application of the renormalization group to improve 
the formulae of perturbation theory”, Doklady Akad. Nauk SSSR, 103 (1955) 391-394 - in 
Russian. 

[9] D.V. Shirkov, Doklady AN SSSR, 105 (1955) 972 - in Russian. See also in [10]. 

[10] N.N. Bogoliubov and D.V. Shirkov, Charge renormalization group in quantum field theory,
Nuovo Cim. 3 (1956) 845-863.

[11] L.D. Landau, A.A. Abrikosov and I.M. Khalatnikov, Doklady AN SSSR, 95 (1954) 497; 773;
1117; 96 (1954) 261 - in Russian; Nuovo Cim. Supp. 3, 80-104.

[12] I.F. Ginzburg, Doklady AN SSSR 110 (1956) 535 - in Russian.

[13] A.A. Logunov, Soviet Phys. JETP 3 (1956). 

[14] P.A.M. Dirac in Theorie du Positron (7-eme Conseil du Physique Solvay: Structure et propriete 
de noyaux atomiques, Oct. 1933), Gauthier-Villars, Paris, 1934, p 203. 

[15] N. Bohr, Phys. Rev. 48 (1935), 696. 

[16] D.V. Shirkov, Sov. Phys. Doklady 27 (1982) 197; The RG method and functional self-similarity
in physics - in Nonlinear and turbulent processes in physics, Ed. R.Z. Sagdeev, Harwood
Acad. Publ., N.Y. 1984, v.3, pp 1637-1647; D.V. Shirkov, “Renormalization group in
different fields of theoretical physics”, KEK Report 91-13, Feb. 1992.

[17] M.A. Mnatsakanyan. Doklady AN SSSR. 262 (1982) 856-860. English transl. in Soviet Phys. 
Doklady 27 (1982). 

[18] L.V. Ovsyannikov, Doklady AN SSSR, 109 (1956) 1112. For English translation see pp 76-79 
in In the intermission ... , Ed. Yu.A. Trutnev, WS, 1998. 

[19] D.V. Shirkov, Theor. Math. Fiz. 119 (1999) 55; hep-th/9810246. 

[20] L.D. Landau and I.Ya. Pomeranchuk, Doklady AN SSSR, 102 (1955) 489-492; I.Ya.
Pomeranchuk, ibid., 103 (1955) 1005; 105 (1955) 461 - in Russian; Nuovo Cim. 10 (1956) 1186; see
also L.D. Landau, On the Quantum Theory of Fields - in Niels Bohr and the development of
physics, eds. W. Pauli et al., Pergamon, London, 1955, pp 52-69.

[21] E.S. Fradkin, Zh. Eksp. Teor. Fiz. 28 (1955) 750-752; English transl. in Soviet Phys. JETP 1
(1955).

[22] N.N. Bogoliubov and D.V. Shirkov, Doklady AN SSSR 105 (1955) 685-688. 

[23] S.G. Gorishny et al, Phys. Lett. 132B (1983) 351. D.I. Kazakov, Phys. Lett. 133B (1983) 
406. 

[24] D.I. Kazakov, O.V.Tarasov and D.V. Shirkov, Teor. Mat. Fiz. 38 (1979) 15. 

[25] L.N. Lipatov, Zh. Eksp. Teor. Fiz. 71 (1976) 2010. 

[26] D.I. Kazakov and D.V. Shirkov, Fortschr. d. Phys. 28 (1980) 465-499. 

[27] D.V. Shirkov, Sov. J. Nucl Phys. 34(2) (1981) 300-2; Theor. Math. Fiz. 49 (1981) 1039-42. 

[28] D.V. Shirkov, Nucl. Phys. B 371 (1992) 467-81. 

[29] D.V. Shirkov, Mass Effects in Running Coupling Evolution and Hard Processes, in Perspectives
in Particle Physics, Eds. D. Klabucar et al., WS, 1995, pp 1-13; On continuous mass-dependent
analysis of DIS data, in Proc. EPSHEP95 Conf. (Bruxelles, July 1995), Eds. J. Lemonne et al.,
WS, pp 141-2.





[30] D.V. Shirkov, Mass and Scheme Effects in Coupling Constant Evolution, Teor. Mat. Fizika
93 (1992) 466-72.

[31] U. Amaldi et al., Phys. Lett. B 260 (1991) 447-55; M.B. Einhorn and D.R.T. Jones, Nucl. Phys.
B 196 (1982) 475.

[32] D.V. Shirkov, Sov. J. Nucl. Phys. 34 (1981) 300-2; Teor. Mat. Fiz. (USSR) 49 (Dec. 1981)
291-7 [pp 1039-43 in the American ed.].

[33] V.Z. Blank and D.V. Shirkov, Nucl. Phys. 2 (1956) 356-70.

[34] D.V. Shirkov, Nucl. Phys. B 371 (1992) 467-81.

[35] S. Weinberg, Phys.Rev.D 8 (1973) 605-625. 

[36] K. Wilson, Phys.Rev. 179 (1969) 1499-1515. 

[37] G. Altarelli and G. Parisi, Nucl Phys. B 126 (1977) 298-318. 

[38] D. Gross and P. Wilczek, Phys. Rev. D8 (1973) 3633-52; 

H. Politzer, Phys. Rev. Lett. 30 (1973) 1346-49. 

[39] D.I. Kazakov and D.V. Shirkov, Singular Solutions of RG Eqs. and the Symmetry of the
Lagrangian, JINR Preprint E2-8974, 1975, in High Energy Particle Interaction (Proceed. 1975
Smolenice Conf.), Eds. D. Krupa & J. Pisut, Veda, Bratislava, 1976, 255-78.

[40] W. Zimmermann, Comm. Math. Phys. 97 (1985) 211. 

[41] R. Oehme and W. Zimmermann, Comm. Math. Phys. 97 (1985) 569; R. Oehme, K. Sibold
and W. Zimmermann, Phys. Lett. B 147 (1984) 115; B153 (1985) 142.

[42] W. Zimmermann, in Renormalization Group, Eds. D.V. Shirkov et al., WS, Singapore, 1988, pp
55-64.

[43] A.V. Ermushev, D.I. Kazakov, O.V. Tarasov, Nucl Phys. B 281 (1987) 72. 

[44] D.I. Kazakov, Phys. Lett. B 421 (1998) 211-216. 

[45] A. Dhar, Phys. Lett. 128 B (1983) 407. 

[46] A. Dhar and V.G. Gupta, Pramana 21 (1983) 207; Phys. Rev. D 29 (1984) 2822.

[47] V.G. Gupta, D.V. Shirkov and O. Tarasov, Int. J. Mod. Phys. A 6 (1991) 3381.

[48] D.V. Shirkov and I.L. Solovtsov, JINR Rapid Comm. No. 2[76]-96 (1996) 5, hep-ph/9604363;
Phys. Rev. Lett. 79 (1997) 1209, hep-ph/9704333.

[49] D.V. Shirkov, Nucl. Phys. (Proc. Suppl.) B 64 (1998) 106, hep-ph/9708480.

[50] N.N. Bogoliubov, A.A. Logunov and D.V. Shirkov, Sov. Phys. JETP 10 (1959) 574. 

[51] K.A. Milton, I.L. Solovtsov and O.P. Solovtsova, Phys. Lett. B 415 (1997) 104. 

[52] L. Kadanoff, Physica 2 (1966) 263. 

[53] K. Wilson, Phys.Rev. B4 (1971) 3174-3183. 

[54] P.G. De Gennes, Phys. Lett. 38A (1972) 339-40; J. des Cloizeaux, J. Physique (Paris) 36
(1975) 281.

[55] V.I. Alkhimov, Theor. Mat. Fiz. 39 (1979) 281; 59 (1984) 432. 





[56] T.L. Bell et al, Phys. Rev. A17 (1978) 1049-1057; 

G.F. Chapline, Phys. Rev. A21 (1980) 1263-1271. 

[57] B.V. Chirikov, Lecture notes in physics 179 (1983) 29; B.V. Chirikov & D.L. Shepelansky,
Chaos Border and Statistical Anomalies - in [58], 221; Yu. G. Sinai & K.M. Khanin, Renormalization
group method in the theory of dynamical systems - in [58], 251; A. Peterman and
A. Zichichi, Nuovo Cimento 109A (1996) 341.

[58] Renormalization Group, (Proceed. 1986 Dubna Conference), Eds. D.V. Shirkov, D.I. Kazakov
and A.A. Vladimirov, WS, Singapore, 1988.

[59] C. DeDominicis and P. Martin, Phys. Rev. A19 (1979) 419-422. 

[60] L. Adjemyan et al, Teor. Mat. Fiz. 58 (1984) 72; 65 (1985) 196; A.N. Vasiliev, Quantum 
Field Renormalization Group in the Theory of Turbulence and in Magnetic Hydrodynamics, 
in [58], pp 146-159. 

[61] A.N. Vasiliev, Quantum Field Renormalization Group in the Theory of Critical Behavior and
Stochastic Dynamics, PINF Publ., St-Petersburg, 1998, 773 pp - in Russian, also Gordon &
Breach, Amsterdam (in press).

[62] G. Pelletier, Plasma Phys. 24 (1980) 421. 

[63] Ja.B. Zel’dovich and G.I. Barenblatt, Sov. Phys. Doklady 3(1) (1958) 44-47; see also in [64]. 

[64] G.I. Barenblatt, Scaling, self-similarity, and intermediate asymptotics, Cambridge Univ. Press, 
1996. 

[65] D.V. Shirkov, Renormalization group in modern physics, in [58] pp 1-32, Int. J. Mod. Phys. 
A3 (1988) 1321-1341; Several topics on renorm-group theory, in [66] , pp 1-10. 

[66] Renormalization Group ’91 (Proceed, of 1991 Dubna Conf.), Eds. D.V. Shirkov and V.B. Priez- 
zhev, WS, Singapore, 1992. 

[67] V.F. Kovalev and V.V. Pustovalov, Teor. Mat. Fizika 81 (1990) 1061-1071; also: Strong 
nonlinearity and generation of high harmonics in laser plasma, in Proceed. Conf. Plasma Physics 
(Kiev, USSR, April 1987), Ed. A.G. Sitenko, Naukova Dumka, Kiev, 1987, 1, 271; Influence 
of laser plasma temperature on the high harmonics generation process, ibid., 1, 274. 

[68] V.F. Kovalev, S.V. Krivenko, V.V. Pustovalov, The Renormalization group method based on 
group analysis - in [66], 300-314. 

[69] L.V. Ovsyannikov, Group analysis of differential equations, Acad. Press, N.Y., 1982. 

[70] N.H. Ibragimov, Transformation groups applied to mathematical physics, Reidel Publ, 
Dordrecht-Lancaster, 1985. 

[71] M. S. Lie, Gesammelte Abhandlungen, Leipzig-Oslo, Bd.5, 1924; Bd.6, 1927. 

[72] D.V. Shirkov, Bogoliubov renormgroup, Russian Math. Surveys 49:5 (1994) 155 - with mis- 
prints. For corrected version see: The Bogoliubov Renormalization Group (second English 
printing), JINR Comm. E2-96-15; hep-th/9602024. 

[73] D.V. Shirkov, Intern. J. Mod. Physics C 6 (1995) 503-512. 

[74] V.F. Kovalev, RG- symmetries: constructing and applications - in [75], 263. 

[75] Renormalization Group '96, (Proc. 1996 Dubna Conf.), Eds. D.V. Shirkov, D.I. Kazakov and
V.B. Priezzhev, JINR Publ., Dubna, 1997.





[76] V.F. Kovalev, V.V. Pustovalov, D.V. Shirkov, J. Math. Phys. 39 (1998) 1170-1188;
hep-th/9706056.

[77] The embedding method was introduced by V.A. Ambartzumyan in Astr. Journ. 19 (1942) 30
- in Russian; later on this method enjoyed wide application to different problems; see, e.g.,
J. Casti, R. Kalaba, Imbedding methods in applied mathematics, Addison-Wesley, Reading, Ma,
1973 and references therein.

[78] Peter J. Olver, Applications of Lie groups to differential equations, Springer, N.Y., 1986; CRC
Handbook of Lie Group Analysis of Diff. Equations, Ed. N.H. Ibragimov (CRC Press, Boca Raton,
Florida, USA) - in three volumes, 1994-1996.

[79] V.F. Kovalev, Group and renormgroup symmetries of boundary value problems, in Modern
group analysis VI, Eds. N.H. Ibragimov and F.M. Mahomed, 1997, New Age Internat'l (P) Ltd
Publ., India, N. Delhi, 225.

[80] V.F. Kovalev and D.V. Shirkov, Teor. Mat. Fizika 121 (Oct. 1999) No. 1.

[81] V.F. Kovalev and D.V. Shirkov, J. Nonlin. Opt. Phys. & Mater. 6 (1997) 443.

[82] V.F. Kovalev, S.V. Krivenko and V.V. Pustovalov, Differ. Equations 29 (1993) No 10, 1568;
No 11, 1712.

[83] V.F. Kovalev, S.V. Krivenko and V.V. Pustovalov, J. Nonlin. Math. Phys. 3 (1996) 175-180.



2. The Similarity Renormalization Group 


Robert J. Perry and Sergio Szpigel 
Department of Physics 

The Ohio State University, Columbus, OH 43210 


Abstract 

Quantum field theories require a cutoff to regulate divergences that result from local inter- 
actions, and yet physical results can not depend on the value of this cutoff. The renormalization 
group employs a transformation that changes the cutoff to isolate hamiltonians that produce 
cutoff-independent eigenvalues. The similarity renormalization group is based on similarity 
transformations that regulate off-diagonal matrix elements, forcing the hamiltonian towards
a band-diagonal form as the cutoff is lowered. This avoids pathologies that plagued traditional
transformations acting on hamiltonians, making it possible to produce a well-behaved
perturbative approximation of renormalized hamiltonians in asymptotically free theories. We 
employ a simple two-dimensional delta function example to illustrate this new renormalization 
technique. 


1 Introduction 

Early attempts to combine quantum mechanics and special relativity led to the consideration of 
local interactions, which are consistent with causality and avoid signals that propagate faster than 
light. Local interactions lead to divergences in perturbation theory, whose discovery caused some 
of the best theorists in the world to question the foundations of quantum mechanics. Eventual suc- 
cesses at fitting precise atomic experimental data led to the universal acceptance of renormalization 
recipes that were acknowledged to make little sense [1]. Initially the perturbative renormalization
of QED required theorists to match perturbative expansions in powers of a bare and physical elec- 
tronic charge [2], but the bare charge clearly diverges logarithmically in QED and the success of 
an expansion in powers of such a coupling was mysterious at best [3]. 

The first steps towards making sense of renormalization theory were taken in the 1950’s with 
the invention of the perturbative renormalization group [4, 5, 6, 7, 8], although serious investigators
found the theory was still plagued by non-convergent sums because QED is not asymptotically
free. The development of Wilson’s renormalization group formalism [9, 10] and the discovery of 
asymptotic freedom [11] allowed physicists to produce a logically reasonable picture of renormalization
in which perturbative expansions at any high energy scale can be matched with one another,
with no necessity to deal with intermediate expansions in powers of a large parameter. 

In this pedagogical article we take advantage of the fact that the divergences in field theory re- 
sult entirely from local interactions. To understand the most important aspects of renormalization 
theory requires only a background in nonrelativistic quantum mechanics because, as has long been
known, the divergences of field theory are directly encountered when one tries to impose locality
on the Schrödinger equation. In this case the interactions we consider that are local at all scales
are delta functions and derivatives of delta functions. These divergences can be regulated by the
introduction of a cutoff, and the artificial effects of this cutoff must be removed by renormalization.
The simplicity of the one-body Schrödinger equation makes it possible to renormalize the theory
exactly, disentangling the effects of locality from the complicated many-body effects and symmetry
constraints encountered in realistic field theories. There is a large literature on the subject [12]-[29],
primarily pedagogical.


E-mail: perry@mps.ohio-state.edu





Figure 1: Two ways to run a cutoff on free energy. In (a) a cutoff on the magnitude of the energy
is lowered from the solid to the dashed lines, with problems resulting from the removed shaded
region. In (b) a cutoff on how far off diagonal matrix elements appear is lowered from the dashed
to the solid lines.


The similarity renormalization group (SRG) is a very recent development invented by Stan
Glazek and Ken Wilson [30, 31], and independently by Franz Wegner [32]. We do not review the 
applications of this method, which are growing in number. 

In the SRG, as in Wilson’s original renormalization group formalism [33, 34], transformations 
that explicitly run the cutoff are developed. These transformations are the group elements that 
give the renormalization group its name. 

In his earliest work [33, 34] Wilson exploited a transformation originally invented by Claude 
Bloch [35]. It uses a cutoff on the states themselves, and as the cutoff is lowered, states are 
removed from the Hilbert space. If the hamiltonian is viewed as a matrix, these cutoffs can be seen 
as limiting the size of this matrix and the transformation reduces this size, as illustrated in Fig.
1a. Wilson introduced a rescaling operation to allow transformed hamiltonians to be compared
with initial hamiltonians, despite the fact that they act on different spaces; however, the Bloch
transformation is ill-defined and even in perturbation theory it leads to artificial divergences. These
divergences come from the small energy differences between states retained and states removed by
the transformation, and they appear in the form of small energy denominators in the perturbative
expansion of the transformed hamiltonian. These small energy denominator problems led Wilson
to abandon the hamiltonian formulation of field theory in favor of path integral formulations, but
the virtues of the hamiltonian formulation over the path integral formulation for many problems
remain.

The breakthrough provided by the SRG is that the transformations are typically unitary, mak- 
ing them well-defined, and they run a cutoff on energy differences rather than on individual states, 
as illustrated in Fig. 1b. Again viewing the hamiltonian as a large matrix, these cutoffs limit the
off-diagonal matrix elements and as they are reduced the hamiltonian is forced towards diagonal 
form. The perturbative expansion for transformed hamiltonians contains no small energy denom- 
inators, so the expansion breaks down only when interactions become sufficiently strong, in which 
case perturbation theory should fail in any case. 

Although the SRG has not yet been applied to a wide range of problems, it may be an important
new tool both for attacking field theories and non-relativistic many-body problems.

When the SRG is used with coupling coherence [36, 37], which we explain below, it allows
us to construct effective theories with the same number of free parameters as the underlying
'fundamental' theory. For the delta-function example there is one fundamental parameter, the
strength of the regulated delta-function as the cutoff is removed. In the SRG with coupling
coherence, there is only one fundamental coupling and all new couplings are perturbative functions
of the fundamental coupling that are given by coupling coherence. It is the renormalization group
flow of the added couplings, and a boundary condition that they vanish when the fundamental
coupling is taken to zero, that fixes their dependence on the fundamental coupling.

The examples we use in this article do not illustrate non-Gaussian fixed points, so their scaling
properties are driven by naive dimensional analysis. However, we will see that even in these cases
scaling behavior of effective hamiltonians derived using a perturbative similarity renormalization
group can be very complicated. We will see that in the perturbative SRG there are errors arising
from the approximate treatment of the fundamental running coupling and the approximate
treatment of the relation between this coupling and the new couplings of irrelevant operators.

In a realistic calculation the marginal coupling, which corresponds to the strength of the regulated
delta function, would be fit to data. In order to clearly illustrate the logarithmic errors
that result from using the perturbative SRG equations, we approximate this marginal coupling
in this article rather than renormalizing it nonperturbatively by fitting data. The strengths of
the irrelevant operators, which correspond to derivatives of the regulated delta function, are approximated
using expansions in powers of the approximate running coupling that are fixed by
coupling coherence. The approximate running coupling differs from the exact running coupling by
inverse powers of logarithms of the cutoff, and the error analysis for the binding energy displays
the resultant inverse logarithmic errors in addition to power-law errors seen in all approximate
renormalization group calculations. In addition there are errors in the strengths of the irrelevant
operators resulting from using a truncated expansion in powers of the running coupling and an
approximate running coupling, both of which introduce inverse logarithmic errors in addition to
the power-law errors normally seen.

The utility of the renormalization group rests on our ability to accurately determine and control
the magnitude of errors resulting from the artificial cutoff. For perturbative calculations this issue
is not critical, but in all field theories (and in our example) a scale is reached where the coupling
becomes large and a non-perturbative calculation must be done. The renormalization group allows
us to eliminate as much perturbative physics as possible (i.e., lower the cutoff as far as possible in an
asymptotically free theory), so that the essential degrees of freedom that couple non-perturbatively
can be isolated.


2 Similarity Renormalization Group 

In this section we review the general formulation of the SRG developed by Glazek and Wilson
[30, 31] and a specific transformation developed by Wegner [32]. The reader may wish to skip the
general formulation on a first reading.

2.1 Glazek-Wilson Formulation

Consider a system described by a hamiltonian written in the form

$$ H = h + v , \tag{1} $$

where $h$ is the free hamiltonian and $v$ is an interaction.

In general, the hamiltonian can couple states of all energy scales and such couplings can be 
a source of ultraviolet divergences. The goal of the SRG is to obtain an effective hamiltonian in 
which the couplings between high and low-energy states are removed, while avoiding any problems
from small energy denominators in effective interactions. The procedure is implemented by a
unitary transformation that generates effective interactions that reproduce the effects of the eliminated
couplings. The effective hamiltonian cannot produce ultraviolet divergences at any order in
perturbation theory as long as its matrix elements are finite. 

In our discussion we will use the basis of eigenstates of the free hamiltonian,

$$ h | i \rangle = \epsilon_i | i \rangle . \tag{2} $$

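We start by defining a bare hamiltonian, $H_\Lambda$, regulated by a very large cutoff $\Lambda$ (here with
dimensions of energy) on the change in free energy at the interaction vertices,

$$ H_\Lambda = h + V_\Lambda , \tag{3} $$
$$ V_\Lambda = f_\Lambda \overline{V}_\Lambda , \tag{4} $$
$$ \overline{V}_\Lambda = v + H^{ct}_\Lambda , \tag{5} $$

where $f_\Lambda$ is a "similarity function", $\overline{V}_\Lambda$ is defined as the reduced interaction and $H^{ct}_\Lambda$ are
counterterms that must be determined through the process of renormalization in order to remove $\Lambda$
dependence in physical quantities.

The similarity function $f_\Lambda$ regulates the hamiltonian by suppressing matrix elements between
free states with significantly large energy difference and acts in the following way:

$$ \langle i | H_\Lambda | j \rangle = \epsilon_i \delta_{ij} + f_\Lambda(\epsilon_i - \epsilon_j) \, \langle i | \overline{V}_\Lambda | j \rangle
   = \epsilon_i \delta_{ij} + f_{\Lambda ij} \overline{V}_{\Lambda ij} . \tag{6} $$

Typically, the similarity function is chosen to be a smooth function satisfying

$$ f_\Lambda(\epsilon_i - \epsilon_j) \to 1 , \quad \mbox{when } |\epsilon_i - \epsilon_j| \ll \Lambda , $$
$$ f_\Lambda(\epsilon_i - \epsilon_j) \to 0 , \quad \mbox{when } |\epsilon_i - \epsilon_j| \gg \Lambda . \tag{7} $$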

In several papers the similarity function has been chosen to be a step function. Although useful 
for doing analytic calculations, such a choice can lead to pathologies. 
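
As an illustration of Eqs. (6) and (7), the following minimal sketch applies a smooth similarity function to a small model interaction. The Gaussian form, the model energies, and the names `similarity_function` and `regulate` are assumptions made for this example only; the discussion above requires nothing more than the two limits in Eq. (7).

```python
import numpy as np

def similarity_function(eps, cutoff):
    """Assumed Gaussian form f(e_i - e_j) = exp(-((e_i - e_j) / cutoff)**2)."""
    delta = eps[:, None] - eps[None, :]      # free-energy differences e_i - e_j
    return np.exp(-(delta / cutoff) ** 2)

def regulate(eps, v_bar, cutoff):
    """Matrix elements e_i * delta_ij + f_ij * Vbar_ij, as in Eq. (6)."""
    return np.diag(eps) + similarity_function(eps, cutoff) * v_bar

eps = np.arange(6, dtype=float)               # hypothetical free energies 0..5
v_bar = 0.5 * np.ones((6, 6))                 # hypothetical reduced interaction
H_large = regulate(eps, v_bar, cutoff=100.0)  # f is close to 1 everywhere
H_small = regulate(eps, v_bar, cutoff=1.0)    # far off-diagonal elements suppressed
```

With the large cutoff the interaction is essentially unchanged, while with the small cutoff only matrix elements between states of nearby free energy survive, which is the band-diagonal structure sketched in Fig. 1b.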

The similarity transformation is defined to act on the bare regulated Hamiltonian, $H_\Lambda$, lowering
the cutoff down to a scale $\lambda$:

$$ H_\lambda = U(\lambda,\Lambda) \, H_\Lambda \, U^\dagger(\lambda,\Lambda) . \tag{8} $$

The renormalized Hamiltonian can be written in the general form

$$ H_\lambda = h + V_\lambda , \tag{9} $$
$$ V_\lambda = f_\lambda \overline{V}_\lambda . \tag{10} $$

The transformation is unitary, so $H_\Lambda$ and $H_\lambda$ produce the same spectra for observables. Also, if an
exact transformation is implemented, the physical predictions using the renormalized Hamiltonian
must be independent of the cutoff $\lambda$, and $H^{ct}_\Lambda$ is chosen so that they also become independent of $\Lambda$
as $\Lambda \to \infty$.

The unitarity condition is given by: 


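$$ U(\lambda,\Lambda) \, U^\dagger(\lambda,\Lambda) = U^\dagger(\lambda,\Lambda) \, U(\lambda,\Lambda) = 1 . \tag{11} $$

The similarity transformation $U$ can be defined in terms of an anti-hermitian operator $T_\lambda$
($T_\lambda^\dagger = -T_\lambda$) which generates infinitesimal changes of the cutoff energy scale,

$$ U(\lambda,\Lambda) = \mathcal{T} \exp\left( \int_\lambda^\Lambda T_{\lambda'} \, d\lambda' \right) , \tag{12} $$

where $\mathcal{T}$ orders operators from left to right in order of increasing energy scale $\lambda'$. Using

$$ T_\lambda = U(\lambda,\Lambda) \, \frac{dU^\dagger(\lambda,\Lambda)}{d\lambda} , \tag{13} $$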




and the unitarity condition Eq. (11), we can write Eq. (8) in a differential form,

$$ \frac{dH_\lambda}{d\lambda} = [H_\lambda, T_\lambda] . \tag{14} $$

This is a first-order differential equation, which is solved with the boundary condition
$H_\lambda |_{\lambda \to \Lambda} = H_\Lambda$. The bare Hamiltonian is typically given by the canonical Hamiltonian plus
counterterms that must be uniquely fixed to complete the renormalization.
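
The step from Eq. (8) to Eq. (14) can be spelled out as follows. This is a short sketch of the standard argument; the shorthand $U \equiv U(\lambda,\Lambda)$ is introduced here only for brevity.

```latex
\begin{align*}
\frac{dH_\lambda}{d\lambda}
  &= \frac{dU}{d\lambda}\, H_\Lambda U^\dagger + U H_\Lambda \frac{dU^\dagger}{d\lambda}
   = \frac{dU}{d\lambda} U^\dagger H_\lambda + H_\lambda\, U \frac{dU^\dagger}{d\lambda} , \\
0 &= \frac{d}{d\lambda}\left( U U^\dagger \right)
   \;\Rightarrow\;
   \frac{dU}{d\lambda} U^\dagger = - U \frac{dU^\dagger}{d\lambda} = - T_\lambda , \\
\frac{dH_\lambda}{d\lambda}
  &= - T_\lambda H_\lambda + H_\lambda T_\lambda = [H_\lambda, T_\lambda] .
\end{align*}
```

The first line inserts $U^\dagger U = 1$ next to $H_\Lambda$, the second differentiates the unitarity condition, and the third combines the two using the definition of $T_\lambda$.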

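The operator $T_\lambda$ is defined by specifying how $\overline{V}_\lambda$ and $h$ depend on the cutoff scale $\lambda$. For
simplicity in this article, we demand that $h$ is independent of $\lambda$, although this may not lead to
an increasingly diagonal effective hamiltonian in all cases. We also demand that no small energy
denominators can appear in the hamiltonian. These constraints are implemented by the conditions

$$ \frac{dh}{d\lambda} = 0 , \tag{15} $$
$$ \frac{d\overline{V}_\lambda}{d\lambda} = [V_\lambda, T_\lambda] . \tag{16} $$

To obtain the renormalized Hamiltonian perturbatively, expand

$$ \overline{V}_\lambda = \overline{V}_\lambda^{(1)} + \overline{V}_\lambda^{(2)} + \cdots , \tag{17} $$
$$ T_\lambda = T_\lambda^{(1)} + T_\lambda^{(2)} + \cdots , \tag{18} $$
$$ H^{ct}_\Lambda = H^{(2),ct}_\Lambda + H^{(3),ct}_\Lambda + \cdots , \tag{19} $$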

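where the superscripts denote the order in the original interaction, $v$. A general form of these
effective interactions is

$$ \overline{V}_\lambda^{(i)} = - \sum_{j+k=i} \int_\lambda^\Lambda d\lambda' \, [ V_{\lambda'}^{(j)} , T_{\lambda'}^{(k)} ] + H^{(i),ct}_\Lambda , \tag{20} $$

for $i = 2, 3, \cdots$, with $\overline{V}_\lambda^{(1)} = v$. For instance, the explicit form of the second-order effective
interaction $\overline{V}_\lambda^{(2)}$ is

$$ \overline{V}^{(2)}_{\lambda ij} = \sum_k v_{ik} v_{kj} \left[ \frac{g^{(\lambda\Lambda)}_{ikj}}{\epsilon_i - \epsilon_k}
   + \frac{g^{(\lambda\Lambda)}_{jki}}{\epsilon_j - \epsilon_k} \right] + H^{(2),ct}_{\Lambda ij} , \tag{21} $$

where

$$ g^{(\lambda\Lambda)}_{ikj} = \int_\lambda^\Lambda d\lambda' \, f_{\lambda' jk} \, \frac{d f_{\lambda' ki}}{d\lambda'} , \tag{22} $$
$$ g^{(\lambda\Lambda)}_{jki} = \int_\lambda^\Lambda d\lambda' \, f_{\lambda' ik} \, \frac{d f_{\lambda' kj}}{d\lambda'} . \tag{23} $$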

The counterterms can be determined order-by-order using the idea of coupling-coherence
[36, 37]. This is implemented by requiring the hamiltonian to reproduce itself in form under the
similarity transformation, the only change being explicit dependence on the running cutoff in the
operators and the implicit cutoff dependence in a finite number of independent running couplings.
All other couplings depend on the cutoff only through their dependence on the independent couplings.
In general, we also demand the dependent couplings to vanish when the independent
couplings are taken to zero; i.e., the interactions are turned off. If the only independent coupling
in the theory is $\alpha_\lambda$, the renormalized hamiltonian can be written as an expansion in powers of this
coupling:

$$ H_\lambda = h + \alpha_\lambda O^{(1)} + \alpha_\lambda^2 O^{(2)} + \cdots . \tag{24} $$

In this way, the effective hamiltonian obtained using the similarity transformation is completely 
determined by the underlying theory. The procedure can be extended to arbitrarily high orders, 
although it becomes increasingly complex both analytically and numerically. 





2.2 Wegner Formulation 

The Wegner formulation of the SRG [32] is defined in a very elegant way in terms of a flow equation
analogous to the SRG Equation in the Glazek-Wilson formalism [30, 31],

$$ \frac{dH_s}{ds} = [H_s, T_s] . \tag{25} $$

Here the hamiltonian $H_s = h + V_s$ evolves with a flow parameter $s$ that ranges from 0 to $\infty$.
The flow-parameter has dimensions $1/(\mathrm{energy})^2$ and is given in terms of the similarity cutoff $\lambda$ by
$s = 1/\lambda^2$.

In Wegner's scheme the similarity transformation is defined by an explicit form for the generator
of the similarity transformation, $T_s = [H_s, H_0]$, which corresponds to the choice of a gaussian
similarity function with uniform width. In the original formulation, Wegner advocates the inclusion
of the full diagonal part of the hamiltonian at scale $s$ in $H_0$. For a perturbative calculation of $H_s$,
we can use the free hamiltonian, $H_0 = h$. With this choice, the flow equation for the hamiltonian
is given by

$$ \frac{dH_s}{ds} = [H_s, [H_s, h]] . \tag{26} $$
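
A flow equation of this double-commutator form is easy to explore numerically. The sketch below integrates dH/ds = [H, [H, h]] for a small model matrix with a hand-written fourth-order Runge-Kutta step. The 4x4 model, the step count, and the function names (`wegner_rhs`, `flow`, `off_diagonal_norm`) are illustrative assumptions, not part of the formalism described here.

```python
import numpy as np

def wegner_rhs(H, h):
    """Right-hand side of dH/ds = [H, T] with generator T = [H, h]."""
    T = H @ h - h @ H
    return H @ T - T @ H

def flow(H0, h, s_final, steps=2000):
    """Integrate the flow from s = 0 to s_final with classical RK4 steps."""
    H = H0.copy()
    ds = s_final / steps
    for _ in range(steps):
        k1 = wegner_rhs(H, h)
        k2 = wegner_rhs(H + 0.5 * ds * k1, h)
        k3 = wegner_rhs(H + 0.5 * ds * k2, h)
        k4 = wegner_rhs(H + ds * k3, h)
        H = H + ds * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return H

def off_diagonal_norm(M):
    """Frobenius norm of the off-diagonal part of M."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

# Hypothetical model: free energies 0..3 and a weak uniform coupling.
h = np.diag([0.0, 1.0, 2.0, 3.0])
V = 0.1 * (np.ones((4, 4)) - np.eye(4))
H0 = h + V
Hs = flow(H0, h, s_final=2.0)
```

In this toy model the eigenvalues of `Hs` agree with those of `H0` to the accuracy of the integrator, as expected for a unitary flow, while the matrix elements connecting states with large free-energy differences are strongly suppressed.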

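The reduced interaction, $\overline{V}_{sij}$ (the interaction with the gaussian similarity function factored
out) is defined by

$$ V_{sij} = f_{sij} \overline{V}_{sij} , \tag{27} $$
$$ f_{sij} = e^{-s \Delta_{ij}^2} , \tag{28} $$

with $\Delta_{ij} = \epsilon_i - \epsilon_j$. Assuming that the free hamiltonian is independent of $s$, we obtain the flow
equation for the reduced interaction,

$$ \frac{d\overline{V}_{sij}}{ds} = \sum_k \left( \Delta_{ik} + \Delta_{jk} \right) \overline{V}_{sik} \overline{V}_{skj} \, e^{-2 s \Delta_{ik} \Delta_{jk}} , \tag{29} $$

where we use $\Delta_{ij}^2 - \Delta_{ik}^2 - \Delta_{jk}^2 = -2 \Delta_{ik} \Delta_{jk}$. We should emphasize that this is an exact equation.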
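
For readers who want the intermediate steps, here is a sketch of how the flow equation for the reduced interaction follows from $dH_s/ds = [H_s,[H_s,h]]$. It assumes the notation $\Delta_{ij} = \epsilon_i - \epsilon_j$ and the gaussian factorization $V_{sij} = e^{-s\Delta_{ij}^2}\,\overline{V}_{sij}$, and uses the fact that $h$ is diagonal in the free basis.

```latex
\begin{align*}
[H_s, h]_{ij} &= -\Delta_{ij} V_{sij} , \\
\frac{dV_{sij}}{ds} &= -\Delta_{ij}^2 V_{sij}
   + \sum_k \left( \Delta_{ik} + \Delta_{jk} \right) V_{sik} V_{skj} , \\
\frac{dV_{sij}}{ds} &= e^{-s\Delta_{ij}^2}
   \left( \frac{d\overline{V}_{sij}}{ds} - \Delta_{ij}^2 \overline{V}_{sij} \right) , \\
\frac{d\overline{V}_{sij}}{ds}
  &= \sum_k \left( \Delta_{ik} + \Delta_{jk} \right) \overline{V}_{sik} \overline{V}_{skj}\,
     e^{s(\Delta_{ij}^2 - \Delta_{ik}^2 - \Delta_{jk}^2)}
   = \sum_k \left( \Delta_{ik} + \Delta_{jk} \right) \overline{V}_{sik} \overline{V}_{skj}\,
     e^{-2 s \Delta_{ik}\Delta_{jk}} .
\end{align*}
```

The linear term $-\Delta_{ij}^2 V_{sij}$ is exactly what the gaussian factor absorbs, so the reduced interaction changes only at second order in the interaction, and the last step uses $\Delta_{ij} = \Delta_{ik} - \Delta_{jk}$.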

To solve this equation we impose a boundary condition, $H_s |_{s \to s_0} = H_{s_0}$. Then, we make a
perturbative expansion,

$$ \overline{V}_s = \overline{V}_s^{(1)} + \overline{V}_s^{(2)} + \cdots , \tag{30} $$

where the superscript implies the order in the bare interaction $\overline{V}_{s_0}$. It is important to observe that
counterterms are implicit in the bare interaction and can be determined in the renormalization
process using coupling coherence.

At first order we have

$$ \frac{d\overline{V}^{(1)}_{sij}}{ds} = 0 , \tag{31} $$

which implies

$$ \overline{V}^{(1)}_{sij} = \overline{V}_{s_0 ij} , \tag{32} $$

where $s$ is the final scale. Because of the dimensions of the flow parameter we have $s > s_0$,
corresponding to a smaller cutoff. The "no cutoff limit" corresponds to $s_0 \to 0$.

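At second order we have

$$ \frac{d\overline{V}^{(2)}_{sij}}{ds} = \sum_k \left( \Delta_{ik} + \Delta_{jk} \right) \overline{V}_{s_0 ik} \overline{V}_{s_0 kj} \, e^{-2 s \Delta_{ik} \Delta_{jk}} . \tag{33} $$

Integrating, we obtain

$$ \overline{V}^{(2)}_{sij} = \frac{1}{2} \sum_k \overline{V}_{s_0 ik} \overline{V}_{s_0 kj} \left( \frac{1}{\Delta_{ik}} + \frac{1}{\Delta_{jk}} \right)
   \left[ e^{-2 s_0 \Delta_{ik} \Delta_{jk}} - e^{-2 s \Delta_{ik} \Delta_{jk}} \right] . \tag{34} $$
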
By construction, the Wegner transformation is unitary and avoids small energy denominators.
The Wegner transformation is one of the Glazek-Wilson transformations, with the similarity
function chosen to be $f_{\lambda ij} = e^{-\Delta_{ij}^2/\lambda^2}$.


2.3 Strategy

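In our applications of the SRG we use Wegner's transformation. The renormalized hamiltonian
for the non-relativistic delta-function potential in D-dimensions is given by

$$ H_\lambda(p,p') = p^2 \delta^{(D)}(p - p') + e^{-(p^2 - p'^2)^2/\lambda^4}
   \left[ \overline{V}^{(1)}_\lambda(p,p') + \overline{V}^{(2)}_\lambda(p,p') + \cdots \right] , \tag{35} $$

where

$$ \overline{V}^{(1)}_\lambda(p,p') = - \frac{\alpha_{\lambda,i}}{(2\pi)^D} , \tag{36} $$
$$ \overline{V}^{(2)}_\lambda(p,p') = \alpha_{\lambda,i}^2 \, F^{(2)}_s(p,p') , \tag{37} $$
$$ \overline{V}^{(n)}_\lambda(p,p') = \alpha_{\lambda,i}^n \, F^{(n)}_s(p,p') . \tag{38} $$

Here $\lambda$ is a momentum cutoff (as opposed to the energy cutoff discussed above) related to the flow
parameter by $s = 1/\lambda^4$. The index $i$ denotes the order of the calculation for the running coupling.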

The renormalized hamiltonian can be used to compute eigenvalues and eigenstates. Since the
hamiltonian is derived perturbatively we expect cutoff dependent errors in the observables. Formally,
we can regroup the terms in the renormalized hamiltonian and write it as a momentum
expansion, and the expansion parameters are analytic functions of the running coupling $\alpha_\lambda$.
Expanding the operators in powers of $p^2/\lambda^2$ we obtain

$$ F^{(i)}_s(p,p') = z_0 + z_2 \frac{(p^2 + p'^2)}{\lambda^2} + z_4 \frac{(p^4 + p'^4)}{\lambda^4} + z_4' \frac{p^2 p'^2}{\lambda^4} + \cdots , \tag{39} $$

where the $z_i$'s are constants. Regrouping the terms we obtain

$$ H_\lambda(p,p') = p^2 \delta^{(D)}(p - p') + e^{-(p^2 - p'^2)^2/\lambda^4}
   \left[ g_0(\alpha_\lambda) + g_2(\alpha_\lambda) \frac{(p^2 + p'^2)}{\lambda^2} + \cdots \right] , \tag{40} $$

where

$$ g_i(\alpha_\lambda) = a_i \alpha_\lambda + b_i \alpha_\lambda^2 + \cdots . \tag{41} $$

We can identify three interdependent sources of errors in the perturbative similarity renormal- 
ization group when the hamiltonian given by Eq. (40) is truncated and used to compute a physical 
quantity: 


a) errors introduced by the truncation of the hamiltonian at a given order in $p^2/\lambda^2$;

b) errors introduced by the truncation of the hamiltonian at a given order in the running coupling
$\alpha_{\lambda,i}$, which correspond to the use of an approximation for the functions $g_i$;

c) errors introduced by the approximation for the running coupling $\alpha_{\lambda,i}$.


In the actual calculation using the hamiltonian given by Eq. (35) errors of type (a) do not appear 
directly because we do not truncate the operators that appear in the hamiltonian. However, errors 
of type (b) can be understood as coming from approximating the couplings in front of the operators 
in Eq. (40). Errors of type (c) appear in our calculations only because we do not fit the canonical 
coupling to data at each scale, but fix it at a given scale and evolve it perturbatively from that 
scale. The strategy we would use for a realistic theory (e.g., QED and QCD) is the following:

1) Obtain the renormalized hamiltonian using the similarity transformation and coupling-coherence,
truncating the hamiltonian at a given order in powers of $\alpha_{\lambda,i}$.

2) Fix the coupling $\alpha_\lambda$ by fitting an observable (e.g., a bound-state energy).

3) Evaluate other observables (e.g., scattering phase shifts).

As pointed out before, the evaluation of scattering observables with the similarity hamiltonian
with standard techniques is complicated and so in our examples we focus on the bound state
errors. We fix the coupling at some scale using a given renormalization prescription and use the
flow-equation to obtain the coupling as a function of the cutoff $\lambda$ to a given order. We then
perform a sequence of bound-state calculations with better approximations for the hamiltonian
such that the errors in the bound-state energy are systematically reduced. Once the sources of
errors are identified, it becomes relatively simple to analyze order-by-order how such errors scale
with the cutoff $\lambda$. In principle, to completely eliminate the errors proportional to some power
$m$ in the momentum expansion we should use the similarity hamiltonian with the exact running
coupling (renormalized to all orders) and include the contributions up to $O(p^m/\lambda^m)$ coming from
all effective interactions (all orders in $\alpha_\lambda$). Some details of this scaling analysis are presented
later for the specific examples we work out. We emphasize again that in a realistic calculation we
would fit the coupling $\alpha_\lambda$ to an observable. This nonperturbative renormalization eliminates the
dominant source of errors we display in SRG calculations in this paper. We choose to renormalize
the coupling perturbatively in this paper because the only observable we compute is the single
bound state energy of a delta-function potential, and fitting this energy would prevent us from
displaying errors.


3 Two-Dimensional Delta-function Potential 

We now consider the case of two nonrelativistic particles in two dimensions interacting via an
attractive Dirac delta-function potential. The Schrödinger equation for relative motion in position
space (with $\hbar = 1$), can be written as:

$$ - \nabla^2 \Psi(\mathbf{r}) - \alpha_0 \, \delta^{(2)}(\mathbf{r}) \, \Psi(\mathbf{r}) = E \, \Psi(\mathbf{r}) . \tag{42} $$

Both the delta-function potential in two dimensions and the kinetic energy operator scale as $1/r^2$;
therefore, the coupling $\alpha_0$ is dimensionless. As a consequence, the hamiltonian is scale invariant
(i.e., there is no intrinsic energy scale) and we can anticipate the presence of logarithmic ultraviolet
divergences, analogous to those appearing in QED and QCD. The problem requires renormalization.
In this subsection we present the standard method that produces an exact solution
analytically, using simple regularization and renormalization schemes [18].

3.1 Exact Solution 

We start with the Schrödinger equation in momentum space,

$$ p^2 \, \Phi(\mathbf{p}) - \frac{\alpha_0}{(2\pi)^2} \int d^2q \, \Phi(\mathbf{q}) = E \, \Phi(\mathbf{p}) , \tag{43} $$

where $\Phi(\mathbf{p})$ is the Fourier transform of the position space wave-function,

$$ \Phi(\mathbf{p}) = \frac{1}{2\pi} \int d^2r \, \Psi(\mathbf{r}) \, e^{-i \mathbf{p} \cdot \mathbf{r}} . \tag{44} $$

As a consequence of scale invariance, if there is any negative energy solution to Eq. (43) then it
will admit solutions for any $E < 0$. This corresponds to a continuum of bound states with energies
extending down to $-\infty$, so the system is not bounded from below. By rearranging the terms in
the Schrödinger equation we obtain

$$ \Phi(\mathbf{p}) = \frac{\alpha_0}{2\pi} \, \frac{\Psi(0)}{(p^2 + E_0)} , \tag{45} $$

where $\Psi(0)$ is the position space wave-function at the origin and $E_0 > 0$ is the binding energy.

To obtain the eigenvalue condition for the binding energy, we can integrate both sides of
Eq. (45):

$$ 1 = \frac{\alpha_0}{2\pi} \int_0^\infty dp \, p \, \frac{1}{(p^2 + E_0)} . \tag{46} $$

The integral on the r.h.s. diverges logarithmically, so the problem is ill-defined.

The conventional way to deal with this problem is renormalization. First, we regulate the
integral with a momentum cutoff, obtaining

$$ 1 = \frac{\alpha_0}{2\pi} \int_0^\Lambda dp \, p \, \frac{1}{(p^2 + E_0)} = \frac{\alpha_0}{4\pi} \ln\left( 1 + \frac{\Lambda^2}{E_0} \right) , \tag{47} $$

so that

$$ E_0 = \frac{\Lambda^2}{e^{4\pi/\alpha_0} - 1} . \tag{48} $$

Clearly, if the coupling $\alpha_0$ is fixed then $E_0 \to \infty$ as $\Lambda \to \infty$. In order to eliminate the divergence
and produce a finite, well-defined bound state we can renormalize the theory by demanding that
the coupling runs with the cutoff $\Lambda$ in such a way that the binding energy remains fixed as the
cutoff is removed:

$$ \alpha_0 \to \alpha_\Lambda = \frac{4\pi}{\ln\left( 1 + \frac{\Lambda^2}{E_0} \right)} . \tag{49} $$

The dimensionless renormalized running coupling $\alpha_\Lambda$ that characterizes the strength of the
interaction is therefore replaced by a new (dimensionful) parameter $E_0 > 0$, the binding energy
of the system. This is a simple example of dimensional transmutation [38]: even though the
original "bare" hamiltonian is scale invariant, the renormalization procedure leads to a scale that
characterizes the physical observables. Note that $E_0$ can be chosen arbitrarily, fixing the energy
scale of the underlying (renormalized) theory. It is also interesting to note that the renormalized
running coupling $\alpha_\Lambda$ vanishes as $\Lambda \to \infty$ and so the theory is asymptotically free.

This renormalized hamiltonian can be used to compute other observables. The usual prescription
for the calculations is to obtain the solutions with the cutoff in place and then take the limit
as the momentum cutoff is removed to $\infty$. If an exact calculation can be implemented, the final
results should be independent of the regularization and renormalization schemes. As an example,
we calculate the scattering wave function.
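
The interplay between the fixed-coupling divergence and the running coupling can be checked with a few lines of arithmetic. The sketch below assumes the closed forms E0 = Λ²/(exp(4π/α) − 1) for the regulated binding energy and α_Λ = 4π/ln(1 + Λ²/E0) for the running coupling; the function names and the sample cutoffs are chosen here purely for illustration.

```python
import math

def binding_energy(alpha, cutoff):
    """Regulated binding energy: E0 = cutoff**2 / (exp(4*pi/alpha) - 1)."""
    return cutoff ** 2 / math.expm1(4.0 * math.pi / alpha)

def running_coupling(cutoff, e0):
    """Running coupling that holds E0 fixed: alpha = 4*pi / ln(1 + cutoff**2/e0)."""
    return 4.0 * math.pi / math.log1p(cutoff ** 2 / e0)

E0 = 1.0  # chosen binding energy; it sets the (arbitrary) energy scale

# Fixed coupling: the binding energy grows like cutoff**2.
fixed = [binding_energy(1.0, lam) for lam in (10.0, 100.0, 1000.0)]

# Running coupling: the binding energy stays at E0 while alpha decreases.
cutoffs = (10.0, 1.0e3, 1.0e6)
alphas = [running_coupling(lam, E0) for lam in cutoffs]
recovered = [binding_energy(a, lam) for a, lam in zip(alphas, cutoffs)]
```

Each entry of `fixed` is a hundred times the previous one, while `recovered` stays at `E0` for every cutoff and `alphas` decreases slowly (logarithmically) toward zero, which is the asymptotic freedom noted above.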


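$$ \Phi_k(\mathbf{p}) = \delta^{(2)}(\mathbf{p} - \mathbf{k}) + \frac{\alpha_\Lambda}{2\pi} \, \frac{\Psi(0)}{(p^2 - k^2 - i\epsilon)} , \tag{50} $$

where $k = \sqrt{E}$. Integrating both sides over $\mathbf{p}$ with a cutoff $\Lambda$ in place, we obtain

$$ \Psi(0) = \frac{1}{2\pi} \left[ 1 - \frac{\alpha_\Lambda}{4\pi} \ln\left( 1 + \frac{\Lambda^2}{-k^2 - i\epsilon} \right) \right]^{-1} ; \tag{51} $$

thus,

$$ \alpha_\Lambda \Psi(0) = \frac{1}{2\pi} \left[ \frac{1}{4\pi} \ln\left( 1 + \frac{\Lambda^2}{E_0} \right)
   - \frac{1}{4\pi} \ln\left( 1 + \frac{\Lambda^2}{-k^2 - i\epsilon} \right) \right]^{-1} . \tag{52} $$

In the limit $\Lambda \to \infty$ we obtain:

$$ \alpha_\Lambda \Psi(0) = \frac{1}{2\pi} \left[ \frac{1}{4\pi} \ln\left( \frac{k^2}{E_0} \right) - \frac{i}{4} \right]^{-1} . \tag{53} $$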





The resulting scattering wave function is then given by 




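$$ \Phi_k(\mathbf{p}) = \delta^{(2)}(\mathbf{p} - \mathbf{k}) + \frac{1}{2\pi\,(p^2 - k^2 - i\epsilon)} \left[ \frac{1}{2} \ln\left(\frac{k^2}{E_0}\right) - \frac{i\pi}{2} \right]^{-1} . \tag{54} $$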


It is important to note that only S-wave scattering occurs, corresponding to zero angular 
momentum states. For the higher waves the centrifugal barrier completely screens the delta- 
function potential and the non-zero angular momentum scattering states are free states. 

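The same prescription can be used to evaluate the T-matrix or the K-matrix. For the T-matrix,
the Lippmann-Schwinger equation with the renormalized potential is given by:

$$ T(\mathbf{p},\mathbf{p}';k) = V(\mathbf{p},\mathbf{p}') + \int d^2q\, \frac{V(\mathbf{p},\mathbf{q})}{k^2 - q^2 + i\epsilon}\, T(\mathbf{q},\mathbf{p}';k) . \tag{55} $$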

Since only S-wave scattering takes place we can integrate out the angular variable, obtaining 


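$$ T^{(l=0)}(p,p';k) = V^{(l=0)}(p,p') + \int_0^{\Lambda} dq\, q\, \frac{V^{(l=0)}(p,q)}{k^2 - q^2 + i\epsilon}\, T^{(l=0)}(q,p';k) , \tag{56} $$

where

$$ V^{(l=0)}(p,p') = -\frac{\alpha_\Lambda}{2\pi} . \tag{57} $$

The Lippmann-Schwinger equation for the “on-shell” T-matrix is given by:

$$ T^{(l=0)}(k) = -\frac{\alpha_\Lambda}{2\pi} - \frac{\alpha_\Lambda}{2\pi}\, T^{(l=0)}(k) \int_0^{\Lambda} dq\, q\, \frac{1}{k^2 - q^2 + i\epsilon} . \tag{58} $$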


Solving this equation and taking the limit $\Lambda \to \infty$, we obtain the exact “on-shell” T-matrix:

$$ T_0(k) = -\frac{2}{\ln\left(k^2/E_0\right) - i\pi} . \tag{59} $$


Here and in what follows we drop the superscript and use the subscript 0 to denote the exact 
quantities. 

In the same way, the S-wave Lippmann-Schwinger equation for the K-matrix is given by 



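$$ K(p,p';k) = V(p,p') + \mathcal{P}\!\int_0^{\Lambda} dq\, q\, \frac{V(p,q)}{k^2 - q^2}\, K(q,p';k) , \tag{60} $$

and the exact “on-shell” K-matrix is given by

$$ K_0(k) = -\frac{2}{\ln\left(k^2/E_0\right)} . \tag{61} $$

The “on-shell” K-matrix and T-matrix are related by

$$ K_0(k) = \frac{T_0(k)}{1 - \frac{i\pi}{2}\, T_0(k)} . \tag{62} $$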

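Using either

$$ k \cot\delta_0 - i k = -\frac{2k}{\pi}\, \frac{1}{T_0(k)} \tag{63} $$

or

$$ k \cot\delta_0 = -\frac{2k}{\pi}\, \frac{1}{K_0(k)} \tag{64} $$

we can obtain the exact phase-shifts:

$$ \cot\delta_0 = \frac{1}{\pi} \ln\left(\frac{k^2}{E_0}\right) . \tag{65} $$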
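
The relations among the on-shell T-matrix, the K-matrix and the phase shift can be cross-checked numerically. This is a minimal sketch that assumes the closed forms $T_0(k) = -2/[\ln(k^2/E_0) - i\pi]$, $K_0 = T_0/(1 - \tfrac{i\pi}{2}T_0)$ and $k\cot\delta_0 - ik = -(2k/\pi)/T_0$; the function names are hypothetical.

```python
import math

def t_matrix(k, e0):
    """Assumed exact on-shell T-matrix: -2 / (ln(k^2/e0) - i*pi)."""
    return -2.0 / (math.log(k * k / e0) - 1j * math.pi)

def k_matrix_from_t(t):
    """K-matrix obtained from the T-matrix through K = T / (1 - (i*pi/2) T)."""
    return t / (1.0 - 0.5j * math.pi * t)

def cot_phase_shift(k, e0):
    """cot(delta_0) from k cot(delta_0) - i k = -(2k/pi) / T_0(k)."""
    return ((-2.0 * k / math.pi) / t_matrix(k, e0) + 1j * k) / k

if __name__ == "__main__":
    k, e0 = 2.0, 1.0
    print(k_matrix_from_t(t_matrix(k, e0)))  # real up to rounding
    print(cot_phase_shift(k, e0))            # equals ln(k^2/e0)/pi up to rounding
```

The K-matrix comes out real and equal to $-2/\ln(k^2/E_0)$, and the phase shift depends on $k$ only through $\ln(k^2/E_0)$, which is the scale introduced by dimensional transmutation.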





3.2 Similarity Renormalization Group Approach

In the two-dimensional case the canonical hamiltonian in momentum space with a delta-function 
potential can be written as 

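$$ H(\mathbf{p},\mathbf{p}') = h(\mathbf{p},\mathbf{p}') + V(\mathbf{p},\mathbf{p}') , \tag{66} $$

where $h(\mathbf{p},\mathbf{p}') = p^2\, \delta^{(2)}(\mathbf{p} - \mathbf{p}')$ corresponds to the free hamiltonian and $V(\mathbf{p},\mathbf{p}') = -\alpha_0/(2\pi)^2$
corresponds to the Fourier transform of the delta-function potential.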

Integrating out the angular variable, the flow equation obtained with Wegner’s transformation 
in terms of matrix elements in the basis of free states is given by 

$$ \frac{dV_s(p,p')}{ds} = -(p^2 - p'^2)^2\, V_s(p,p') - \int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, V_s(p,k)\, V_s(k,p') . \tag{67} $$

In principle, we can set the boundary condition at $s = 0$ (no cutoff), i.e.,

$$ H_{s=0}(p,p') = H(p,p') = p^2\, \delta^{(1)}(p - p') - \frac{\alpha_0}{2\pi} . \tag{68} $$

However, the hamiltonian with no cutoff produces logarithmic divergences, requiring renormaliza- 
tion. As we will see, the boundary condition must be imposed at some other point, leading to 
dimensional transmutation [38]. The reduced interaction $\bar{V}_s(p,p')$ is defined such that

$$ V_s(p,p') = e^{-s(p^2 - p'^2)^2}\, \bar{V}_s(p,p') . \tag{69} $$

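Assuming that $h$ is cutoff independent we obtain the flow equation for the reduced interaction,

$$ \frac{d\bar{V}_s(p,p')}{ds} = -\int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]}\, \bar{V}_s(p,k)\, \bar{V}_s(k,p') . \tag{70} $$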


This equation is solved using a perturbative expansion, starting with 

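$$ \bar{V}_s^{(1)}(p,p') = -\frac{\alpha_s}{2\pi} . \tag{71} $$

We assume a coupling-coherent solution in the form of an expansion in powers of $\alpha_s/2\pi$, satisfying
the constraint that the operators $F_s^{(n)}(p,p')$ vanish when $p = p' = 0$,

$$ \bar{V}_s(p,p') = -\frac{\alpha_s}{2\pi} + \sum_{n=2}^{\infty} \left(\frac{\alpha_s}{2\pi}\right)^n F_s^{(n)}(p,p') . \tag{72} $$

Note that the expansion parameter is $\alpha_s/2\pi$.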

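Using the solution Eq. (72) in Eq. (70) we obtain

$$ -\frac{1}{2\pi}\frac{d\alpha_s}{ds} + \sum_{n=2}^{\infty} \frac{1}{(2\pi)^n} \left[ n\, \alpha_s^{n-1}\, \frac{d\alpha_s}{ds}\, F_s^{(n)}(p,p') + \alpha_s^n\, \frac{dF_s^{(n)}(p,p')}{ds} \right] $$
$$ = -\int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} $$
$$ \times \left[ -\frac{\alpha_s}{2\pi} + \sum_{n=2}^{\infty} \left(\frac{\alpha_s}{2\pi}\right)^n F_s^{(n)}(p,k) \right] \left[ -\frac{\alpha_s}{2\pi} + \sum_{m=2}^{\infty} \left(\frac{\alpha_s}{2\pi}\right)^m F_s^{(m)}(k,p') \right] . \tag{73} $$
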


This equation is solved iteratively order-by-order in $\alpha_s/2\pi$. Again, if $\alpha_s/2\pi$ is small the operator
$\bar{V}_s^{(1)}(p,p')$ can be identified as the dominant term in the expansion of $\bar{V}_s(p,p')$ in powers of $p$
and $p'$. In the $D = 2$ case this operator corresponds to a marginal operator (since the coupling is
dimensionless and there is no implicit mass scale). The higher-order terms correspond to irrelevant
operators.





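At second-order we have

$$ -\frac{1}{2\pi}\frac{d\alpha_s}{ds} + \frac{1}{(2\pi)^2}\, \alpha_s^2\, \frac{dF_s^{(2)}(p,p')}{ds} = -\alpha_s^2\, I_s^{(2)}(p,p') , \tag{74} $$

where

$$ I_s^{(2)}(p,p') = \frac{1}{(2\pi)^2} \int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} = \frac{1}{(2\pi)^2}\, \frac{e^{-2s\, p^2 p'^2}}{4s} . \tag{75} $$

The equation for $\alpha_s$ is obtained by taking the limit $(p,p') \to 0$,

$$ \frac{1}{2\pi}\frac{d\alpha_s}{ds} = \alpha_s^2\, I_s^{(2)}(0,0) , \tag{76} $$

where

$$ I_s^{(2)}(0,0) = \frac{1}{(2\pi)^2}\, \frac{1}{4s} . \tag{77} $$

Integrating Eq. (76) from $s_0$ to $s$,

$$ \alpha_{s,2} = \frac{\alpha_{s_0}}{1 - \frac{\alpha_{s_0}}{8\pi} \ln\left(\frac{s}{s_0}\right)} . \tag{78} $$

In terms of the cutoff $\lambda$ we obtain

$$ \alpha_{\lambda,2} = \frac{\alpha_{\lambda_0}}{1 + \frac{\alpha_{\lambda_0}}{2\pi} \ln\left(\frac{\lambda}{\lambda_0}\right)} . \tag{79} $$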
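
The closed form of the second-order kernel can be verified by direct quadrature. The sketch below assumes that the $k$ integral in Eq. (75) is $\int_0^\infty dk\,k\,(2k^2-p^2-p'^2)\,e^{-2s[p^2p'^2+k^4-k^2(p^2+p'^2)]} = e^{-2sp^2p'^2}/(4s)$; the $1/(2\pi)^2$ prefactor is left out, and the truncation point `k_max` and the function names are assumptions of this illustration.

```python
import math

def kernel_closed_form(s, p, pp):
    """Assumed closed form exp(-2 s p^2 p'^2) / (4 s), without the 1/(2 pi)^2 prefactor."""
    return math.exp(-2.0 * s * p**2 * pp**2) / (4.0 * s)

def kernel_quadrature(s, p, pp, k_max=6.0, steps=200000):
    """Midpoint-rule estimate of the k integral, truncated at k_max where the integrand is negligible."""
    a = p**2 + pp**2
    b = p**2 * pp**2
    h = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k * (2.0 * k**2 - a) * math.exp(-2.0 * s * (b + k**4 - a * k**2))
    return total * h

if __name__ == "__main__":
    s, p, pp = 1.0, 0.7, 1.2
    print(kernel_closed_form(s, p, pp), kernel_quadrature(s, p, pp))
```

The agreement follows from the substitution $u = k^4 - k^2(p^2+p'^2)$, for which $du = 2k(2k^2 - p^2 - p'^2)\,dk$, so the integrand is a total derivative.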


In principle, knowing the value of $\alpha_{s_0}$ for a given $s_0$ we can determine the running coupling $\alpha_s$ for
any $s$. Since we cannot choose $s_0 = 0$ ($\lambda_0 = \infty$), to use Eq. (78) we must specify a renormalization
prescription that allows us to fix the coupling at some finite non-zero value of $s_0$. We discuss this
issue in detail later in this subsection.
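
As a cross-check of the second-order running coupling, the flow equation can be integrated numerically and compared with the closed form. This sketch assumes $d\alpha_s/ds = \alpha_s^2/(8\pi s)$, the relation $s = 1/\lambda^4$ between the flow parameter and the similarity cutoff, and $\alpha_{\lambda,2} = \alpha_{\lambda_0}/[1 + (\alpha_{\lambda_0}/2\pi)\ln(\lambda/\lambda_0)]$; the function names are hypothetical.

```python
import math

def alpha_second_order(lam, lam0, alpha0):
    """Assumed closed-form second-order running coupling as a function of the cutoff."""
    return alpha0 / (1.0 + alpha0 / (2.0 * math.pi) * math.log(lam / lam0))

def alpha_rk4(s, s0, alpha0, steps=2000):
    """Integrate d(alpha)/ds = alpha^2 / (8 pi s) from s0 to s with RK4 in the variable t = ln s."""
    def rhs(a):
        # In t = ln s the equation is autonomous: d(alpha)/dt = alpha^2 / (8 pi).
        return a * a / (8.0 * math.pi)
    h = (math.log(s) - math.log(s0)) / steps
    a = alpha0
    for _ in range(steps):
        k1 = rhs(a)
        k2 = rhs(a + 0.5 * h * k1)
        k3 = rhs(a + 0.5 * h * k2)
        k4 = rhs(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

if __name__ == "__main__":
    lam0, alpha0 = 100.0, 1.45
    for lam in (50.0, 20.0, 10.0):
        print(lam, alpha_second_order(lam, lam0, alpha0),
              alpha_rk4(lam**-4, lam0**-4, alpha0))
```

The closed form has a pole at $\ln(\lambda/\lambda_0) = -2\pi/\alpha_{\lambda_0}$, so the comparison is only meaningful for cutoffs above that point.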

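The equation for $F_s^{(2)}(p,p')$ is given by

$$ \frac{dF_s^{(2)}(p,p')}{ds} = (2\pi)^2 \left[ I_s^{(2)}(0,0) - I_s^{(2)}(p,p') \right] = \frac{1}{4s} \left[ 1 - e^{-2s\, p^2 p'^2} \right] . \tag{80} $$

Integrating from $s_0$ to $s$ we obtain

$$ F_s^{(2)}(p,p') = \int_{s_0}^{s} ds'\, \frac{1}{4s'} \left[ 1 - e^{-2s' p^2 p'^2} \right] = \frac{1}{4} \left[ \ln\left(\frac{s}{s_0}\right) - \mathrm{Ei}(-2s\, p^2 p'^2) + \mathrm{Ei}(-2s_0\, p^2 p'^2) \right] . \tag{81} $$

Insisting that $F_s^{(2)}(p,p') = 0$ when $p, p' = 0$ we obtain

$$ F_s^{(2)}(p,p') = \frac{1}{4} \left[ \gamma + \ln(2s\, p^2 p'^2) - \mathrm{Ei}(-2s\, p^2 p'^2) \right] . \tag{82} $$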
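
The second-order irrelevant operator can be evaluated without a special-function library. The sketch below assumes $F_s^{(2)}(p,p') = \tfrac14[\gamma + \ln x - \mathrm{Ei}(-x)]$ with $x = 2s\,p^2p'^2$, and uses the identity $\gamma + \ln x - \mathrm{Ei}(-x) = \sum_{n\ge1}(-1)^{n+1}x^n/(n\,n!)$, which is practical only for moderate $x$. It is compared with a direct quadrature of $\int_0^s ds'\,[1-e^{-2s'p^2p'^2}]/(4s')$; the function names are hypothetical.

```python
import math

def ein_series(x, terms=200):
    """gamma + ln(x) - Ei(-x) from its power series; accurate for moderate x (roughly x < 15)."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= -x / n       # term now holds (-x)^n / n!
        total -= term / n    # adds (-1)^(n+1) x^n / (n n!)
    return total

def f2_closed_form(s, p, pp):
    """Assumed closed form of the second-order irrelevant operator."""
    return 0.25 * ein_series(2.0 * s * p**2 * pp**2)

def f2_quadrature(s, p, pp, steps=20000):
    """Midpoint-rule integral of (1 - exp(-2 s' p^2 p'^2)) / (4 s') from 0 to s."""
    b = p**2 * pp**2
    h = s / steps
    total = 0.0
    for i in range(steps):
        sp = (i + 0.5) * h
        total += -math.expm1(-2.0 * sp * b) / (4.0 * sp)
    return total * h

if __name__ == "__main__":
    print(f2_closed_form(1.0, 1.0, 1.5), f2_quadrature(1.0, 1.0, 1.5))
    print(f2_closed_form(1.0, 0.0, 1.5))  # vanishes when either momentum is zero
```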


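At third-order we have

$$ -\frac{1}{2\pi}\frac{d\alpha_s}{ds} + \frac{1}{(2\pi)^2} \left[ \alpha_s^2\, \frac{dF_s^{(2)}(p,p')}{ds} + 2\alpha_s\, \frac{d\alpha_s}{ds}\, F_s^{(2)}(p,p') \right] + \frac{1}{(2\pi)^3}\, \alpha_s^3\, \frac{dF_s^{(3)}(p,p')}{ds} $$
$$ = -\alpha_s^2\, I_s^{(2)}(p,p') + \alpha_s^3\, I_s^{(3)}(p,p') , \tag{83} $$
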





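where

$$ I_s^{(3)}(p,p') = \frac{1}{(2\pi)^3} \int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} \left[ F_s^{(2)}(p,k) + F_s^{(2)}(k,p') \right] . \tag{84} $$

In the limit $p, p' \to 0$ we obtain:

$$ \frac{1}{2\pi}\frac{d\alpha_s}{ds} = \alpha_s^2\, I_s^{(2)}(0,0) - \alpha_s^3\, I_s^{(3)}(0,0) , \tag{85} $$

where

$$ I_s^{(3)}(0,0) = \frac{1}{(2\pi)^3} \int_0^{\infty} dk\, k\, 2k^2\, e^{-2s k^4} \left[ F_s^{(2)}(0,k) + F_s^{(2)}(k,0) \right] . \tag{86} $$

Since $F_s^{(2)}(k,0) = F_s^{(2)}(0,k) = 0$, the term proportional to $\alpha_s^3$ in Eq. (85) is zero.
The equation for $F_s^{(3)}(p,p')$ is given by

$$ \frac{1}{(2\pi)^3}\, \frac{dF_s^{(3)}(p,p')}{ds} = -\frac{1}{\pi}\, I_s^{(2)}(0,0)\, F_s^{(2)}(p,p') + I_s^{(3)}(p,p') . \tag{87} $$

To obtain $F_s^{(3)}(p,p')$ the integrals in $k$ and $s$ must be evaluated numerically.

At fourth-order we obtain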


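$$ -\frac{1}{2\pi}\frac{d\alpha_s}{ds} + \frac{1}{(2\pi)^2} \left[ \alpha_s^2\, \frac{dF_s^{(2)}(p,p')}{ds} + 2\alpha_s\, \frac{d\alpha_s}{ds}\, F_s^{(2)}(p,p') \right] $$
$$ + \frac{1}{(2\pi)^3} \left[ \alpha_s^3\, \frac{dF_s^{(3)}(p,p')}{ds} + 3\alpha_s^2\, \frac{d\alpha_s}{ds}\, F_s^{(3)}(p,p') \right] + \frac{1}{(2\pi)^4}\, \alpha_s^4\, \frac{dF_s^{(4)}(p,p')}{ds} $$
$$ = -\alpha_s^2\, I_s^{(2)}(p,p') + \alpha_s^3\, I_s^{(3)}(p,p') + \alpha_s^4\, I_s^{(4)}(p,p') , \tag{88} $$

where

$$ I_s^{(4)}(p,p') = \frac{1}{(2\pi)^4} \int_0^{\infty} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} $$
$$ \times \left[ F_s^{(3)}(p,k) + F_s^{(3)}(k,p') + F_s^{(2)}(p,k)\, F_s^{(2)}(k,p') \right] . \tag{89} $$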

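In the limit $p, p' \to 0$ we obtain:

$$ \frac{1}{2\pi}\frac{d\alpha_s}{ds} = \alpha_s^2\, I_s^{(2)}(0,0) - \alpha_s^4\, I_s^{(4)}(0,0) , \tag{90} $$

where

$$ I_s^{(4)}(0,0) = \frac{1}{(2\pi)^4} \int_0^{\infty} dk\, k\, 2k^2\, e^{-2s k^4} \left[ F_s^{(3)}(0,k) + F_s^{(3)}(k,0) \right] . \tag{91} $$

For dimensional reasons Eq. (90) takes the form

$$ \frac{d\alpha_s}{ds} = \frac{B_2}{s}\, \alpha_s^2 - \frac{B_4}{s}\, \alpha_s^4 , \tag{92} $$

where $B_2 = 1/(8\pi)$ and $B_4$ can be obtained by evaluating $I_s^{(4)}(0,0)$ (numerically) for $s = 1$. In terms
of the cutoff $\lambda$ we obtain:

$$ \frac{d\alpha_\lambda}{d\lambda} = -\frac{4}{\lambda} \left( B_2\, \alpha_\lambda^2 - B_4\, \alpha_\lambda^4 \right) . \tag{93} $$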





Integration of Eq. (90) leads to a transcendental equation which is solved numerically in order to
obtain the running coupling $\alpha_{s,4}$.
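
A numerical solution of the fourth-order flow can be sketched as follows. The example assumes the form $d\alpha_\lambda/d\ln\lambda = -4(B_2\alpha_\lambda^2 - B_4\alpha_\lambda^4)$ with $B_2 = 1/(8\pi)$. The value of $B_4$ used here is a placeholder chosen purely for illustration, not the number obtained from $I^{(4)}_{s=1}(0,0)$, and the function names are hypothetical.

```python
import math

B2 = 1.0 / (8.0 * math.pi)  # second-order coefficient implied by the second-order flow

def run_coupling(lam, lam0, alpha0, b4=0.0, steps=4000):
    """RK4 integration of d(alpha)/d(ln lambda) = -4 (B2 alpha^2 - b4 alpha^4) from lam0 to lam."""
    def rhs(a):
        return -4.0 * (B2 * a**2 - b4 * a**4)
    h = math.log(lam / lam0) / steps
    a = alpha0
    for _ in range(steps):
        k1 = rhs(a)
        k2 = rhs(a + 0.5 * h * k1)
        k3 = rhs(a + 0.5 * h * k2)
        k4 = rhs(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

def alpha_second_order(lam, lam0, alpha0):
    """Closed-form second-order coupling, recovered when b4 = 0."""
    return alpha0 / (1.0 + alpha0 / (2.0 * math.pi) * math.log(lam / lam0))

if __name__ == "__main__":
    b4_placeholder = 1.0e-3  # hypothetical value, for illustration only
    for lam in (50.0, 10.0, 1.0, 0.1, 0.01):
        print(lam, run_coupling(lam, 100.0, 1.45, b4=b4_placeholder))
```

For a positive $B_4$ the truncated equation has a fixed point at $\alpha^* = \sqrt{B_2/B_4}$, so the coupling stops growing at small cutoffs instead of running into the second-order pole.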

Qualitatively, the errors are expected to be a combination of inverse powers of $\lambda$ and powers (or
inverse powers) of $\ln(\lambda)$ coming from the perturbative expansion in powers of $\alpha_\lambda$ for the coefficients
of the irrelevant operators and the perturbative approximation for $\alpha_\lambda$.

As pointed out above, to completely determine the renormalized hamiltonian at a given order
we need to specify the coupling at some scale $\lambda_0$. The simplest way is to choose a value for the
‘exact’ $\alpha_{\lambda_0}$. Formally, this fixes the underlying theory; i.e., if we had the exact hamiltonian (to
all orders) we could obtain the exact values for all observables. However, since the hamiltonian is
derived perturbatively and must be truncated at some order in practical calculations, we can only
obtain approximate cutoff-dependent results for the observables. Moreover, in this case the errors
cannot be directly evaluated, since the exact values remain unknown. As an example, we calculate
the bound-state energy choosing $\alpha_{\lambda_0} = 1.45$ at $\lambda_0 = 100$. In Fig. 2 we show the binding-energy as
a function of the cutoff $\lambda$ using the following approximations for the interaction:


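(a) marginal operator with coupling ($\alpha_{\lambda_0}$),

$$ V_\lambda(p,p') = -\frac{\alpha_{\lambda_0}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{94} $$

(b) marginal operator with running coupling renormalized to second-order ($\alpha_{\lambda,2}$),

$$ V_\lambda(p,p') = -\frac{\alpha_{\lambda,2}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{95} $$

(c) marginal operator plus second-order irrelevant operator with running coupling renormalized to
second-order ($\alpha_{\lambda,2}$, $F_\lambda^{(2)}$),

$$ V_\lambda(p,p') = \left[ -\frac{\alpha_{\lambda,2}}{2\pi} + \left(\frac{\alpha_{\lambda,2}}{2\pi}\right)^2 F_\lambda^{(2)}(p,p') \right] e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{96} $$

(d) marginal operator with running coupling renormalized to fourth-order ($\alpha_{\lambda,4}$),

$$ V_\lambda(p,p') = -\frac{\alpha_{\lambda,4}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{97} $$

(e) marginal operator plus second-order irrelevant operator with running coupling renormalized to
fourth-order ($\alpha_{\lambda,4}$, $F_\lambda^{(2)}$),

$$ V_\lambda(p,p') = \left[ -\frac{\alpha_{\lambda,4}}{2\pi} + \left(\frac{\alpha_{\lambda,4}}{2\pi}\right)^2 F_\lambda^{(2)}(p,p') \right] e^{-(p^2 - p'^2)^2/\lambda^4} . \tag{98} $$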


We see that as the approximation is improved the cutoff dependence is reduced. As $\lambda \to \infty$ all
curves should approach the same binding-energy, which corresponds to the exact value, and as $\lambda$
becomes small the perturbative approximation breaks down.

A similar prescription is to find $\alpha_{\lambda_0}$ at $\lambda_0$ that produces a given binding-energy, $E_0$. Since the
fitting is implemented using a truncated hamiltonian, this $\alpha_{\lambda_0}$ is an approximation that becomes
more accurate if we use a larger $\lambda_0$ and/or include higher order operators. Although in this case
we can evaluate the errors, the scaling analysis becomes complicated as $\lambda \to \lambda_0$ because at this
point we force the energy to be the exact value and so the error is zero.

As an example, we calculate the bound-state energy when the coupling is fixed at $\lambda_0 = 100$ to
give $E_0 = 1$. In Fig. 3 we show the ‘errors’ in the binding energy using the same approximations
listed above for the potential. As expected, all error lines drop abruptly to zero when $\lambda \to \lambda_0$, where
the running coupling is chosen to fit what we define to be the exact binding energy. Away from this
point we can analyze the errors. With the hamiltonian (a) (unrenormalized) we obtain a dominant






Figure 2: The binding energy for the two-dimensional delta-function potential with various approximations
for the SRG hamiltonian. The exact theory is fixed by choosing $\alpha_{\lambda_0} = 1.45$ at $\lambda_0 = 100$.


error that scales like $\ln(\lambda_0/\lambda)$, corresponding to the leading order error. With the renormalized
hamiltonian (b) the dominant errors scale like [ln(Ao/A)]“^, indicating the elimination of the 
leading order logarithmic errors. With the hamiltonian (c) there is a small shift, but no significant 
change in the error scaling. The added irrelevant operator may remove errors of order (Aq/A)^, 
but these are smaller than the remaining [In (Aq/A)]""^ errors. With hamiltonians (d) and (e) in 
the range of intermediate cutoffs ($E_0 \ll \lambda^2 \ll \lambda_0^2$) there is also only a shift in the errors. The
dips in (d) and (e) correspond to values of $\lambda$ where the binding energy equals the exact value.

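This behavior is a perturbative artifact that can be understood in the following way. Consider
the Schrodinger equation with potential (d). Rescaling the momenta $p \to \lambda p$ we obtain

$$ p^2\, \Phi(p) - \frac{\alpha_{\lambda,4}}{2\pi} \int_0^{\infty} dq\, q\, e^{-(p^2 - q^2)^2}\, \Phi(q) = \frac{E_\lambda}{\lambda^2}\, \Phi(p) , \tag{99} $$

and

$$ E_\lambda = \lambda^2\, \frac{ \int_0^{\infty} dp\, p\, p^2\, |\Phi(p)|^2 - \frac{\alpha_{\lambda,4}}{2\pi} \int_0^{\infty} dp\, p \int_0^{\infty} dq\, q\, e^{-(p^2 - q^2)^2}\, \Phi(p)\, \Phi(q) }{ \int_0^{\infty} dp\, p\, |\Phi(p)|^2 } . \tag{100} $$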


As shown in Fig. 4, the coupling renormalized to fourth-order, $\alpha_{\lambda,4}$, approximately freezes for
small $\lambda$ and as a consequence the bound-state energy scales like $E_\lambda \simeq \lambda^2 \times$ constant, eventually
becoming equal to the exact value and then deviating again. With the hamiltonian (e) the behavior
is similar, with the dip occurring at a different value of $\lambda$ because of the irrelevant operator. For
small values of $\lambda$ the lines converge, indicating the breakdown of the perturbative expansion.

An alternative prescription is to use the potential derived in subsection 5.1 as the starting point 
for the similarity transformation. We introduce a large momentum cutoff $\Lambda$, define

$$ \alpha_{s_0 = 0} = \alpha_\Lambda = \frac{4\pi}{\ln\left(1 + \Lambda^2/E_0\right)} , \tag{101} $$








Figure 3: The SRG errors in the binding energy for the two-dimensional delta-function potential
using various approximations for the SRG hamiltonian. The exact theory is fixed by choosing
$E_0 = 1$ at $\lambda_0 = 100$.



Figure 4: The SRG running coupling for the two-dimensional delta-function potential obtained
with $\alpha_{\lambda_0}$ at $\lambda_0 = 100$ fixed to fit $E_0 = 1$.







and set all irrelevant operators to zero at $s_0 = 0$. Note that the coupling $\alpha_\Lambda$ is fixed at $\lambda_0 = \infty$ by
fitting the exact binding energy. With this definition the similarity hamiltonian with no similarity
cutoff becomes well-defined and we can set all of the similarity transformation boundary conditions
at $s = 0$. The previous derivation remains essentially the same. The only modification is that all
integrals over momentum are cut off ($p \le \Lambda$).

At second-order we have 


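$$ -\frac{1}{2\pi}\frac{d\alpha_{s,\Lambda}}{ds} + \frac{1}{(2\pi)^2}\, \alpha_{s,\Lambda}^2\, \frac{dF_{s,\Lambda}^{(2)}(p,p')}{ds} = -\alpha_{s,\Lambda}^2\, I_{s,\Lambda}^{(2)}(p,p') , \tag{102} $$

where

$$ I_{s,\Lambda}^{(2)}(p,p') = \frac{1}{(2\pi)^2} \int_0^{\Lambda} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} $$
$$ = \frac{1}{(2\pi)^2}\, \frac{1}{4s} \left[ e^{-2s\, p^2 p'^2} - e^{-2s[p^2 p'^2 + \Lambda^4 - \Lambda^2(p^2 + p'^2)]} \right] . \tag{103} $$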


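The resulting second-order running coupling and irrelevant operator are given respectively by

$$ \alpha_{s,\Lambda,2} = \frac{\alpha_\Lambda}{1 - \frac{\alpha_\Lambda}{8\pi} \left[ \gamma + \ln(2s\Lambda^4) - \mathrm{Ei}(-2s\Lambda^4) \right]} \tag{104} $$

and

$$ F_{s,\Lambda}^{(2)}(p,p') = \frac{1}{4} \left[ \gamma + \ln(2s\, p^2 p'^2) - \mathrm{Ei}(-2s\, p^2 p'^2) \right] + \frac{1}{4} \left[ \gamma + \ln(2s\Lambda^4) - \mathrm{Ei}(-2s\Lambda^4) \right] $$
$$ - \frac{1}{4} \left[ \gamma + \ln\left( s \left[ (p^2 - \Lambda^2)^2 + (p'^2 - \Lambda^2)^2 - (p^2 - p'^2)^2 \right] \right) - \mathrm{Ei}\left( -s \left[ (p^2 - \Lambda^2)^2 + (p'^2 - \Lambda^2)^2 - (p^2 - p'^2)^2 \right] \right) \right] . \tag{105} $$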


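At third-order we have

$$ -\frac{1}{2\pi}\frac{d\alpha_{s,\Lambda}}{ds} + \frac{1}{(2\pi)^2} \left[ \alpha_{s,\Lambda}^2\, \frac{dF_{s,\Lambda}^{(2)}(p,p')}{ds} + 2\alpha_{s,\Lambda}\, \frac{d\alpha_{s,\Lambda}}{ds}\, F_{s,\Lambda}^{(2)}(p,p') \right] + \frac{1}{(2\pi)^3}\, \alpha_{s,\Lambda}^3\, \frac{dF_{s,\Lambda}^{(3)}(p,p')}{ds} $$
$$ = -\alpha_{s,\Lambda}^2\, I_{s,\Lambda}^{(2)}(p,p') + \alpha_{s,\Lambda}^3\, I_{s,\Lambda}^{(3)}(p,p') , \tag{106} $$

where

$$ I_{s,\Lambda}^{(3)}(p,p') = \frac{1}{(2\pi)^3} \int_0^{\Lambda} dk\, k\, (2k^2 - p^2 - p'^2)\, e^{-2s[p^2 p'^2 + k^4 - k^2(p^2 + p'^2)]} \left[ F_{s,\Lambda}^{(2)}(p,k) + F_{s,\Lambda}^{(2)}(k,p') \right] . \tag{107} $$

In this case,

$$ F_{s,\Lambda}^{(2)}(0,k) = \frac{1}{4} \left[ \gamma + \ln(2s\Lambda^4) - \mathrm{Ei}(-2s\Lambda^4) \right] - \frac{1}{4} \left[ \gamma + \ln(2s\Lambda^4 - 2s k^2 \Lambda^2) - \mathrm{Ei}(-2s\Lambda^4 + 2s k^2 \Lambda^2) \right] , \tag{108} $$

and so

$$ I_{s,\Lambda}^{(3)}(0,0) = \frac{1}{(2\pi)^3} \int_0^{\Lambda} dk\, k\, 2k^2\, e^{-2s k^4}\, \frac{1}{2} \Big( \left[ \gamma + \ln(2s\Lambda^4) - \mathrm{Ei}(-2s\Lambda^4) \right] $$
$$ - \left[ \gamma + \ln(2s\Lambda^4 - 2s k^2 \Lambda^2) - \mathrm{Ei}(-2s\Lambda^4 + 2s k^2 \Lambda^2) \right] \Big) . \tag{109} $$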





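Since $I_{s,\Lambda}^{(3)}(0,0) \neq 0$ the term proportional to $\alpha_{s,\Lambda}^3$ in Eq. (85) does not vanish. To obtain
$\alpha_{s,\Lambda,3}$ we evaluate $I_{s,\Lambda}^{(3)}(0,0)$ and solve Eq. (85) numerically.

The equation for $F_{s,\Lambda}^{(3)}(p,p')$ is given by

$$ \frac{1}{(2\pi)^3}\, \frac{dF_{s,\Lambda}^{(3)}(p,p')}{ds} = -\frac{1}{\pi}\, I_{s,\Lambda}^{(2)}(0,0)\, F_{s,\Lambda}^{(2)}(p,p') + I_{s,\Lambda}^{(3)}(p,p') . \tag{110} $$

To obtain $F_{s,\Lambda}^{(3)}(p,p')$ the integrals over $k$ and $s$ must be evaluated numerically. In the limit $\Lambda \to \infty$
with $s$ fixed at some non-zero value

$$ F_{s,\Lambda}^{(2)}(p,p') \to \frac{1}{4} \left[ \gamma + \ln(2s\, p^2 p'^2) - \mathrm{Ei}(-2s\, p^2 p'^2) \right] , \tag{111} $$

$$ I_{s,\Lambda}^{(3)}(0,0) \to 0 , \tag{112} $$

and $\alpha_{s,\Lambda}$ becomes indeterminate, requiring the coupling to be fixed at some $s_0 \neq 0$. In this way,
we recover the result of the previous prescription.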

Although less trivial, this prescription allows a more transparent error analysis. We can also
extend the calculation to larger values of the similarity cutoff, $\lambda$, and analyze the errors in a
different scaling regime. In Fig. 5 we show the errors in the binding-energy obtained using the
following approximations for the potential with $\Lambda = 50$:


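(a) marginal operator with coupling ($\alpha_{\lambda_0}$),

$$ V_{\lambda,\Lambda}(p,p') = -\frac{\alpha_{\lambda_0}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{113} $$

(b) marginal operator with running coupling renormalized to second-order ($\alpha_{\lambda,\Lambda,2}$),

$$ V_{\lambda,\Lambda}(p,p') = -\frac{\alpha_{\lambda,\Lambda,2}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{114} $$

(c) marginal operator plus second-order irrelevant operator with running coupling renormalized to
second-order ($\alpha_{\lambda,\Lambda,2}$, $F_{\lambda,\Lambda}^{(2)}$),

$$ V_{\lambda,\Lambda}(p,p') = \left[ -\frac{\alpha_{\lambda,\Lambda,2}}{2\pi} + \left(\frac{\alpha_{\lambda,\Lambda,2}}{2\pi}\right)^2 F_{\lambda,\Lambda}^{(2)}(p,p') \right] e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{115} $$

(d) marginal operator with running coupling renormalized to third-order ($\alpha_{\lambda,\Lambda,3}$),

$$ V_{\lambda,\Lambda}(p,p') = -\frac{\alpha_{\lambda,\Lambda,3}}{2\pi}\, e^{-(p^2 - p'^2)^2/\lambda^4} ; \tag{116} $$

(e) marginal operator plus second-order irrelevant operator with running coupling renormalized to
third-order ($\alpha_{\lambda,\Lambda,3}$, $F_{\lambda,\Lambda}^{(2)}$),

$$ V_{\lambda,\Lambda}(p,p') = \left[ -\frac{\alpha_{\lambda,\Lambda,3}}{2\pi} + \left(\frac{\alpha_{\lambda,\Lambda,3}}{2\pi}\right)^2 F_{\lambda,\Lambda}^{(2)}(p,p') \right] e^{-(p^2 - p'^2)^2/\lambda^4} . \tag{117} $$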


There are clearly two distinct scaling regions when an additional large momentum cutoff $\Lambda$ is
placed on the initial matrix and a similarity cutoff is then applied. When the similarity cutoff is
larger than $\Lambda$, we see power-law improvement resulting from the addition of irrelevant operators.
Curves (a), (b), and (d) all have the same slope. None of these hamiltonians contains irrelevant
operators, but the marginal coupling differs in each. All results become exact as the similarity
cutoff goes to infinity, and these curves are close to one another because the coupling runs little







Figure 5: The SRG errors in the binding energy for the two-dimensional delta-function potential
using various approximations for the similarity hamiltonian. The exact theory is fixed by regulating
the “bare hamiltonian” using a sharp momentum cutoff, $\Lambda$, and letting the bare coupling depend
on $\Lambda$ such that the binding energy is fixed. We use $\Lambda = 50$ and $E_0 = 1$.


in this region. Curves (c) and (e) show that there is a power-law improvement when irrelevant
operators are added, and that once again when the similarity cutoff is larger than $\Lambda$ an improvement
in the running coupling makes little difference. Even though the coupling in front of this operator 
is approximated by the first term in an expansion in powers of the running coupling, the coupling 
is sufficiently small that this approximation works well and the operator eliminates most of the 
leading power-law error in curves (a), (b), and (d). 

When the similarity cutoff becomes smaller than $\Lambda$ we see a crossover to a more complicated
scaling regime that resembles the SRG scaling discussed above. The error displayed by curve
(a) approaches 100%, while the running coupling introduced in curve (b) reduces the error to an 
inverse logarithm. Improving the running coupling in curve (d) further reduces the error, and 
we see that curve (c) crosses curve (d) at a point where improving the running coupling becomes 
more important than adding irrelevant operators. As above, the best results require us to both 
improve the running coupling by adding third-order corrections and add the second-order irrelevant 
operators. In no case do we achieve power-law improvement, because as we have discussed there 
are always residual inverse logarithmic errors. Had we fit the running coupling to data, as we 
would do in a realistic calculation, we would obtain power-law improvement and the residual error 
would be proportional to an inverse power of the cutoff times an inverse power of the logarithm of 
the cutoff. 

In Fig. 6 we show the running coupling at 2nd and 3rd order. Although the 3rd order corrections
are small for all $\lambda$ and vanish when $\lambda \to \infty$, the improvement resulting from this correction in Fig.
5 is significant.

The scaling behavior with a large momentum cutoff $\Lambda$ in place is complicated, but it is fairly
straightforward to understand it and to find a sequence of approximations that systematically 
improve the non-perturbative results. The calculations become increasingly complicated, but at 
each order one must improve the running coupling, or fit it to data, and add higher order irrelevant 




Figure 6: The SRG running coupling for the two-dimensional delta-function potential renormalized
to second and third-order obtained with $\alpha_{\lambda_0 = \infty} = \alpha_\Lambda = 4\pi/\ln\left(1 + \Lambda^2/E_0\right)$.


operators. In a field theory we need to let $\Lambda \to \infty$ and study the scaling behavior of the theory in
the regime where $\lambda \ll \Lambda$. Although we do not display a complete set of figures, in Fig. 7 we show
what happens to the running coupling as $\Lambda$ is increased, with the bound state energy fixed at one.

As is evident in the exact solution, as $\Lambda$ increases the coupling decreases. When $\lambda \gg \Lambda$, the
coupling runs slowly and stays near its asymptotic value. As $\lambda$ approaches $\Lambda$ the coupling begins to
run noticeably, and when $\lambda$ becomes much less than $\Lambda$ the coupling approaches a universal curve
that is insensitive to its asymptotic value. Plots of the error in the binding energy for various
approximations and different values of $\Lambda$ closely resemble Fig. 11, with two scaling regimes whose
boundary is $\lambda = \Lambda$.
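
The approach to a universal curve can be illustrated with the second-order coupling. The sketch assumes $\alpha_{\lambda,\Lambda,2} = \alpha_\Lambda/\{1 - (\alpha_\Lambda/8\pi)[\gamma + \ln x - \mathrm{Ei}(-x)]\}$ with $x = 2\Lambda^4/\lambda^4$ (that is, $s = 1/\lambda^4$) and $\alpha_\Lambda = 4\pi/\ln(1+\Lambda^2/E_0)$. The helper `ein` and the other function names are hypothetical, and the split at $x = 10$ between a series and a quadrature is a choice made for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329

def _ein_series(x, terms=100):
    """gamma + ln(x) - Ei(-x) from its power series; used only for x <= 10."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= -x / n
        total -= term / n
    return total

def ein(x):
    """gamma + ln(x) - Ei(-x) for any x >= 0."""
    if x <= 10.0:
        return _ein_series(x)
    # For x > 10: Ein(x) = Ein(10) + ln(x/10) - int_10^x dt exp(-t)/t (Simpson rule;
    # the integrand is negligible beyond t = 60).
    upper, n = min(x, 60.0), 2000
    h = (upper - 10.0) / n
    f = lambda t: math.exp(-t) / t
    tail = f(10.0) + f(upper)
    for i in range(1, n):
        tail += (4.0 if i % 2 else 2.0) * f(10.0 + i * h)
    return _ein_series(10.0) + math.log(x / 10.0) - tail * h / 3.0

def alpha_bare(big_lambda, e0):
    """Coupling at the momentum cutoff that fixes the binding energy e0."""
    return 4.0 * math.pi / math.log(1.0 + big_lambda**2 / e0)

def alpha_similarity(lam, big_lambda, e0):
    """Assumed second-order coupling at similarity cutoff lam with momentum cutoff big_lambda."""
    a = alpha_bare(big_lambda, e0)
    return a / (1.0 - a / (8.0 * math.pi) * ein(2.0 * big_lambda**4 / lam**4))

def alpha_universal(lam, e0):
    """Limit of alpha_similarity for lam much smaller than the momentum cutoff."""
    return 8.0 * math.pi / (2.0 * math.log(lam**2 / e0) - EULER_GAMMA - math.log(2.0))

if __name__ == "__main__":
    for big_lambda in (50.0, 500.0, 5000.0):
        print(big_lambda, alpha_similarity(5.0, big_lambda, 1.0), alpha_universal(5.0, 1.0))
```

For $\lambda \gg \Lambda$ the argument $x$ is tiny and the coupling stays at $\alpha_\Lambda$; for $\lambda \ll \Lambda$ the $\Lambda$ dependence cancels between $1/\alpha_\Lambda$ and the logarithm, leaving a function of $\lambda^2/E_0$ only.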

We close this section by reminding the reader that in all of these calculations there is only one
free parameter. In a realistic calculation we would fit this parameter to a binding energy and we
would expect to see residual errors in other observables that are inversely proportional to powers of
the cutoff and logarithms of the cutoff.


4 Conclusions 


We have illustrated the similarity renormalization group method for producing effective cutoff 
hamiltonians using the two-dimensional delta-function potential. We have shown that the SRG 
with coupling coherence leads to errors that scale as inverse powers of the cutoff and inverse logarithms
of the cutoff. The SRG with coupling coherence requires the same number of parameters as
the underlying ‘fundamental’ theory, but the cost is exponentially increasing algebraic complexity 
to remove errors that contain inverse powers of logarithms of the cutoff. 




Figure 7: The SRG running coupling for the two-dimensional delta-function potential renormalized
to second-order obtained with $\alpha_{\lambda_0 = \infty} = 4\pi/\ln\left(1 + \Lambda^2/E_0\right)$.


5 Acknowledgments 

We would like to acknowledge many useful discussions with Brent Allen, Martina Brisudova, Dick
Furnstahl, Stan Glazek, Billy Jones, Roger Kylin, Rick Mohr, Jim Steele, and Ken Wilson. This
work was supported by National Science Foundation grant PHY-9800964, and S.S. was supported 
by a CNPq-Brazil fellowship (proc. 204790/88-3). 


References 


[1] J. Schwinger, “Quantum Electrodynamics,” Dover, New York, 1958. 

[2] P. A. M. Dirac, Théorie du Positron, (7-ème Conseil du Physique du Solvay: Structure et
propriétés des noyaux atomiques, Octobre 1933), pp. 203-230, Gauthier-Villars, Paris, 1934.

[3] P.A.M. Dirac, in “Perturbative Quantum Chromodynamics,” (D.W. Duke and J.F. Owens, 
Eds.), Am. Inst. Phys., New York, 1981. 

[4] E. C. G. Stueckelberg and A. Peterman, Helv. Phys. Acta 26 (1953), 499. 

[5] M. Gell-Mann and F.E. Low, Phys. Rev. 95 (1954), 1300. 






[6] L. D. Landau, A. A. Abrikosov and I. M. Khalatnikov, Doklady 95 (1954), 497; 96 (1954), 
261. 

[7] N. N. Bogoliubov and D.V. Shirkov, Nuovo Cim. 3 (1956), 845. 

[8] N. N. Bogoliubov and D.V. Shirkov, “Introduction to the Theory of Quantized Fields,” 
Interscience, New York, 1959. 

[9] K. G. Wilson and J. B. Kogut, Phys. Rep. 12C (1974), 75. 

[10] K. G. Wilson, Rev. Mod. Phys. 47 (1975), 773. 

[11] D. J. Gross and F. Wilczek, Phys. Rev. Lett. 30 (1973), 1343; 

H. D. Politzer, Phys. Rev. Lett. 30 (1973), 1346. 

[12] Ya. B. Zel’dovich, Soviet Physics JETP 11 (1960), 594. 

[13] F. A. Berezin and L. D. Faddeev, Sov. Math. Dokl. 2 (1961), 372. 

[14] C. Thorn, Phys. Rev. D 19 (1979), 639. 

[15] K. Huang, “Quarks, Leptons and Gauge Fields,” World Scientific, Singapore, 1982.

[16] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, “Solvable Models in Quantum
Mechanics,” Springer-Verlag, New York, 1988.

[17] C. R. Hagen, Phys. Rev. Lett. 64 (1990), 503.

[18] R. Jackiw, in “M. A. B. Beg Memorial Volume,” (A. Ali and P. Hoodbhoy, eds.), World 
Scientific, Singapore, 1991. 

[19] P. Gosdzinsky and R. Tarrach, Am. J. Phys. 59 (1991), 70;

C. Manuel and R. Tarrach, Phys. Lett. B 328 (1994), 113. 

[20] J. F. Perez and F. A. B. Coutinho, Am. J. Phys. 59 (1991), 52. 

[21] L. R. Mead and J. Godines, Am. J. Phys. 59 (1991), 935. 

[22] C. Manuel and R. Tarrach, Phys. Lett. B 301 (1994), 72. 

[23] T. J. Fields, K. S. Gupta, and J. P. Vary, Mod. Phys. Lett. A 11 (1996), 2233. 

[24] K. S. Gupta and S. G. Rajeev, Phys. Rev. D 48 (1993), 5940; 

R. J. Henderson and S. G. Rajeev, Intl. J. Mod. Phys. A 10 (1995), 3765;

R. J. Henderson and S. G. Rajeev, J. Math. Phys. 38 (1997), 2171. 

[25] D. K. Park, J. Math. Phys. 36 (1995), 5453. 

[26] S. K. Adhikari and T. Frederico, Phys. Rev. Lett. 74 (1995), 4572; 

S. K. Adhikari, T. Frederico and I. D. Goldman, Phys. Rev. Lett. 74 (1995), 487;

S. K. Adhikari and A. Ghosh, J. Phys. A 30 (1997), 6553; 

C. F. de Araujo, Jr., L. Tomio, S. K. Adhikari and T. Frederico, J. Phys A 30 (1997), 4687. 

[27] R. M. Cavalcanti, quant-ph/9801033 (1998). 

[28] R. J. Henderson and S. G. Rajeev, J. Math. Phys. 39 (1998), 749. 

[29] D. R. Phillips, S. R. Beane, and T. D. Cohen, Ann. Phys. (N.Y.) 263 (1998), 255. 

[30] S. D. Glazek and K.G. Wilson, Phys. Rev. D 48 (1993), 5863. 

[31] S. D. Glazek and K.G. Wilson, Phys. Rev. D 49 (1994), 4214. 





[32] F. Wegner, Ann. Physik (Berlin) 3 (1994), 77. 

[33] K. G. Wilson, Phys. Rev. 140 (1965), B445. 

[34] K. G. Wilson, Phys. Rev. D2 (1970), 1438. 

[35] C. Bloch, Nucl. Phys. 6 (1958), 329.

[36] R. Oehme and W. Zimmermann, Commun. Math. Phys. 97 (1985), 569; 

R. Oehme, K. Sibold and W. Zimmermann, Phys. Lett. B 147 (1984), 115. 

[37] R. J. Perry and K. G. Wilson, Nucl. Phys. B 403 (1993), 587;

R. J. Perry, Ann. Phys. (N.Y.) 232 (1994), 116. 

[38] S. Coleman and E. Weinberg, Physical Review D 7 (1973), 1888. 



3. Quantum Field Theory and the Standard 
Model: Bird’s-eye-view 

V.Novikov * 

ITEP, Moscow, Russia and University of Guelph, Guelph, Canada


Abstract 

We present a panoramic view of the Field Theory ideas that are used in the Standard
Model of elementary particle physics.


1 Introductory remarks : Brief Review of QFT 

The Standard Model (SM) claims to be the fundamental Theory of Nature. It gives a perfect
description of the physical phenomena from the scale of the binding energy of electrons in molecules
and atoms (i.e. from the electron-volt (eV) and less) to the scale of energies of particles in modern
accelerators (i.e. to the hundreds of GeV; 1 GeV = $10^9$ eV). This region of energy has been studied
experimentally during this century and it seems there is nothing beyond the SM. Thus we believe that
the Laws of Nature in this region of energy are known and that the accuracy of the description of
any physical phenomenon depends only on our ability to perform mathematical calculations.

On the other hand there still exist a few physical questions that remain unanswered in the
framework of the SM. Say, we do not yet understand the spectrum of masses of quarks and leptons.
Thus many people feel that the SM is an incomplete theory and that there should be some New
Physics beyond the Standard Model.

The most recent evidence of the great success of the SM was connected with the detailed study
of the properties of the Z boson, the fundamental particle that mediates electroweak interactions. Special
huge electron-positron colliders, SLC (at SLAC) and LEP I (at CERN), were constructed at the
end of the 1980s to measure the parameters of Z boson decays with extremely high accuracy. More
than 2000 experimentalists were involved in these unique experiments over ten years. Hundreds
of theorists carried out detailed calculations of the required tiny corrections. The result of the
collective quest for truth was remarkable - with an accuracy of the order of several thousandths,
theoretical calculations reproduce the whole set of the experimental data!

The Standard Model is formulated in terms of the renormalizable Quantum Field Theory
(QFT). Another basic concept that lies in the foundation of the SM is the Principle of local
gauge invariance. According to this principle the form of gauge interactions is uniquely
specified by gauge invariance. The interactions are mediated by vector gauge bosons that are
associated with the group of symmetry. The symmetry group of the Standard Model is the product
SU(3)xSU(2)xU(1). Eight SU(3) gauge bosons (gluons) are responsible for strong interactions.
Four SU(2)xU(1) gauge bosons (two W bosons, the Z boson and the photon) mediate the electroweak
interaction. In this paper we will consider electroweak interactions only.

In the past Quantum Field Theory was considered an esoteric theory accessible to a small
group of experts. Now QFT provides the working language inside the community of high energy
physicists and the basic ideas of QFT, such as Feynman diagrams, are familiar to any member of
the community. There are a number of excellent textbooks on QFT. (A very incomplete list of
the most recent books can be found in the References.) As a rule they are rather lengthy. To
get an actual understanding of QFT one has to study one of these books.


* Email: novikov@heron.itep.ru





The goal of this paper is not to provide systematic introduction to QFT or to SM, it is much 
more modest. We try to give a sort of panoramic view on the basic concepts, notions and relations 
of QFT and SM without long derivations and boring formalism. Nevertheless the paper is written 
not for pedestrians but for physicists. We suppose a general background in quantum mechanics 
and electrodynamics. 

The article arose from the lectures at 1998 European School of High-Energy Physics. 

1.1 Preliminaries: Particles and Fields 

First one should answer the very natural question why Field Theory is used to describe Particle 
Physics. 

Indeed in Classical Physics particles and fields are very different dynamical systems. The
system of particles has a finite number of degrees of freedom N. To describe the physical state of
particles one has to know the general coordinates $q_i(t)$ ($i = 1, 2, \ldots, N$) and their time derivatives
$\dot{q}_i(t)$ (or conjugate momenta $p_i(t)$) at any time t. Euler-Lagrange or Hamilton equations of motion
govern the dynamics of the system. Whether we study bounded motion or scattering processes,
at any time the number of degrees of freedom of the system is fixed.

Field theory is a theory of a system with an infinite number of degrees of freedom. The well
known example of fields in Classical Physics is the electromagnetic field. To describe the
electromagnetic fields we have to know the four-potential $A_\mu(t, \mathbf{x})$ at every space point $\mathbf{x}$, and Maxwell
equations govern the evolution of the fields $A_\mu$ in time.

Particles and fields are quite different. This is evident! 

In Quantum Mechanics (QM) of non-relativistic particles a dynamical system with N degrees of
freedom is described by the wave function

$$ \Psi(q,t) \tag{1.1} $$

that satisfies the wave equation

$$ i\, \frac{\partial \Psi(q,t)}{\partial t} = H(p,q)\, \Psi(q,t) , \tag{1.2} $$

where $H(p,q)$ is the Hamiltonian. Canonical conjugate coordinates p and q are replaced by
operators satisfying canonical commutation rules $[p, q] = -i$. Thus in coordinate representation the
operator of momenta p is a differential operator: $p = -i\,\partial/\partial q$. The number of degrees of freedom N
is supposed to be fixed exactly like in Classical Mechanics.

The first quantization of electromagnetic fields as a dynamical system with an infinite number
of degrees of freedom was done in 1926 by Born, Heisenberg and Jordan in their second
paper on QM. They represented the radiation electromagnetic field as an infinite set of harmonic
oscillators and quantized these oscillators. They found that excitations of the oscillators behave
exactly like free massless particles - photons, but the number of photons was not fixed. Photons
could be created and annihilated by charged particles. The quantized theory of the electromagnetic field
became a theory of particles - photons. Photons were not "bound" inside charged particles, they
were created from "nothing" by scattered charged particles. Though the physical idea of photons
was not very new (it was introduced by Einstein twenty years before this paper), this step was very
important. The formal quantization of the electromagnetic field showed that the quantized field is
equivalent to a system of particles that can be created and destroyed.

Nevertheless for some time physicists continued to treat massive particles (electrons) and
electromagnetic fields (photons) as something different. They tried to find a relativistic version of the
Schroedinger wave equation (1.2) for particles at high energy. The first such equation for spin
0 relativistic particles (the Klein-Gordon equation) was written in 1926 by many authors:

$$ -\partial_\mu \partial^\mu\, \Phi(x) = m^2\, \Phi(x) , \tag{1.3} $$

where $\partial_\mu = \partial/\partial x^\mu$ and $\Phi(x)$ is a complex function of $x = (t, \mathbf{x})$, and m is the mass of the particle.
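
A quick numerical check makes the content of the Klein-Gordon equation concrete. The sketch below works in one space dimension, assumes the metric signature (+,-) so that the equation reads $(-\partial_t^2 + \partial_x^2)\Phi = m^2\Phi$, and tests the plane wave $\Phi = e^{-i(Et - px)}$ with $E = \sqrt{p^2 + m^2}$ by central finite differences. The function names and the step size are assumptions of this illustration.

```python
import cmath
import math

def plane_wave(t, x, p, m):
    """Plane wave exp(-i (E t - p x)) with the relativistic energy E = sqrt(p^2 + m^2)."""
    energy = math.sqrt(p * p + m * m)
    return cmath.exp(-1j * (energy * t - p * x))

def klein_gordon_residual(t, x, p, m, h=1.0e-3):
    """(-d^2/dt^2 + d^2/dx^2 - m^2) Phi evaluated with central differences of step h."""
    phi = plane_wave(t, x, p, m)
    d2t = (plane_wave(t + h, x, p, m) - 2.0 * phi + plane_wave(t - h, x, p, m)) / h**2
    d2x = (plane_wave(t, x + h, p, m) - 2.0 * phi + plane_wave(t, x - h, p, m)) / h**2
    return -d2t + d2x - m * m * phi

if __name__ == "__main__":
    # The residual is zero up to the finite-difference error.
    print(abs(klein_gordon_residual(0.3, -1.2, p=1.0, m=1.0)))
```

The same plane wave with $E = -\sqrt{p^2+m^2}$ passes the test equally well, which is one way to see why a single-particle interpretation of the equation runs into trouble.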





Immediately it was pointed out that eq. (1.3) and the function $\Phi(x)$ can't be interpreted as the
wave equation and the wave function. Such an interpretation led to a number of physical paradoxes.
Later, in 1928, Dirac suggested another relativistic equation (for spin 1/2 particles):

$$(i\gamma_\mu\partial_\mu - m)\,\Psi(x) = 0 \qquad (1.4)$$

Here $\Psi$ is a column with 4 complex components (4-spinor) and $\gamma_\mu$ are 4 x 4 matrices.

The troubles with the interpretation of eq. (1.4) as the one-particle relativistic wave equation were
not so evident as for the case of eq. (1.3). But the truth is that for any relativistic process the
single particle description should break down. Indeed any relativistic system has an infinite number
of degrees of freedom. The more energy we pump into the system, the more degrees of freedom can
be excited. Say, any process in Quantum Electrodynamics (QED) can be accompanied by the creation
of any number of additional $e^+e^-$ pairs. These pairs are not hidden inside the initial particles, they
are created from vacuum.

As in the case of photons, the natural description of a relativistic system with a varying number
of degrees of freedom is the quantum theory of the appropriate field. It is wrong to divide the world
into particles and fields - one has to use quantum field theory as the
language for dealing with particle physics!

From this point of view the description of QFT as a second quantization sounds very misleading.
Nobody quantizes wave functions, since both the Klein-Gordon and Dirac equations (as well as the
Maxwell equations) are not relativistic equations for a wave function. They are field equations for
scalar and for spinor fields respectively. Moreover nobody quantizes classical fields either. There
is no straight way from Classical Physics to Quantum Theory. The Fundamental Theory is the
Quantum Field Theory. The Classical Theory is the special limit of QFT and one should start
from QFT and not vice versa.

1.2 Quantization of Free Fields and Fock space. 

In this subsection we demonstrate that quantum field theory indeed describes particles.

Consider first a very simple example, the free scalar field $\Phi(x)$.

In the "Classical Theory" $\Phi(x)$ is a real function of the space-time point $x_\mu = (t, \vec{x})$. It represents the
set of general coordinates of the system with label $\vec{x}$. For a free field there is only one choice for the
Lorentz-invariant Lagrangian density $\mathcal{L}(\Phi, \partial_\mu\Phi)$:

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\Phi)^2 - \frac{1}{2}m^2\Phi^2 \qquad (1.6)$$

where $(\partial_\mu\Phi)^2 = (\partial_t\Phi)^2 - (\nabla\Phi)^2$ and the coefficient m has dimension of mass.

The action S is given by

$$S = \int d^4x\,\mathcal{L}(\Phi, \partial_\mu\Phi) \qquad (1.7)$$

To quantize this field we need the Hamiltonian and the canonically conjugate momentum. The Hamiltonian
density $\mathcal{H}$ is constructed according to the rules of Hamiltonian dynamics

$$\mathcal{H} = \pi\,\dot\Phi - \mathcal{L}$$

where

$$\pi = \frac{\partial\mathcal{L}}{\partial\dot\Phi}$$

is the conjugate momentum.

We use the natural units where c = 1 and $\hbar$ = 1. In these units the action is dimensionless

$$[S] = m^0,$$





and for the dimensions of the other quantities one gets

$$[E] = [p] = m, \qquad [x] = m^{-1}, \qquad [\mathcal{L}] = [\mathcal{H}] = m^4, \qquad [\Phi] = m \qquad (1.9)$$


The Euler-Lagrange equations of motion are derived from the Hamilton least action principle

$$\delta S = 0 \qquad (1.10)$$

and coincide with the Klein-Gordon equation

$$(\partial_\mu^2 + m^2)\,\Phi = 0 \qquad (1.11)$$

If we consider the plane wave solution for eq. (1.11)

$$\Phi_p(\vec{x}, t) = a(\vec{p}, t)\,e^{i\vec{p}\vec{x}} \qquad (1.12)$$

the equation for the amplitude a

$$\ddot{a} + (\vec{p}^{\,2} + m^2)\,a = 0 \qquad (1.13)$$

looks exactly like the equation for a linear oscillator with frequency

$$\omega(p) = \sqrt{\vec{p}^{\,2} + m^2}\,.$$


It is crucial that the dependence of the frequency $\omega$ on p is exactly the same as the dependence of the
particle energy on momentum p (in the units $\hbar$ = c = 1). This is why one can use free fields to
describe free particles.
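The correspondence between a field mode and a particle can be checked numerically. The following is only a minimal sketch in natural units: the function names `omega` and `oscillator_residual` are introduced here for illustration, and the trial amplitude $a(t) = \cos\omega t$ is one assumed solution of the oscillator equation, tested by a finite-difference second derivative.

```python
import math

def omega(p, m):
    """Frequency of the field mode with momentum p (natural units, hbar = c = 1)."""
    return math.sqrt(p * p + m * m)

def oscillator_residual(p, m, t, h=1e-4):
    """Finite-difference value of a'' + (p^2 + m^2) a for the trial amplitude a(t) = cos(omega t)."""
    w = omega(p, m)

    def a(s):
        return math.cos(w * s)

    second_derivative = (a(t + h) - 2.0 * a(t) + a(t - h)) / (h * h)
    return second_derivative + (p * p + m * m) * a(t)

# The mode frequency coincides with the relativistic energy sqrt(p^2 + m^2).
print(omega(0.0, 1.0))
print(omega(3.0, 4.0))
# The residual of the oscillator equation vanishes up to discretization error.
print(abs(oscillator_residual(3.0, 4.0, 0.7)) < 1e-5)
```

The same number $\sqrt{p^2 + m^2}$ appears both as the oscillation frequency of the mode and as the energy of a free particle with momentum p and mass m.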

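The general solution in the periodic box can be presented as a superposition of the solutions
(1.12)

$$\Phi(x) = \sum_p \left[a(p)\,e^{-ipx} + a^+(p)\,e^{ipx}\right] \qquad (1.15)$$

where

$$px = p_\mu x_\mu = p_0 t - \vec{p}\vec{x}, \qquad p_0 = \omega(p) = \sqrt{\vec{p}^{\,2} + m^2}, \qquad \sum_p = \int \frac{d^3p}{(2\pi)^3\,2p_0}\,.$$

In terms of these variables the Hamiltonian is equal to

$$H = \int d^3x\,\mathcal{H} = \sum_p \frac{\omega(p)}{2}\left[a^+(p)\,a(p) + a(p)\,a^+(p)\right] \qquad (1.16)$$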


This is the Hamiltonian for a set of decoupled linear oscillators. In the Classical theory the
coefficients $a(p)$ and $i a^+(p)$ are canonically conjugate variables. According to the canonical procedure
of quantization we have to replace them by operators that satisfy the commutation relations

$$[a(p), a^+(p')] = \delta_{pp'}, \qquad [a(p), a(p')] = [a^+(p), a^+(p')] = 0 \qquad (1.17)$$

The operators $a(p)$ and $a^+(p)$ are familiar from QM. They are the annihilation and creation operators
for the oscillator with frequency $\omega(p)$.

The Fock space is the Hilbert space of the states with definite values of the operators of
particle number $N(p) = a^+(p)\,a(p)$:

vacuum state

$$|0\rangle: \qquad a(p)\,|0\rangle = 0,$$

one-particle states

$$|p\rangle = a^+(p)\,|0\rangle,$$

two-particle states

$$|p_1, p_2\rangle = a^+(p_1)\,a^+(p_2)\,|0\rangle, \qquad |p, p\rangle = \frac{1}{\sqrt{2}}\,a^+(p)\,a^+(p)\,|0\rangle,$$

etc.

The commutation relation eq. (1.17) corresponds to Bose-Einstein statistics for the spin 0 particle.
Indeed

$$|p_1, p_2\rangle = +|p_2, p_1\rangle$$

The operator of energy

$$H = \sum_p \omega(p)\left(N(p) + \frac{1}{2}\right) \qquad (1.20)$$

is a well defined operator and it is bounded from below by the vacuum energy

$$E_{vac} = \sum_p \frac{1}{2}\,\omega(p) \qquad (1.21)$$

The space of excited states of the field oscillators - the Fock space - represents the states of free
particles with mass m, with given momenta, and with positive energy $p_0 = \sqrt{\vec{p}^{\,2} + m^2}$.
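The oscillator algebra behind the Fock space can be illustrated for a single mode. This is a sketch under stated assumptions: the mode is truncated to a finite number of occupation states (the cut `DIM` and all helper names are hypothetical), so the commutator equals the unit matrix everywhere except in the last, artificially truncated, state.

```python
import math

def annihilation(dim):
    """Matrix of a in the basis |0>, |1>, ..., |dim-1> of one truncated mode."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)   # a|n> = sqrt(n)|n-1>
    return a

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(xy))] for i in range(len(xy))]

def mode_energies(omega, dim):
    """Diagonal of omega * (N + 1/2), with N = a^+ a."""
    a = annihilation(dim)
    number = matmul(transpose(a), a)
    return [omega * (number[n][n] + 0.5) for n in range(dim)]

DIM = 6
a = annihilation(DIM)
comm = commutator(a, transpose(a))
# Diagonal of [a, a^+]: unity, except for the last entry spoiled by the truncation.
print([round(comm[n][n], 12) for n in range(DIM)])
# Equally spaced levels starting from the zero-point energy omega / 2.
print(mode_energies(2.0, 4))
```

The equally spaced spectrum is what allows the n-th level of the mode to be read as a state of n identical particles, each carrying energy $\omega(p)$.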

The theory of the complex scalar field $\Phi(x) \neq \Phi^+(x)$ with Lagrangian density

$$\mathcal{L} = \partial_\mu\Phi^+\,\partial_\mu\Phi - m^2\,\Phi^+\Phi \qquad (1.22)$$

is equivalent to the theory of two different scalar particles with degenerate masses. The general
solution of the field equations can be presented in the form

$$\Phi(x) = \sum_p \left[a(p)\,e^{-ipx} + b^+(p)\,e^{ipx}\right] \qquad (1.23)$$

where the operators $(a, a^+)$ and $(b, b^+)$ are annihilation and creation operators for two different
particles with the same masses but with opposite electric charges (see the next chapter). This
is a new phenomenon:

QFT predicts that for any particle there should exist an anti-particle.

For the Dirac spinor field $\Psi(x)$ the Lagrangian density can be written as

$$\mathcal{L} = \bar\Psi\,[i\gamma_\mu\partial_\mu - m]\,\Psi . \qquad (1.24)$$

The dimension of the field $\Psi$ is $[\Psi] = m^{3/2}$. The plane wave solutions of the Dirac equation look like

$$u(p, \lambda)\,e^{-ipx}, \qquad v(p, \lambda)\,e^{ipx}, \qquad (\lambda = \pm 1/2) \qquad (1.25)$$

where $u(p, \lambda)$, $v(p, \lambda)$ satisfy the equations

$$(\gamma_\mu p_\mu - m)\,u(p, \lambda) = 0, \qquad (\gamma_\mu p_\mu + m)\,v(p, \lambda) = 0 \qquad (1.26)$$

and $\lambda = \pm 1/2$ labels the independent solutions with different values of the spin projection on the momentum
p. The general solution of the Dirac equation can be presented as the superposition of plane wave
solutions

$$\Psi(x) = \sum_{p,\lambda}\left\{a(p, \lambda)\,u(p, \lambda)\,e^{-ipx} + b^+(p, \lambda)\,v(p, \lambda)\,e^{ipx}\right\}. \qquad (1.27)$$

The dynamical coordinates $a(p, \lambda)$, $i a^+(p, \lambda)$ and $b(p, \lambda)$, $i b^+(p, \lambda)$ are conjugate variables.

The next step is the quantization. We have to consider $a(p, \lambda)$ and $b(p, \lambda)$ as operators in
the Fock space. The great surprise is that to get a well defined operator of energy we should not
follow the procedure of canonical quantization. Instead the operators $a(p, \lambda)$ and $b(p, \lambda)$ should
satisfy anti-commutation (not commutation) relations

$$\{a(p, \lambda), a^+(p', \lambda')\} = \{b(p, \lambda), b^+(p', \lambda')\} = \delta_{pp'}\,\delta_{\lambda\lambda'} \qquad (1.28)$$

$$\{a, a\} = \{a^+, a^+\} = \{b, b\} = \{b^+, b^+\} = 0$$

with $\{A, B\} = AB + BA$.

Only in this way do we get that the energy is bounded from below and that the local observables at
equal times commute for separated space points. Anticommutators imply Fermi-Dirac statistics for
the spin 1/2 particle. The operators $a(p, \lambda)$, $a^+(p, \lambda)$ and $b(p, \lambda)$, $b^+(p, \lambda)$ are annihilation and creation
operators for particles and anti-particles respectively.
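For a single fermionic mode the anticommutation relations can be realized by 2 x 2 matrices. The sketch below assumes the basis (|0>, |1>) of empty and occupied states; the names `A`, `A_DAG` and the helpers are introduced only for this illustration.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def anticommutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] + yx[i][j] for j in range(2)] for i in range(2)]

# Basis order: (|0>, |1>). The operator a empties the occupied state, a^+ fills the empty one.
A = [[0, 1], [0, 0]]
A_DAG = [[0, 0], [1, 0]]

print(anticommutator(A, A_DAG))   # [[1, 0], [0, 1]]  -> {a, a^+} = 1
print(matmul(A_DAG, A_DAG))       # [[0, 0], [0, 0]]  -> no state with two identical fermions
print(matmul(A_DAG, A))           # [[0, 0], [0, 1]]  -> occupation number is 0 or 1
```

The vanishing square of the creation operator is the Pauli exclusion principle: the occupation number of a fermionic mode can only be 0 or 1.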

These two examples demonstrate the famous spin-statistics theorem: 

QFT can be self-consistent if and only if the identical particles with integer spin 
obey Bose-Einstein statistics and the particles with half-integer spin obey Fermi-Dirac 
statistics. 

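For the Electromagnetic Field $A_\mu(x)$ the Lagrangian density is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + e\,j_\mu A_\mu, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \qquad (1.29)$$

$$[A_\mu] = m, \qquad [j_\mu] = m^3 .$$

The general solution for the radiation field can be presented as a superposition of plane waves

$$A_\mu(x) = \sum_{p,\lambda}\left\{a(p, \lambda)\,\epsilon_\mu(p, \lambda)\,e^{-ipx} + a^+(p, \lambda)\,\epsilon^*_\mu(p, \lambda)\,e^{ipx}\right\}.$$

Because of the gauge invariance the formal quantization of the electromagnetic field is technically
a somewhat more subtle matter. We will not touch this subject in our paper.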

We have constructed the physical states for free particles. It is also useful to have the amplitudes
that describe the propagation of free particles from one space-time point to another one. Consider
scalar fields first. The part of the field operator (1.15) that represents the terms with positive
frequencies

$$\Phi^{(+)}(x) = \sum_p a(p)\,e^{-ipx}$$

annihilates a particle at point x, while the operator

$$\Phi^{(-)}(x) = \sum_p a^+(p)\,e^{ipx} \qquad (1.30)$$

creates a particle at point x.

Thus the vacuum expectation of the time-ordered product of fields

$$D_F(x) = \langle 0|\,T\{\Phi(x), \Phi(0)\}\,|0\rangle$$

represents the amplitude for a particle to propagate from point 0 to point x (Feynman propagator).
Time ordering implies that creation always comes before annihilation. Here the Dyson time
ordered product of operators T is defined as follows:

$$T\{\Phi(x), \Phi(0)\} = \theta(t)\,\Phi(x)\,\Phi(0) + \theta(-t)\,\Phi(0)\,\Phi(x) \qquad (1.32)$$

where the step function is equal to

$$\theta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$

The Feynman propagator of a scalar particle has a very simple form in the momentum representation

$$D_F(p) = \frac{i}{p^2 - m^2 + i\epsilon}$$

The Feynman propagator of electrons

$$S_F(x) = \langle 0|\,T\{\Psi(x), \bar\Psi(0)\}\,|0\rangle$$

in the momentum representation looks like

$$S_F(p) = \frac{i}{\hat{p} - m + i\epsilon}, \qquad \hat{p} = \gamma_\mu p_\mu \qquad (1.36)$$

That is the end of our excursion into the quantization of the free fields.

1.3 Interaction: Feynman Diagrams.

What we do understand well is QFT in the framework of perturbation theory, when one can
separate the Lagrangian into a quadratic free term $\mathcal{L}_0$ and an interaction term $\mathcal{L}_{int}$ that can be considered
as a small perturbation to $\mathcal{L}_0$. The free field theory provides the asymptotic $|in\rangle$ and $|out\rangle$ states for
particles (and anti-particles). The nonlinear interaction term in perturbation theory provides
the transition amplitudes from one asymptotic state to another one.

Transitions from the initial asymptotic states to the final states are described by means of the
unitary S-matrix, $S^+S = I$:

$$\langle f|S|i\rangle = \langle f|i\rangle + (2\pi)^4\, i\,\delta^4\Big(\sum P_f - \sum P_i\Big)\,\langle f|T|i\rangle \qquad (1.37)$$

where i and f refer to the initial and final states.

In perturbation theory the S-matrix is given by the Feynman-Dyson series of integrals over time-ordered
products of $\mathcal{L}_{int}$ in the so called interaction representation, $\mathcal{L}_I$:

$$S = T\exp\Big(i\int d^4x\,\mathcal{L}_I\Big) = 1 + i\int d^4x\,\mathcal{L}_I(x) + \frac{i^2}{2!}\,T\Big\{\int d^4x_1\,\mathcal{L}_I(x_1);\ \int d^4x_2\,\mathcal{L}_I(x_2)\Big\} + \dots \qquad (1.38)$$


This representation of the S-matrix can be translated into the language of Feynman diagrams.
According to Feynman there is a set of two basic elements: propagators and vertices. Propagators
were found in the previous subsection, vertices depend on the interaction. To calculate the amplitude
of any physical process one has

1) to draw all distinct diagrams for the process combining propagators and vertices in all 
possible ways, 

2) to assign amplitudes for the propagators and for the vertices in given diagram and to multiply 
them, 

3) to sum the contribution of all distinct diagrams. 

Consider as an example the QED of leptons. In this case the interaction is given by the product
of the electromagnetic current $j_\mu(x)$ and the 4-potential $A_\mu(x)$

$$i\mathcal{L}_{int} = (-ie)\,\{\bar e(x)\,\gamma_\mu\, e(x) + \dots\}\,A_\mu(x) \qquad (1.39)$$

where e(x) represents the electron field operators, the dots stand for the muon and $\tau$ lepton contributions, and
e is the electric charge of the proton.

Feynman rules for this QED Lagrangian are summarized in Fig. 1. 


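[Figure 1 tabulates the factors for external lines in the initial state (electron $e(p_-)$, positron $\bar e(-p_+)$,
photon $\epsilon_\mu(k)$) and in the final state (electron $\bar e(p_-)$, positron $e(-p_+)$, photon $\epsilon^*_\mu(k)$), the electron
propagator with denominator $\hat{p} - m + i\epsilon$, the photon propagator, and the vertex.]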

Figure 1: Feynman rules of QED. 


Using these rules one can easily construct any transition amplitude.

Consider as an example the process $e^+e^- \to \mu^+\mu^-$. There is one diagram for this process.

The amplitude T is equal to

$$iT(e^+e^- \to \mu^+\mu^-) = (-ie)^2\, j_\alpha^{(e)}\,\frac{-i\,g_{\alpha\beta}}{q^2}\, j_\beta^{(\mu)} \qquad (1.40)$$

where

$$j_\alpha^{(e)} = \bar e(-p_+)\,\gamma_\alpha\, e(p_-), \qquad j_\alpha^{(\mu)} = \bar\mu(k_-)\,\gamma_\alpha\,\mu(-k_+)$$

and $q = p_- + p_+$ is the momentum of the virtual photon.

This is an example of an amplitude in the lowest order in the coupling constant e. It contains
no loops. There is a special name for such diagrams: tree diagrams.

The steps from the QFT to the calculation of cross sections and of decay rates are very
simple. Using Feynman diagrams one calculates first the amplitude T of the process, then the
square of the modulus of T, and finally one performs the summation over all degenerate final states:
Probability ~ $\sum |T|^2$.

More precisely the cross sections are calculated by the formula

$$d\sigma = \frac{|T|^2}{J}\,d\Gamma$$

where

$$d\Gamma = (2\pi)^4\,\delta^4\Big(\sum P_f - \sum P_i\Big)\prod_{j=1}^{N}\frac{d^3p_j}{(2\pi)^3\,2E_j}$$

is the N-particle phase space and

$$J = 4\sqrt{(p_1 p_2)^2 - m_1^2 m_2^2}$$

is the relativistic flux.

The decay rates are given by the formula

$$dW = \frac{|T|^2}{2M}\,d\Gamma$$

where M is the mass of the decaying particle.

There exists a well developed routine technology for this type of calculation.

Consider now one-loop diagrams. We focus on the corrections to the photon propagator with
momentum q. There is only one such diagram.

According to the Feynman rules the correction to the propagator is equal to

$$\delta D_{\mu\nu}(q) = D_{\mu\alpha}(q)\,\Pi_{\alpha\beta}(q)\,D_{\beta\nu}(q) \qquad (1.41)$$

where

$$\Pi_{\alpha\beta}(q) = (-ie)^2\int\frac{d^4p}{(2\pi)^4}\,(-1)\,\mathrm{Sp}\left[\gamma_\alpha\,\frac{i}{\hat{p} - m + i\epsilon}\,\gamma_\beta\,\frac{i}{\hat{p} - \hat{q} - m + i\epsilon}\right] \qquad (1.42)$$

For large virtual momenta p the loop correction diverges quadratically. If we regularize the integral
by introducing a cut-off $\Lambda$, the result of the integration looks like

$$\Pi_{\alpha\beta} \sim g_{\alpha\beta}\,e^2\int^{\Lambda}\frac{d^4p}{p^2} \sim g_{\alpha\beta}\,e^2\Lambda^2 \to \infty \qquad (1.43)$$

for $\Lambda \to \infty$.

A more sophisticated regularization demonstrates that the quadratic divergence actually disappears
and that the integral diverges logarithmically.

This is the simplest example of the problem of divergences in QFT. It was a great success of
theoretical physics when Dyson, Feynman, Schwinger and Tomonaga in the late 1940s explained
how to work with such theories.
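The difference between a quadratic and a logarithmic divergence can be made concrete with power counting. The sketch below is not the vacuum polarization integral itself: it assumes Euclidean radial integrals of the form $\int_0^\Lambda p^3\,dp/(p^2 + m^2)^k$ (the factor $p^3$ comes from the four-dimensional measure), and all function names are hypothetical.

```python
import math

def radial_integral(power, cutoff, mass, steps=100000):
    """Midpoint-rule value of the integral of p^3 / (p^2 + m^2)^power from 0 to cutoff."""
    h = cutoff / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h
        total += p ** 3 / (p * p + mass * mass) ** power
    return total * h

def quadratic_closed_form(cutoff, mass):
    """Exact result for power = 1; grows like cutoff^2 / 2."""
    return 0.5 * (cutoff ** 2 - mass ** 2 * math.log(1.0 + cutoff ** 2 / mass ** 2))

def logarithmic_closed_form(cutoff, mass):
    """Exact result for power = 2; grows like log(cutoff / mass)."""
    ratio = cutoff ** 2 / mass ** 2
    return 0.5 * (math.log(1.0 + ratio) + 1.0 / (1.0 + ratio) - 1.0)

for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff, quadratic_closed_form(cutoff, 1.0), logarithmic_closed_form(cutoff, 1.0))
```

Doubling the cut-off multiplies the first integral by about four, while the second one only grows by ln 2. One extra power of $p^2$ in the denominator is enough to turn a quadratic divergence into a logarithmic one.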


1.4 Renormalizable Field Theories: QED.

The general philosophy of renormalization can be formulated in the following way:

1) Suppose that we can separate all quantum fluctuations into the "fast" fluctuations (i.e. with
virtual momenta $p > \Lambda$) and the "slow" ones ($p < \Lambda$), where $\Lambda$ is an arbitrary large parameter
(cut-off).

2) Suppose that we can integrate over the "fast" fluctuations in some way (even though the
physics at small distances ($p > \Lambda$) may be unknown yet).

3) For the "slow" fluctuations we get an "effective field theory" with $\mathcal{L}(\Lambda)$ or $H(\Lambda)$ that govern
the dynamics at low momenta. The parameters of the effective field theory depend on the cut-off $\Lambda$.

4) For renormalizable theories $\mathcal{L}(\Lambda)$ depends on a finite number of parameters and interaction
terms. In this case one can express these parameters in terms of the same number of low-energy
parameters and the cut-off $\Lambda$. The cut-off $\Lambda$ can be eliminated from low-energy observables if one rewrites them
in terms of low-energy parameters.

The renormalizable quantum field theories are very reasonable from the physical point of
view. They describe the situation when the large scale dynamics does not depend on the details
of short distance physics.

Let us look how this program works in the case of QED. The form of the effective Lagrangian 
is fixed by gauge invariance (see next chapter) 

£(A) = 

(1.46) 

where all quantities with label B depend on A. 

Consider the scattering of a heavy charged particle on the Coulomb field.

In this case we have to sum up the loop corrections to the photon propagator with electron-positron
pairs inside.

As a result the amplitude of Coulomb scattering can be written as

$$T \sim \frac{1}{q^2}\;\frac{e_B^2(\Lambda)}{1 + \dfrac{e_B^2(\Lambda)}{12\pi^2}\ln\dfrac{\Lambda^2}{m_e^2}} \qquad (1.47)$$





The coefficient in front of $1/q^2$ is by definition the charge of the particle ($1/q^2$ corresponds to the 1/r
dependence in the Coulomb law). So we claim that the combination

$$e_{ph}^2 = \frac{e_B^2(\Lambda)}{1 + \dfrac{e_B^2(\Lambda)}{12\pi^2}\ln\dfrac{\Lambda^2}{m_e^2}}$$

is the physical charge that can be experimentally measured. In this way we find $e_B(\Lambda)$ as a function
of the physical charge and the cut-off $\Lambda$.

In a similar way one can define the physical electron mass $m_{ph}$ as a pole in the exact
propagator of the electron.

Now we are able to formulate the main statement: if one rewrites the amplitudes of any QED
process that depend on $e_B$, $m_B$ and $\Lambda$ in terms of $e_{ph}$, $m_{ph}$, the dependence on $\Lambda$ in these amplitudes
will disappear for large $\Lambda$!
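The trade of a bare parameter for a physical one can be sketched numerically. The code below assumes the leading-logarithm relation $e_{ph}^2 = e_B^2 / (1 + (e_B^2/12\pi^2)\ln(\Lambda^2/m_e^2))$ and the convention $e^2 = 4\pi\alpha$; the function names and the sample cut-off values are illustrative choices, not part of the original text.

```python
import math

ALPHA = 1.0 / 137.036        # measured low-energy fine structure constant
ELECTRON_MASS = 0.000511     # electron mass in GeV

def physical_charge_sq(bare_sq, cutoff, mass=ELECTRON_MASS):
    """e_ph^2 obtained from the bare e_B^2(cutoff) in the leading-log approximation."""
    log = math.log(cutoff ** 2 / mass ** 2)
    return bare_sq / (1.0 + bare_sq * log / (12.0 * math.pi ** 2))

def bare_charge_sq(physical_sq, cutoff, mass=ELECTRON_MASS):
    """Inverse relation: the bare e_B^2 that reproduces a given e_ph^2 at this cut-off."""
    log = math.log(cutoff ** 2 / mass ** 2)
    return physical_sq / (1.0 - physical_sq * log / (12.0 * math.pi ** 2))

E_PH_SQ = 4.0 * math.pi * ALPHA   # assumed convention e^2 = 4 pi alpha

for cutoff in (1.0, 1.0e3, 1.0e16):   # cut-off in GeV
    e_b_sq = bare_charge_sq(E_PH_SQ, cutoff)
    # The bare charge changes with the cut-off, the physical charge does not.
    print(cutoff, e_b_sq, physical_charge_sq(e_b_sq, cutoff))
```

The bare charge grows with the cut-off, but once it is tuned to reproduce the measured charge, the low-energy result no longer depends on $\Lambda$.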


1.5 Non-renormalizable Theories.

The first theory of weak interactions was formulated by Fermi in 1934. It was very similar to QED.
The Lagrangian of interaction was equal to a product of two vector currents (after the discovery
of P and C parity violation this 4-fermion theory was modified slightly)

$$\mathcal{L}_W = G_F\, j_\mu\, j^+_\mu \qquad (1.49)$$

where $j_\mu = \bar u\gamma_\mu d + \dots$

The Fermi coupling constant has dimension -2. Indeed

$$[j] = m^3, \qquad [\mathcal{L}] = m^4$$

so that

$$[G_F] = m^{-2}$$

The radiative corrections to the 4-fermion interaction are given by loop diagrams.

From the dimensional analysis it is clear that the result of the calculation should be of the order
of

$$G_F\left(1 + G_F\Lambda^2 + (G_F\Lambda^2)^2 + \dots\right) \qquad (1.50)$$

where $\Lambda$ is the cut-off.
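The dimensional estimate becomes more tangible with numbers. This is a rough sketch: the value of $G_F$ is the measured Fermi constant, while the criterion $G_F\Lambda^2 = 1$ for the breakdown of the expansion is an order-of-magnitude assumption, and the function names are hypothetical.

```python
import math

G_F = 1.1663787e-5   # Fermi constant in GeV^-2 (measured value)

def expansion_parameter(cutoff_gev):
    """Dimensionless combination G_F * cutoff^2 that controls the series in the cut-off."""
    return G_F * cutoff_gev ** 2

def breakdown_scale():
    """Cut-off at which G_F * cutoff^2 reaches unity (order-of-magnitude criterion)."""
    return 1.0 / math.sqrt(G_F)

for cutoff in (1.0, 10.0, 100.0):
    print(cutoff, expansion_parameter(cutoff))
print(breakdown_scale())
```

The corrections are negligible for a cut-off of a few GeV, but become of order one at a few hundred GeV, which is the scale at which the four-fermion description must be replaced by something else.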

It is also clear that the 4-fermion interaction can generate multi-fermion interactions with divergent
coupling constants, e.g. the 8-fermion interaction

$$\Delta\mathcal{L} = G_F^4\left[1 + \sum_n (G_F\Lambda^2)^n\right](\bar\psi\psi)^4 \qquad (1.51)$$

etc.

In this way we find that the effective Lagrangian should have an 8-fermion interaction. Actually it has to
have an infinite number of terms. Why these divergent corrections still allow one to rely on the lowest
order approximation remained a mystery up to the discovery of the SM.

Another example of non-renormalizable theory is the theory of massive vector bosons. Though 
coupling constant in this case has dimension zero, the longitudinal components of vector fields 
interact strongly and in general case the theory is non-renormalizable. This is why all naive 
attempts to construct a renormalizable weak interaction theory with intermediate vector bosons 
failed in the past. 

These are two examples of non-renormalizable theories. In such theories one has to fix an infinite
number of terms in $\mathcal{L}(\Lambda)$ at the scale $\Lambda$ (i.e. at small distances $x \sim \Lambda^{-1}$) to reconstruct the
amplitudes at low energy. It means that we need a lot of information about how Nature is constructed
at the scale $\Lambda$. Nobody knows yet how to work with non-renormalizable theories.

2 Symmetry. 

In the Standard Model the principle of local gauge invariance plays the same role as the principle of
local Lorentz invariance plays in General Relativity. This geometrical principle uniquely determines
the self-interaction of gauge bosons and their interaction with matter. The great problem is that
the same principle prohibits gauge bosons from having masses. As we know, massless vector bosons do
not exist in Nature with one exception - the photon. How can one make gauge bosons massive without
destroying the geometrical beauty? The solution of this problem was found in the spontaneous breaking
of local symmetry - a very beautiful phenomenon discovered first in solid state physics.

2.1 Global symmetry. 

U(1) symmetry. Consider as an example the theory of free electrons. Electrons are described by the
Lagrangian density

$$\mathcal{L} = \bar\psi(x)\,(i\gamma_\mu\partial_\mu - m)\,\psi(x) \qquad (2.1)$$

where $\psi_i$ (i = 1, 2, 3, 4) is a 4-component complex field, $\bar\psi = \psi^+\gamma_0$, $\gamma_\mu$ are 4 x 4 Dirac matrices,
and m is the electron mass.

The U(1) global phase transformations

$$\psi(x) \to \psi'(x) = e^{i\alpha}\,\psi(x)$$

leave the Lagrangian (2.1) invariant. The global symmetry means that the phase of the transformation
is the same at any space-time point x.

SU(2) symmetry. Consider the theory of two complex self-interacting scalar fields with
degenerate masses

$$\mathcal{L} = \partial_\mu\Phi^+\,\partial_\mu\Phi - V(\Phi^+\Phi) \qquad (2.3)$$

where $\Phi$ is the two component column (doublet)

$$\Phi = \begin{pmatrix} \varphi_1(x) \\ \varphi_2(x) \end{pmatrix} \qquad (2.4)$$

The Lagrangian (2.3) is invariant under global SU(2) rotations of the complex doublet $\Phi$

$$\Phi(x) \to \Phi'(x) = S\,\Phi(x), \qquad \Phi^+(x) \to (\Phi'(x))^+ = \Phi^+(x)\,S^+ \qquad (2.5)$$



where S is a unitary 2 x 2 matrix

$$S^+S = I, \qquad \det S = 1. \qquad (2.6)$$

The SU(2) transformations are global, i.e. the matrix S does not depend on the space-time point x.
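The invariance of any function of $\Phi^+\Phi$ under such rotations is easy to verify for a concrete matrix. In this sketch the SU(2) matrix is parametrized as $S = \cos(\theta/2) + i\sin(\theta/2)\,\vec{n}\vec\tau$, which is an assumed (standard) parametrization; the helper names and the sample doublet are hypothetical.

```python
import math

def su2_matrix(theta, axis):
    """S = cos(theta/2) + i sin(theta/2) n.tau for the direction given by `axis`."""
    nx, ny, nz = axis
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[complex(c, s * nz), complex(s * ny, s * nx)],
            [complex(-s * ny, s * nx), complex(c, -s * nz)]]

def apply(s, doublet):
    return [s[0][0] * doublet[0] + s[0][1] * doublet[1],
            s[1][0] * doublet[0] + s[1][1] * doublet[1]]

def norm_sq(doublet):
    """Phi^+ Phi for a two-component complex column."""
    return sum(abs(z) ** 2 for z in doublet)

def determinant(s):
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]

S = su2_matrix(1.3, (1.0, -2.0, 0.5))
PHI = [complex(0.3, -1.1), complex(2.0, 0.4)]

print(abs(determinant(S) - 1.0) < 1e-12)                       # det S = 1
print(abs(norm_sq(apply(S, PHI)) - norm_sq(PHI)) < 1e-12)      # Phi^+ Phi is unchanged
```

Since $\Phi^+\Phi$ is unchanged, any potential that depends on the doublet only through this combination is SU(2) invariant.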
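According to Noether's theorem, for any continuous global symmetry of the Lagrangian
one can construct conserved vector currents. This dynamical statement is very beautiful and
rather non-trivial. We prove the theorem in the classical field theory.

Let the Lagrangian L depend on the set of fields $\varphi^i$ and their first derivatives $\varphi^i_{,\mu} = \partial_\mu\varphi^i$. For
infinitesimal global transformations the variations of the fields are equal to

$$\delta\varphi^i = i\,\epsilon^a\,T^a_{ij}\,\varphi^j, \qquad \delta\varphi^i_{,\mu} = i\,\epsilon^a\,T^a_{ij}\,\varphi^j_{,\mu} \qquad (2.7)$$

here $\epsilon^a$ are the real infinitesimal parameters, one for each independent symmetry transformation, and the
matrices $T^a$ are the generators of the group of transformations in the given representation.

The invariance means that the action S is not changed under the transformation (2.7):

$$\delta S = \int d^4x\,\delta L = 0 \qquad (2.8)$$

Calculate now the variation of the Lagrangian density directly

$$\delta L = \frac{\partial L}{\partial\varphi^i}\,\delta\varphi^i + \frac{\partial L}{\partial\varphi^i_{,\mu}}\,\delta\varphi^i_{,\mu} = \partial_\mu\left(\frac{\partial L}{\partial\varphi^i_{,\mu}}\,\delta\varphi^i\right) \qquad (2.9)$$

where we have used the Lagrangian equation of motion for $\varphi^i$

$$\frac{\partial L}{\partial\varphi^i} = \partial_\mu\frac{\partial L}{\partial\varphi^i_{,\mu}} \qquad (2.10)$$

Substituting the variations $\delta\varphi$ and $\delta\varphi_{,\mu}$ we get

$$\delta S = -\int d^4x\,\epsilon^a\,\partial_\mu j^a_\mu(x) = 0 \qquad (2.11)$$

where

$$j^a_\mu = -i\,\frac{\partial L}{\partial\varphi^i_{,\mu}}\,T^a_{ij}\,\varphi^j \qquad (2.12)$$

Thus we get the conservation of the Noether currents

$$\partial_\mu j^a_\mu = 0 \qquad (2.13)$$

and the conservation of the corresponding charges

$$\frac{d}{dt}\,Q^{(a)}(t) = 0 \qquad (2.14)$$

$$Q^{(a)}(t) = \int d^3x\; j^{(a)}_0(\vec{x}, t) \qquad (2.15)$$

The generalization of this proof to Quantum Field Theory requires more advanced techniques
such as operator algebra, commutators etc. The final result, i.e. the expression for the conserved
Noether current, remains the same. The Noether currents that correspond to the U(1) and SU(2)
symmetries look as follows:

$$U(1): \quad j_\mu = \bar\psi\,\gamma_\mu\,\psi, \qquad SU(2): \quad j^a_\mu = \Phi^+\,T^a\, i\overleftrightarrow{\partial_\mu}\,\Phi \qquad (2.16)$$

where

$$T^a = \frac{1}{2}\,\tau^a, \qquad A\,\overleftrightarrow{\partial_\mu}\,B = A\,\partial_\mu B - (\partial_\mu A)\,B$$

and $\tau^a$ are Pauli matrices.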
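As a worked check of the general construction, consider the U(1) symmetry of the free Dirac Lagrangian. The sign convention for the current, $j_\mu = -i\,(\partial\mathcal{L}/\partial\psi_{,\mu})\,\psi$, is an assumption made here so that the current comes out with the conventional overall sign; only $\psi$ contributes because $\bar\psi$ enters the Lagrangian without derivatives.

```latex
\begin{align}
\mathcal{L} &= \bar\psi\,(i\gamma_\mu\partial_\mu - m)\,\psi, \qquad
\delta\psi = i\epsilon\,\psi, \\
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} &= \bar\psi\, i\gamma_\mu, \\
j_\mu &= -i\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\psi
       = -i\,\bar\psi\, i\gamma_\mu\,\psi = \bar\psi\gamma_\mu\psi, \\
\partial_\mu j_\mu &= (\partial_\mu\bar\psi)\,\gamma_\mu\psi + \bar\psi\,\gamma_\mu\partial_\mu\psi
       = (i m\,\bar\psi)\,\psi + \bar\psi\,(-i m\,\psi) = 0 .
\end{align}
```

The last line uses the equations of motion $\gamma_\mu\partial_\mu\psi = -im\psi$ and $(\partial_\mu\bar\psi)\gamma_\mu = im\bar\psi$, so the current is conserved only on the solutions of the field equations, as in the general proof.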






Figure 1: Higgs Potential in Standard Model 


2.2 Spontaneous Breaking of Global Symmetry : Goldstone Phenomenon 

The phenomenon of spontaneous breaking of symmetry is known from everyday life. Consider,
for example, a piece of ferromagnetic material. The interaction of the elementary magnetic
moments of the electrons inside the ferromagnet is O(3) invariant. On the other hand at low temperature
$T < T_c$ the total magnetic moment $\vec{M}$ of the ferromagnetic piece is nonzero. This spontaneous
moment breaks the O(3) invariance of the system: the ground state is only O(2) invariant with respect to
the rotations around $\vec{M}$.

From solid state physics it is known that the "violated" symmetries are realized as massless
excitations. In field theory the analogous phenomenon is known as the Nambu-Goldstone realization of
symmetry. We consider this phenomenon using a very simple field model studied many years ago
by Goldstone.

Consider the theory of the complex scalar field $\varphi(x)$ with Lagrangian

$$\mathcal{L} = \partial_\mu\varphi^+\,\partial_\mu\varphi - V(|\varphi|^2) \qquad (2.17)$$

and with a special choice of potential (see Fig. 1)

(2.18) 

Lagrangian (2.17) is invariant under U(1) transformations

$$\varphi(x) \to \varphi'(x) = e^{i\alpha}\,\varphi(x) \qquad (2.19)$$

and the Noether current is

$$j_\mu = \varphi^+\, i\overleftrightarrow{\partial_\mu}\,\varphi \qquad (2.20)$$

There are continuously many minima of the potential V

$$\varphi_{vac} = \frac{\eta}{\sqrt{2}}\,e^{i\alpha} \qquad (2.21)$$

The vacuum corresponds to one of these minima. This is spontaneous breaking of symmetry: we
have chosen as the vacuum state one of the infinite set of minima.

Let the vacuum state correspond to zero phase $\alpha = 0$:

$$\varphi_{vac} = \frac{\eta}{\sqrt{2}} \qquad (2.22)$$

and consider the small fluctuations of the field near this vacuum configuration

$$\varphi(x) = \frac{1}{\sqrt{2}}\,[\eta + \rho(x) + i\sigma(x)] \qquad (2.23)$$

In terms of these fluctuations the potential can be rewritten as

$$V(\varphi) = V(\rho, \sigma) = \frac{\lambda}{2}\left\{(\sigma^2 + \rho^2)^2 + 4\eta\rho\,(\rho^2 + \sigma^2) + 4\eta^2\rho^2\right\} \qquad (2.24)$$

The coefficients in front of the bilinear terms determine the masses of the fields. So we get a theory
of two particles with masses

$$M_\rho^2 = 4\lambda\eta^2, \qquad M_\sigma^2 = 0$$
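The appearance of one massive and one massless excitation can be checked numerically from the curvature of the potential at the minimum. The sketch assumes a potential of the form $V = c\,(|\varphi|^2 - \eta^2/2)^2$ with $\varphi = (\eta + \rho + i\sigma)/\sqrt{2}$; the overall coefficient `c` is a free normalization chosen here, and the function names are hypothetical.

```python
def potential(rho, sigma, eta=1.0, c=1.0):
    """V = c (|phi|^2 - eta^2/2)^2 with phi = (eta + rho + i sigma) / sqrt(2)."""
    mod_sq = ((eta + rho) ** 2 + sigma ** 2) / 2.0
    return c * (mod_sq - eta ** 2 / 2.0) ** 2

def second_derivative(f, h=1e-3):
    """Central finite-difference second derivative of f at zero."""
    return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)

def masses_squared(eta=1.0, c=1.0):
    """Curvatures of V along rho (radial) and sigma (along the valley) at the vacuum."""
    m_rho_sq = second_derivative(lambda r: potential(r, 0.0, eta, c))
    m_sigma_sq = second_derivative(lambda s: potential(0.0, s, eta, c))
    return m_rho_sq, m_sigma_sq

# For canonically normalized rho and sigma the curvatures are the squared masses:
# the radial one is close to 2 * c * eta^2, the one along the valley is close to zero.
print(masses_squared(eta=2.0, c=0.5))
```

The radial curvature is $2c\eta^2$ for any normalization c, while the curvature along the valley of minima vanishes. This flat direction is the Goldstone mode.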

We can use a more elegant and transparent representation for $\varphi(x)$ to demonstrate the same
phenomenon. Rewrite $\varphi(x)$ in terms of modulus and phase

$$\varphi(x) = \rho(x)\,e^{i\sigma(x)} \qquad (2.26)$$

In this case

$$\mathcal{L}(\rho, \sigma) = (\partial_\mu\rho)^2 - V(\rho^2) + \rho^2\,(\partial_\mu\sigma)^2 \qquad (2.27)$$

There is no dependence on the field $\sigma$ in the potential and therefore this field corresponds to a
massless particle.

Excitations that correspond to the motion along the valley of minima are massless! This is the
Goldstone phenomenon.

In Quantum Field Theory there are two ways of realization of symmetry:

1) The vacuum state has the symmetry of the action S. Excitation states are degenerate.

2) The vacuum state has lower symmetry than the action S. There are flat directions in the configuration
space of fields. The motions along these flat directions correspond to massless Goldstone
particles.


2.3 Local U(1) gauge symmetry.

Now we turn to the local gauge symmetries and start with the theory of the complex field $\varphi(x)$ described
by the Lagrangian (eq. (2.17))

$$\mathcal{L} = \partial_\mu\varphi^+\,\partial_\mu\varphi - V(\varphi^+\varphi).$$

This Lagrangian is invariant under the global U(1) transformation

$$\varphi(x) \to e^{i\alpha}\,\varphi(x).$$

Consider now the local U(1) transformation when we change the phase of the field independently
for any point x

$$\varphi(x) \to \varphi'(x) = e^{i\Lambda(x)}\,\varphi(x) \qquad (2.28)$$

The potential $V(|\varphi|^2) = V(|\varphi'|^2)$ is invariant under this transformation but the kinetic term is not

$$|\partial_\mu\varphi|^2 \to |(\partial_\mu + i\,\partial_\mu\Lambda)\,\varphi|^2 \qquad (2.29)$$

To compensate this non-invariant change one can introduce a new vector field $A_\mu(x)$ with the
transformation law:

$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \frac{1}{e}\,\partial_\mu\Lambda(x) \qquad (2.30)$$

so that the new Lagrangian

$$\mathcal{L} = |(\partial_\mu - ieA_\mu)\,\varphi|^2 - V(|\varphi|^2) \qquad (2.31)$$

is locally U(1) invariant or gauge invariant. The combination $D_\mu = \partial_\mu - ieA_\mu$ has the name of
covariant derivative (or long derivative). It has a simple transformation law

$$D_\mu\varphi \to e^{i\Lambda(x)}\,D_\mu\varphi \qquad (2.32)$$
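The covariance of the long derivative can be tested numerically in one dimension. The sketch assumes $D = \partial_x - ieA$ together with the transformation $A \to A + (1/e)\,\partial_x\Lambda$; the sample field, gauge potential, gauge function and the value of the coupling `E` are arbitrary illustrative choices, and derivatives are taken by central differences.

```python
import cmath
import math

E = 0.3   # illustrative value of the coupling constant

def phi(x):
    """Sample complex field."""
    return (1.0 + 0.5 * x * x) * cmath.exp(1j * math.sin(x))

def gauge_field(x):
    """Sample gauge potential A(x)."""
    return math.cos(2.0 * x)

def gauge_param(x):
    """Sample local gauge function Lambda(x)."""
    return 0.7 * x ** 3 - x

def derivative(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

def covariant_derivative(field, potential, x):
    """D phi = (d/dx - i e A) phi."""
    return derivative(field, x) - 1j * E * potential(x) * field(x)

def phi_prime(x):
    return cmath.exp(1j * gauge_param(x)) * phi(x)

def gauge_field_prime(x):
    return gauge_field(x) + derivative(gauge_param, x) / E

X = 0.4
lhs = covariant_derivative(phi_prime, gauge_field_prime, X)
rhs = cmath.exp(1j * gauge_param(X)) * covariant_derivative(phi, gauge_field, X)
print(abs(lhs - rhs) < 1e-6)   # the long derivative transforms like the field itself
# The ordinary derivative does not: its modulus changes under the local transformation.
print(abs(derivative(phi_prime, X)), abs(derivative(phi, X)))
```

Because $D\varphi$ picks up only the local phase, its modulus squared is gauge invariant, while the modulus of the ordinary derivative is not.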


Up to now the fields $A_\mu(x)$ have no kinetic term in the Lagrangian and they are some kind
of auxiliary fields that do not propagate. To construct the kinetic term we need a gauge invariant
combination of the derivatives of the field $A_\mu$. Notice that covariant derivatives and any combinations
of the covariant derivatives have a very simple transformation law, eq. (2.32). Consider the
commutator of two derivatives,

$$[D_\mu, D_\nu] = -ie\,F_{\mu\nu}, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \qquad (2.33)$$

We see that the commutator is not a differential operator but a function of x. According to
(2.32) it is a gauge invariant function. Now we are in a position to write the total gauge invariant
Lagrangian

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + |D_\mu\varphi|^2 - V(|\varphi|^2)$$

$$\varphi(x) \to \varphi'(x) = e^{i\Lambda(x)}\,\varphi(x) \qquad (2.34)$$

$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \frac{1}{e}\,\partial_\mu\Lambda(x)$$

The notion of gauge invariance was introduced by V. Fock in 1926 and in two steps by H. Weyl in
1919 and 1929.

2.4 Spontaneous Breaking of local symmetry : Higgs Phenomenon. 

For the case when the time derivative is zero, $\partial_t\varphi = 0$, and the electric field is zero, $F_{0i} = E_i = 0$, the
Lagrangian (2.34) formally is equal to the free energy in the Ginzburg-Landau phenomenological
theory of superconductivity, where $\varphi(x)$ plays the role of the order parameter. It is known that the
magnetic field does not penetrate into a superconductor, it falls exponentially. An exponential fall in QFT
corresponds to a massive particle. So one can expect that the Lagrangian (2.34) under certain
circumstances can describe a massive gauge field. This is the famous Higgs mechanism of spontaneous
breaking of local symmetry.

Consider the Lagrangian (2.34) with the special choice of potential energy (2.18)

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + |D_\mu\varphi|^2 - V(|\varphi|^2) \qquad (2.35)$$

The potential $V(\varphi)$ has a continuous valley of minima. Let us quantize the fields near the vacuum
state (2.22)

$$\langle\varphi\rangle = \varphi_{cl} = \frac{\eta}{\sqrt{2}} \qquad (2.36)$$

As in the case of global symmetry it is convenient to use the representation of $\varphi(x)$ in terms of
modulus and phase

$$\varphi(x) = \frac{1}{\sqrt{2}}\,(\eta + \rho(x))\,e^{i\sigma(x)} \qquad (2.37)$$

The Lagrangians (2.34) and (2.35) are gauge invariant. Then let us make the gauge transformation
with $\Lambda(x) = -\sigma(x)$.


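In this gauge (unitary gauge)

$$\varphi(x) \to \varphi'(x) = \frac{1}{\sqrt{2}}\,(\eta + \rho(x)), \qquad A_\mu \to A'_\mu = A_\mu - \frac{1}{e}\,\partial_\mu\sigma \qquad (2.38)$$

$$D_\mu\varphi \to (\partial_\mu - ieA'_\mu)\,\frac{1}{\sqrt{2}}\,(\eta + \rho(x)) \qquad (2.39)$$

and the Lagrangian can be rewritten in the form (the prime on $A'_\mu$ is dropped)

$$\mathcal{L} = \left[-\frac{1}{4}F_{\mu\nu}^2 + \frac{e^2\eta^2}{2}\,A_\mu^2\right] + \frac{1}{2}(\partial_\mu\rho)^2 + e^2\eta\,\rho(x)\,A_\mu^2(x) + \frac{e^2}{2}\,\rho^2(x)\,A_\mu^2(x) - V(\rho) \qquad (2.40)$$

The term in brackets represents the free massive vector particle with mass

$$m_V = e\,\eta \qquad (2.41)$$

The massless Goldstone mode $\sigma(x)$ has been eaten by the massless vector field $A_\mu(x)$ (that had two
polarizations) and as a result we get a massive vector field with three polarizations. This is the Higgs
Phenomenon.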
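The origin of the vector boson mass can be followed step by step. The short derivation below assumes the covariant derivative $D_\mu = \partial_\mu - ieA_\mu$ and the unitary-gauge field $\varphi' = (\eta + \rho)/\sqrt{2}$, which is real, so that the cross term between $\partial_\mu\rho$ and $A_\mu$ drops out of the modulus squared.

```latex
\begin{align}
D_\mu\varphi' &= (\partial_\mu - ieA_\mu)\,\tfrac{1}{\sqrt{2}}(\eta + \rho)
  = \tfrac{1}{\sqrt{2}}\left[\partial_\mu\rho - ieA_\mu(\eta + \rho)\right], \\
|D_\mu\varphi'|^2 &= \tfrac{1}{2}(\partial_\mu\rho)^2 + \tfrac{1}{2}\,e^2(\eta + \rho)^2 A_\mu^2 \\
  &= \tfrac{1}{2}(\partial_\mu\rho)^2 + \tfrac{1}{2}\,e^2\eta^2 A_\mu^2
   + e^2\eta\,\rho\,A_\mu^2 + \tfrac{1}{2}\,e^2\rho^2 A_\mu^2 .
\end{align}
```

Comparing the term $\tfrac{1}{2}e^2\eta^2A_\mu^2$ with the mass term $\tfrac{1}{2}m_V^2A_\mu^2$ of a free vector field gives $m_V = e\eta$; the remaining terms describe the interaction of the vector field with the scalar $\rho$.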


2.5 Local SU(2). Yang-Mills theory of vector fields.

To be ready for the construction of the Standard Model we have to consider the general case of
local gauge groups.

Let us start with the SU(2) theory of massless fermions $\psi = \begin{pmatrix}\psi_1 \\ \psi_2\end{pmatrix}$

$$\mathcal{L} = \bar\psi\, i\gamma_\mu\partial_\mu\,\psi \qquad (2.42)$$

and consider local SU(2) transformations

$$\psi(x) \to \psi'(x) = S(x)\,\psi(x) \qquad (2.43)$$

where

$$S(x) = \exp\,i\,(T_j\Lambda_j(x)); \qquad T_j = \frac{1}{2}\,\tau_j, \quad j = 1, 2, 3; \qquad [T_i, T_j] = i\,\epsilon_{ijk}\,T_k \qquad (2.44)$$

The Lagrangian (2.42) is not invariant under this transformation. To compensate the non-invariant
piece in the Lagrangian we introduce the triplet of vector fields $A^i_\mu(x)$ so that:

$$\mathcal{L} = \bar\psi\, i\gamma_\mu\,(\partial_\mu - igA_\mu(x))\,\psi, \qquad A_\mu(x) = T^i A^i_\mu(x)$$

with the transformation law

$$A_\mu(x) \to A'_\mu(x) = S\,A_\mu(x)\,S^+ - \frac{i}{g}\,(\partial_\mu S)\,S^+ \qquad (2.46)$$

One can introduce the covariant derivative

$$D_\mu = \partial_\mu - igA_\mu \qquad (2.47)$$

that transforms as a triplet under SU(2) transformations:

$$D_\mu \to S\,D_\mu\,S^+, \qquad D_\mu\psi \to S\,(D_\mu\psi) \qquad (2.48)$$

We can define the triplet of field-strength tensors $G_{\mu\nu}$:

$$G_{\mu\nu} = G^i_{\mu\nu}\,T^i = \frac{i}{g}\,[D_\mu, D_\nu] = \partial_\mu A_\nu - \partial_\nu A_\mu - ig\,[A_\mu, A_\nu] \qquad (2.49)$$

$$G_{\mu\nu} \to G'_{\mu\nu} = S\,G_{\mu\nu}\,S^+$$

and construct the SU(2) gauge invariant Lagrangian

$$\mathcal{L} = -\frac{1}{2}\,\mathrm{Tr}\,[G_{\mu\nu}G_{\mu\nu}] + \bar\psi\, i\gamma_\mu D_\mu\,\psi \qquad (2.50)$$

This Lagrangian was first written by Yang and Mills in 1954. The very nontrivial part in this
construction is that the kinetic energy ~ $G_{\mu\nu}^2$ contains bilinear ~ $(\partial A)^2$, trilinear ~ $g\,(\partial A)A^2$ and
quadrilinear ~ $g^2A^4$ terms.

Thus Yang-Mills gauge theory is a theory of self-interacting vector fields.
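The self-interaction arises because the generators do not commute, which is what makes the commutator term in the field strength nonzero. A minimal check of the SU(2) algebra with the standard Pauli matrices follows; the helper names are hypothetical and indices run over 0, 1, 2 instead of 1, 2, 3.

```python
# Pauli matrices tau_1, tau_2, tau_3 and the generators T_j = tau_j / 2.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in tau] for tau in TAU]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

def epsilon(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2

def algebra_holds(tol=1e-12):
    """Check [T_i, T_j] = i eps_ijk T_k entry by entry."""
    for i in range(3):
        for j in range(3):
            lhs = commutator(T[i], T[j])
            for r in range(2):
                for c in range(2):
                    rhs = sum(1j * epsilon(i, j, k) * T[k][r][c] for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True

def trace_product(i, j):
    """Tr(T_i T_j), which fixes the normalization of the kinetic term."""
    product = matmul(T[i], T[j])
    return product[0][0] + product[1][1]

print(algebra_holds())
print(trace_product(0, 0), trace_product(0, 1))
```

For the Abelian group U(1) the corresponding commutator vanishes, which is why the photon field has no self-interaction while the Yang-Mills fields do.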

2.6 Spontaneous Breaking of Local SU(2) Symmetry :

Renormalizable theory of massive vector fields.

Consider the SU(2) gauge theory of the doublet of scalar fields $\varphi = \begin{pmatrix}\varphi_1 \\ \varphi_2\end{pmatrix}$

$$\mathcal{L} = -\frac{1}{2}\,\mathrm{Tr}\,G_{\mu\nu}G_{\mu\nu} + |D_\mu\varphi|^2 - \lambda\left(|\varphi|^2 - \frac{\eta^2}{2}\right)^2 \qquad (2.51)$$

We expect that after spontaneous breaking of the SU(2) symmetry three Goldstone bosons will be
mixed with three massless vector fields and produce three massive vector fields.

Let us introduce a special representation for the doublet $\varphi$

$$\varphi(x) = \exp\left(i\,\sigma^a(x)\,T^a\right)\begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}\,(\eta + \rho(x)) \end{pmatrix}$$

and consider the gauge transformation with the parameter

$$\Lambda^a(x) = -\sigma^a(x) \qquad (2.53)$$

In this gauge the fields $\sigma^a(x)$ disappear from the Lagrangian and the vector part of $\mathcal{L}$ gets the form

$$\mathcal{L}_{vect} = -\frac{1}{2}\,\mathrm{Tr}\,G_{\mu\nu}G_{\mu\nu} + \frac{1}{2}\,m_V^2\,A^a_\mu A^a_\mu, \qquad m_V = \frac{1}{2}\,g\,\eta$$

This is the theory of massive vector fields with a special choice of self-interactions.

The theory of the massless Yang-Mills field was a renormalizable theory. On the other hand the
properties of the vacuum should not change the behavior of the amplitudes at high energy. Thus one
can believe that Yang-Mills theory with spontaneous breaking of gauge symmetry remains
renormalizable. The theory of massive vector fields with arbitrary interactions is non-renormalizable
in general. But if one takes the special case of interaction with quarks, with scalars and the
self-interaction that corresponds to the gauge-invariant Lagrangian (2.51), the non-renormalizable
divergences should disappear. Technically the rigorous proof of this statement is quite a nontrivial
business even now. This problem was solved by 't Hooft and Veltman in 1971.





3 SU(2)xU(l) Theory of Electroweak Interactions. 

There is no unique way to construct the theory of electroweak interactions. In the 1970s there were
several dozens of models on the market. Only the simplest one has survived in our time. A priori
we do not understand why the gauge group is SU(2)xU(1), why there are three generations of
quarks and leptons etc. All these questions have no answer inside the SM. We have to look for the
answers in experiments.

It was well established in the old four-fermion theory of weak interactions that charged currents
(responsible for $\beta$-decay of leptons and hadrons) have V - A structure, i.e. they are constructed
from the left-handed fermions.

The minimal group of gauge symmetry which includes charged vector currents is the SU(2) group.
Thus any theory of weak interactions has to include the $SU(2)_L$ symmetry as a subgroup. The photon
interacts both with left- and right-handed fermions. To unify weak and electromagnetic interactions
the group of gauge symmetry should include U(1) as well. The minimal group of symmetry that
includes these subgroups is the product

$$G = SU(2)_L \times U(1).$$


3.1 Left and Right Fermions. .57/(2)/, symmetry. 

Any Dirac 4-spinor $\Psi$ can be presented as a sum of two Weyl spinors $\Psi_L$ and $\Psi_R$:

$$\Psi = \Psi_L + \Psi_R = \tfrac{1}{2}(1+\gamma_5)\Psi + \tfrac{1}{2}(1-\gamma_5)\Psi \qquad (3.1)$$

Two-component Weyl spinors are irreducible representations of the Lorentz group. For massless fermions

$$\Psi_L = \begin{pmatrix} (1-\vec\sigma\vec n)\,\varphi \\ -(1-\vec\sigma\vec n)\,\varphi \end{pmatrix} \qquad (3.2)$$


where $\varphi$ is a 2-spinor, $\vec\sigma$ are Pauli matrices, and $\vec n = \vec p/|\vec p|$ is the direction of motion of the particle. So for left particles $\Psi_L$

$$\vec\sigma\vec n = -1 \qquad (3.3)$$

and for right particles $\Psi_R$

$$\vec\sigma\vec n = +1 \qquad (3.4)$$
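The helicity statements (3.3) and (3.4) can be checked numerically. The following is only an illustrative sketch in plain Python: the helper names `sigma_dot_n`, `apply`, `helicity_parts` and `helicity` are hypothetical, and the momentum and 2-spinor used in the example are arbitrary test values, not taken from the text.

```python
import math

def sigma_dot_n(p):
    """2x2 matrix sigma.n for the unit vector n = p/|p| (nested lists)."""
    norm = math.sqrt(sum(x * x for x in p))
    nx, ny, nz = (x / norm for x in p)
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def apply(matrix, spinor):
    """Multiply a 2x2 matrix by a 2-component spinor."""
    return [sum(matrix[i][j] * spinor[j] for j in range(2)) for i in range(2)]

def helicity_parts(p, phi):
    """Return ((1 - sigma.n) phi, (1 + sigma.n) phi) for an arbitrary phi."""
    sn_phi = apply(sigma_dot_n(p), phi)
    left = [phi[i] - sn_phi[i] for i in range(2)]    # enters Psi_L, eq. (3.2)
    right = [phi[i] + sn_phi[i] for i in range(2)]   # enters Psi_R
    return left, right

def helicity(p, spinor):
    """Eigenvalue of sigma.n on `spinor`, assuming it is an eigenvector."""
    image = apply(sigma_dot_n(p), spinor)
    k = 0 if abs(spinor[0]) >= abs(spinor[1]) else 1  # avoid dividing by ~0
    return (image[k] / spinor[k]).real

if __name__ == "__main__":
    p = [1.0, 2.0, -0.5]            # arbitrary momentum direction
    phi = [0.3 + 0.1j, -0.7j]       # arbitrary 2-spinor
    left, right = helicity_parts(p, phi)
    print(helicity(p, left), helicity(p, right))
```

The check works because $(\vec\sigma\vec n)^2 = 1$, so $(1 \mp \vec\sigma\vec n)$ project onto the eigenvalues $\mp 1$ for any choice of $\varphi$.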


Left leptons and quarks group into $SU(2)_L$ doublets. For the first generation they are

$$L = \begin{pmatrix}\nu_e \\ e\end{pmatrix}_L , \qquad Q_L = \begin{pmatrix} u \\ d\end{pmatrix}_L \qquad (3.5)$$

To avoid a V + A charged current we have to put right fermions into the singlet representation. Thus $e_R$, $u_R$ and $d_R$ are singlets. As for the right-handed neutrino $\nu_R$, nobody has observed it so far. It is unknown whether such a field exists. For now we prefer not to introduce $\nu_R$ into the theory.

To include the electromagnetic interactions we have to define the charge. For left-handed fermions the charge is different for the up and down components. Thus

$$Q_L = T_3 + Y_L \qquad (3.6)$$

where $T_3$ is the third component of $SU(2)_L$ isospin and $Y_L$ is the hypercharge (for the leptonic doublet $Y_L = -1/2$ and for the quark doublet $Y_Q = +1/6$).

For right fermions we identify $Y_R$ and $Q$:

$$Q_R = Y_R \qquad (3.7)$$

so that $Y_{u_R} = \tfrac{2}{3}$; $Y_{d_R} = -\tfrac{1}{3}$; $Y_{e_R} = -1$.
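As a quick consistency check of eqs. (3.6) and (3.7), the sketch below computes the electric charges from $Q = T_3 + Y$ with exact fractions. The dictionary layout and the function name `electric_charges` are illustrative assumptions; the hypercharge values are those quoted above, with $Y = +1/6$ for the quark doublet.

```python
from fractions import Fraction as F

# Hypercharges of the left-handed doublets: (Y, (upper field, lower field))
LEFT_DOUBLETS = {
    "lepton": (F(-1, 2), ("nu_L", "e_L")),
    "quark": (F(1, 6), ("u_L", "d_L")),
}

# For right-handed singlets T_3 = 0, so Q = Y, eq. (3.7)
RIGHT_SINGLETS = {"e_R": F(-1), "u_R": F(2, 3), "d_R": F(-1, 3)}

def electric_charges():
    """Return a dict field -> electric charge using Q = T_3 + Y."""
    charges = {}
    for hypercharge, (upper, lower) in LEFT_DOUBLETS.values():
        charges[upper] = F(1, 2) + hypercharge    # T_3 = +1/2
        charges[lower] = F(-1, 2) + hypercharge   # T_3 = -1/2
    charges.update(RIGHT_SINGLETS)
    return charges

if __name__ == "__main__":
    for name, q in electric_charges().items():
        print(name, q)
```

With these assignments the left and right components of each fermion carry the same electric charge, which is what allows them to be combined later into one massive particle.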



QFT and Standard Model 




The minimal way to introduce $U(1)$ interactions is to consider a gauge boson that interacts with the hypercharge $Y$

$$Q = T_3 + Y \qquad (3.8)$$

This gives the gauge group of the Minimal Standard Model

$$SU(2)_L \times U(1)_Y . \qquad (3.9)$$

Let $A^i_\mu(x)$, $i = 1, 2, 3$ be the gauge bosons of $SU(2)_L$, and $B_\mu(x)$ the gauge boson of the $U(1)$ group. The charged fields

$$W^\pm_\mu = \frac{1}{\sqrt2}\left(A^1_\mu \pm i A^2_\mu\right) \qquad (3.10)$$

can be identified with the $W^\pm$ bosons.

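The photon $A_\mu(x)$ in general is a combination of $A^3_\mu$ and $B_\mu$. The orthogonal combination represents another physical state, i.e. the Z boson. Thus

$$\begin{pmatrix} A_\mu \\ Z_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta_W & \sin\theta_W \\ -\sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} B_\mu \\ A^3_\mu \end{pmatrix} \qquad (3.11)$$

where $\theta_W$ is the weak mixing angle.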

To break spontaneously the $SU(2)_L \times U(1)_Y$ group and to give masses to the $W^\pm$ and Z bosons we need three Goldstone fields. The SU(2) doublet of Higgs particles

$$H = \begin{pmatrix} H^+ \\ H^0 \end{pmatrix}, \qquad Y_H = \tfrac{1}{2} \qquad (3.12)$$

can provide three Goldstone bosons after spontaneous breaking. In the MSM we use only one Higgs doublet.

We have completed the construction of the MSM. Now we are ready to determine the masses of the vector bosons $m_W$, $m_Z$ and the phenomenological mixing angle $\theta_W$ in terms of the SU(2) coupling constant $g_2$, the U(1) coupling constant $g_1$ and the v.e.v. of the Higgs field $\eta$.

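In the unitary gauge the Higgs doublet has the form

$$H(x) = \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt2}\left(\eta + \rho(x)\right) \end{pmatrix} \qquad (3.13)$$

The covariant derivative

$$D_\mu = \partial_\mu - i g_1 Y B_\mu(x) - i g_2 T^i A^i_\mu(x) \qquad (3.14)$$

for the vacuum field $H_{vac}$ is equal to

$$D_\mu H_{vac} = \left(-i g_1 \tfrac{1}{2} B_\mu - i g_2 T^i A^i_\mu\right)\begin{pmatrix} 0 \\ \tfrac{\eta}{\sqrt2} \end{pmatrix} = -\frac{i\eta}{2\sqrt2}\begin{pmatrix} \sqrt2\, g_2 W^-_\mu \\ -g_2 A^3_\mu + g_1 B_\mu \end{pmatrix} \qquad (3.15)$$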


The mass term for vector fields originates from the $(D_\mu H)^\dagger D_\mu H$ term in the Lagrangian. It looks like

$$\tfrac{1}{4}(g_2\eta)^2\, W^+_\mu W^-_\mu + \tfrac{1}{8}\eta^2\left(g_2 A^3_\mu - g_1 B_\mu\right)^2 \qquad (3.16)$$

From this expression we conclude that the massive combination of $A^3_\mu$ and $B_\mu$ (i.e. the Z-boson) is

$$Z_\mu = \frac{1}{\sqrt{g_1^2 + g_2^2}}\left(g_2 A^3_\mu - g_1 B_\mu\right) \qquad (3.17)$$





or that

$$\tan\theta_W = g_1/g_2 . \qquad (3.18)$$

From eq. (3.16) it follows that

$$m_W = \tfrac{1}{2} g_2 \eta \qquad (3.19)$$

and

$$m_W = m_Z \cos\theta_W \qquad (3.20)$$


It is very interesting that the Z boson should be heavier than the W boson! After spontaneous breaking there still remains an unbroken $U(1)$ symmetry that corresponds to the massless photon.
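The statements about the massless photon and the W/Z mass ratio follow from diagonalizing the quadratic form in the neutral fields. The sketch below is an illustration in plain Python under stated assumptions: the function names are hypothetical, and the inputs $g_1 = 0.34$, $g_2 = 0.66$, $\eta = 246$ GeV are the rounded tree-level values quoted later in this section.

```python
import math

def neutral_mass_matrix(g1, g2, eta):
    """Mass-squared matrix in the basis (A^3, B).

    It is read off from the term (eta^2/8) (g2*A3 - g1*B)^2 written as
    (1/2) V^T M2 V, hence the overall factor eta^2/4.
    """
    k = eta ** 2 / 4.0
    return [[k * g2 * g2, -k * g1 * g2], [-k * g1 * g2, k * g1 * g1]]

def eigenvalues_sym2(m):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr - disc) / 2.0, (tr + disc) / 2.0

def boson_masses(g1, g2, eta):
    """Tree-level photon mass squared, m_W, m_Z and the mixing angle."""
    m2_photon, m2_z = eigenvalues_sym2(neutral_mass_matrix(g1, g2, eta))
    return {
        "m2_photon": m2_photon,            # vanishes: unbroken U(1)
        "m_w": 0.5 * g2 * eta,             # m_W = g2*eta/2
        "m_z": math.sqrt(m2_z),
        "theta_w": math.atan2(g1, g2),     # tan(theta_W) = g1/g2
    }

if __name__ == "__main__":
    print(boson_masses(0.34, 0.66, 246.0))
```

The zero eigenvalue is exact because the matrix has vanishing determinant: only the combination $g_2 A^3_\mu - g_1 B_\mu$ enters the mass term.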

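If we introduce the electric charge $e$ as the coupling constant of the photon we can relate $g_{1,2}$ with $e$ and $\cos\theta_W$. Let us rewrite the interaction of $A^3_\mu$ and $B_\mu$ as an interaction of the $A_\mu$ and $Z_\mu$ fields:

$$(-i g_2 T_3) A^3_\mu - i g_1 Y B_\mu = (-i)\frac{g_2}{\cos\theta_W}\left[T_3 - \sin^2\theta_W\, Q\right] Z_\mu + (-i)(g_1\cos\theta_W)\, Q A_\mu \qquad (3.21)$$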

This is an identical rewriting of the universal expression for the covariant derivative. So eq. (3.21) is applicable to the left and right fermions and to the Higgs doublet.

From eq. (3.21) it follows immediately that

$$e = g_1\cos\theta_W = g_2\sin\theta_W \qquad (3.22)$$

This completes the description of the bosonic sector of the SM.

3.2 Weak interactions of leptons and quarks: Neutral Current. Request for new particles.

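Now we are ready to calculate the amplitude for the first physical process, the decay $\mu \to e\nu\bar\nu$. The charged-current Lagrangian for leptons looks like

$$\mathcal{L}_{charged} = \frac{g_2}{2\sqrt2}\, W_\alpha\left[\bar\nu_e\gamma_\alpha(1+\gamma_5)e + \ldots\right] \qquad (3.23)$$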

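where the dots stand for the similar terms with $\mu$ and $\tau$ leptons. The Feynman diagram for the $\mu$-decay is

[Feynman diagram: $\mu$-decay through the exchange of a virtual W boson]

The amplitude for the decay can be read from this diagram and it is equal to

$$T(\mu \to e\nu\bar\nu) = \frac{g_2^2}{8}\,\frac{1}{m_W^2 - q^2}\,\left(\bar\nu_\mu\gamma_\alpha(1+\gamma_5)\mu\right)\left(\bar e\gamma_\alpha(1+\gamma_5)\nu_e\right) \qquad (3.24)$$

The momentum transfer $q$ from the muonic current to the electronic current is of the order of the muon mass $m_\mu$. So if $m_W \gg m_\mu$ the amplitude looks like a point-like interaction in Fermi theory.

$$T_{Fermi} = \frac{G_F}{\sqrt2}\left(\bar\nu_\mu\gamma_\alpha(1+\gamma_5)\mu\right)\left(\bar e\gamma_\alpha(1+\gamma_5)\nu_e\right) \qquad (3.25)$$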

Comparing these two representations of the same amplitude we conclude that

$$\frac{G_F}{\sqrt2} = \frac{g_2^2}{8 m_W^2} \qquad (3.26)$$





Taking into account eq. (3.19) for $m_W$ we also get that the v.e.v. $\eta$ is directly connected with $G_F$:

$$\eta = \left[\sqrt2\, G_F\right]^{-1/2} = 246\ \mathrm{GeV} \qquad (3.27)$$

$$G_F = G_\mu = 1.16639(2)\cdot 10^{-5}\ \mathrm{GeV}^{-2}$$

To fix the remaining two fundamental parameters $g_1$ and $g_2$ we have to choose two other physical observables measured with the best accuracy. The choice is evident. They are the fine structure constant $\alpha$

$$\alpha^{-1} = \frac{4\pi}{e^2} = 137.035985(61) \qquad (3.28)$$

and the mass of the Z-boson

$$m_Z = 91.187(2)\ \mathrm{GeV} \qquad (3.29)$$

To calculate $g_1$ and $g_2$ we first have to calculate the mixing angle $\theta_W$ in terms of $G_F$, $\alpha$ and $m_Z$. It is not a difficult exercise to show that

$$\sin^2\theta_W\cos^2\theta_W = \frac{\pi\alpha}{\sqrt2\, G_F m_Z^2} \qquad (3.30)$$

Substituting the values of the parameters from eqs. (3.27), (3.28) and (3.29) into eq. (3.30) we get

$$\sin^2\theta_W = 0.2120$$

$$g_1 = \frac{\sqrt{4\pi\alpha}}{\cos\theta_W} = 0.34, \qquad g_2 = \frac{\sqrt{4\pi\alpha}}{\sin\theta_W} = 0.66 \qquad (3.31)$$

So we are ready for the first prediction in the SM: we can calculate $m_W$


$$(m_W)^{theor} = m_Z\cos\theta_W = 80.94\ \mathrm{GeV} \qquad (3.32)$$

that has to be compared with the current experimental value

$$(m_W)^{exp} = 80.37(8)\ \mathrm{GeV} \qquad (3.33)$$

The deviation from the theoretical number is only 0.6%, but this tiny number is equal to an $8\sigma$ deviation. To explain this huge discrepancy we have to take into account radiative corrections, which have the scale of a few per mill.
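The tree-level numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name `tree_level_parameters` is hypothetical, the inputs are the central values of $G_F$, $\alpha^{-1}$ and $m_Z$ quoted in the text, and the smaller root of eq. (3.30) is chosen for $\sin^2\theta_W$.

```python
import math

G_F = 1.16639e-5        # Fermi constant, GeV^-2 (value quoted in the text)
ALPHA_INV = 137.035985  # inverse fine structure constant, eq. (3.28)
M_Z = 91.187            # Z boson mass, GeV, eq. (3.29)

def tree_level_parameters(g_f=G_F, alpha_inv=ALPHA_INV, m_z=M_Z):
    """Solve eq. (3.30) and derive eta, g1, g2 and the tree-level m_W."""
    alpha = 1.0 / alpha_inv
    # right-hand side of (3.30): sin^2(theta) * cos^2(theta)
    x = math.pi * alpha / (math.sqrt(2.0) * g_f * m_z ** 2)
    sin2 = 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * x))   # root with sin^2 < 1/2
    cos_w = math.sqrt(1.0 - sin2)
    e = math.sqrt(4.0 * math.pi * alpha)
    return {
        "eta": (math.sqrt(2.0) * g_f) ** -0.5,  # eq. (3.27)
        "sin2_theta_w": sin2,
        "g1": e / cos_w,                        # eq. (3.31)
        "g2": e / math.sqrt(sin2),
        "m_w": m_z * cos_w,                     # eq. (3.32)
    }

if __name__ == "__main__":
    for key, value in tree_level_parameters().items():
        print(f"{key:14s} {value:.4f}")
```

The output agrees with the quoted values of $\eta$, $g_1$, $g_2$ and $(m_W)^{theor}$ to the precision given in the text; the gap to the measured $m_W$ is what the loop corrections discussed below have to account for.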

The old 4-fermion point-like theory is the effective theory for momentum transfers much smaller than $m_W$. In this sense the SM is a generalization of the old theory. But the SM also predicts new phenomena that were unknown in the V-A theory. These are the neutral currents.

The effective 4-fermionic coupling of neutral currents is generated by Z boson exchange. 



At small momentum transfer it is a local interaction with the coupling constant equal to $G_F\cos^2\theta_W$. Though this coupling is of the same order as $G_F$, for some reasons the experimental search for neutral currents gave negative results for a long time, and only in 1973 did experimental groups at CERN observe neutral currents and provide the first experimental measurements of $\cos\theta_W$. This measurement gave the possibility to calculate $m_W$ and $m_Z$ theoretically (eqs. (3.19), (3.20)) with rather good accuracy. This estimate was extremely helpful for the experimental discovery of the W and Z bosons.

Another great achievement of the SM was the request for new particles needed for self-consistency of the theory. In 1970 the set of known particles included 2 generations of leptons

$$\begin{pmatrix}\nu_e \\ e\end{pmatrix}_L , \quad \begin{pmatrix}\nu_\mu \\ \mu\end{pmatrix}_L , \quad e_R ,\ \mu_R \qquad (3.34)$$

and three quarks u, d and s that belong to the following $SU(2)_L \times U(1)_Y$ representation

$$\begin{pmatrix} u \\ d' \end{pmatrix}_L , \quad u_R ,\ d_R ,\ s_R \qquad (3.35)$$

$$d' = d\cos\theta_c + s\sin\theta_c$$

where $\theta_c$ is the Cabibbo angle. This set of quarks produces flavor-changing $s \leftrightarrow d$ neutral currents

$$Z_\mu J^{NC}_\mu \sim Z_\mu\left[\ldots + \sin\theta_c\cos\theta_c\,[\bar d s + \bar s d]\right] \qquad (3.36)$$

This was absolutely forbidden by experimental data. To save the SM, Glashow, Iliopoulos and Maiani in 1970 introduced the fourth quark, c, and the new $SU(2)_L$ doublet


$$\begin{pmatrix} c \\ -d\sin\theta_c + s\cos\theta_c \end{pmatrix}_L$$

As a result flavor-changing neutral currents disappear and all neutral currents become diagonal. This theoretical request for a new particle was satisfied by the experimental discovery of the c-quark in 1974. The second generation had been completed.
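The cancellation of the flavor-changing term can be made explicit by writing the down-type neutral current as a matrix in the $(d, s)$ flavor basis. The sketch below keeps only the flavor structure (Dirac matrices and couplings are dropped); the function name and the sample value of the Cabibbo angle, about 0.227 rad, are assumptions made for illustration.

```python
import math

def neutral_current_matrix(theta_c, with_charm):
    """Flavor matrix M such that the current is sum_ij M[i][j] qbar_i q_j
    for q = (d, s). Each left doublet contributes |lower><lower|."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    d_prime = (c, s)        # d' = d cos(theta_c) + s sin(theta_c)
    s_prime = (-s, c)       # s' = -d sin(theta_c) + s cos(theta_c)
    lowers = [d_prime] + ([s_prime] if with_charm else [])
    return [[sum(v[i] * v[j] for v in lowers) for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    theta = 0.227  # assumed sample value of the Cabibbo angle, radians
    print("three quarks:", neutral_current_matrix(theta, with_charm=False))
    print("with charm:  ", neutral_current_matrix(theta, with_charm=True))
```

With only the $(u, d')$ doublet the off-diagonal entry is $\sin\theta_c\cos\theta_c$, the term shown in eq. (3.36); adding the $(c, s')$ doublet turns the matrix into the unit matrix, because $d'$ and $s'$ are related to $d$ and $s$ by an orthogonal rotation.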

The $\tau$-lepton and the beauty b-quark were discovered experimentally in the mid-1970s. The heaviest particle - the top t-quark - was discovered only two decades later. They compose the third generation of matter. The third generation is absolutely necessary to describe the experimental data with high accuracy, i.e. in the loop approximation. Moreover, to fit the experimental data, the top quark mass should be equal to

$$m_t = 180(5)^{+\ldots}_{-\ldots}\ \mathrm{GeV} \qquad (3.37)$$

That is very close to the actual experimental value of the top mass.


3.3 Quark masses. 

In the Standard Model the mass term for the electron violates $SU(2)_L$. Indeed this term

$$m_e\bar e e = m_e\left[\bar e_L e_R + \bar e_R e_L\right] \qquad (3.38)$$

transforms like a doublet instead of being invariant.

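To preserve the $SU(2)_L \times U(1)_Y$ symmetry we have to use the Higgs mechanism to generate the masses of fermions. For example the Yukawa coupling of $L$, $e_R$ and $H$ is $SU(2) \times U(1)_Y$ invariant

$$\Delta\mathcal{L} = f_e(\bar L e_R)H + h.c. = \frac{f_e}{\sqrt2}\left(\eta + \rho(x)\right)\bar e e = m_e\bar e e + \frac{f_e}{\sqrt2}\rho(x)\bar e e \qquad (3.39)$$

where $\rho(x)$ is the field of the physical Higgs in the SM. From eq. (3.39) it follows that the Yukawa coupling is proportional to $m_e$

$$f_e = \sqrt2\, m_e/\eta \qquad (3.40)$$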





Notice that before this step the fields $e_L(x)$ and $e_R(x)$ were absolutely different, i.e. they had different interactions with W and Z. The Yukawa interaction unified these two Weyl spinors into one massive particle - the electron. To give mass to the down quarks we can use the same type of Yukawa interaction

$$\Delta\mathcal{L}_{m_d} = f_d(\bar Q_L d_R)H \qquad (3.41)$$

As for the mass of the up quarks we need a Higgs doublet with nonzero v.e.v. for the up component of the doublet. At this point we could introduce a new Higgs doublet. But in the case of the SU(2) group the complex conjugated fields

$$\tilde H = i\tau_2 H^* \qquad (3.42)$$

also behave like a member of an SU(2) doublet. So we can use $\tilde H$ to give mass to the upper quark

$$\Delta\mathcal{L} = f_d(\bar Q_L d_R)H + f_u(\bar Q_L u_R)\tilde H \qquad (3.43)$$

This is the solution of problem of fermion mass in the case of one generation. 

For more than one generation we have to take into account quark mixing. For two generations 
this mixing can be described by one rotation angle (Cabibbo angle). For three generations one gets 
three independent rotations and one complex phase. Complex couplings cause violation of CP-invariance. Thus in the theory with three generations we get a mechanism for CP-breaking. Till now it remains unknown whether this mechanism is the only source of CP-violation in Nature.

3.4 Triangle Anomaly. 

To have renormalizable theory of electroweak interactions it was absolutely crucial to start from the 
gauge invariant theory where gauge bosons interact with conserved Noether currents. Spontaneous 
breaking of symmetry does not spoil any symmetric relations between operators. They are exactly 
the same as in the original theory. The confusing notion of spontaneous breaking describes the 
nonlinear realization of the symmetry in the space of physical states. 

In the SM we operate with both vector and axial currents. For any axial current

$$\partial_\mu j^5_\mu = 2im\,\bar\Psi\gamma_5\Psi \qquad (3.44)$$

Thus naively the axial current is conserved for massless fermions. But what is true in Classical Field Theory may not be true in Quantum Field Theory. Indeed a one-loop calculation of the divergence of the axial current for electrons gives

$$\partial_\mu j^5_\mu = 2im\,\bar\Psi_e\gamma_5\Psi_e + \frac{\alpha}{2\pi}F_{\mu\nu}\tilde F_{\mu\nu} \neq 0 \qquad (3.45)$$

instead of eq. (3.44). The term $F\tilde F$ originates from the matrix element of $\partial_\mu j^5_\mu$ between the vacuum and two-photon states.

So the axial current is not conserved even for m = 0. Not every classical symmetry survives in the quantum theory. This very interesting theoretical phenomenon has a special name - the triangle anomaly.

In the SM there are plenty of axial currents that interact with gauge fields. Though fermions are 
massless (no mass terms in the Lagrangian) the anomaly can destroy the conservation of Noether 
currents and this will kill renormalizability. There is one possibility to save it. We see from eq. 
(3.45) that the anomaly depends only on the "charge" of the particle that is running inside the loop. So if
the total gauge current has different pieces it can happen that nonzero individual anomalies cancel 
each other for the total current. 





This cancellation imposes some restrictions on the charges of quarks and leptons. Let us check this possibility. We will calculate the triangle matrix elements between the fields $A^i_\mu$ and $B_\mu$. There are two crossing diagrams that contribute to the anomalous interaction between 3 gauge fields.



Consider first the anomalous contribution of one generation of matter. Rather simple calculations demonstrate the following result.

1) The (A, A, A) and (A, B, B) anomalies vanish automatically for the lepton doublet and for the quark doublet separately.

2) The (B, A, A) anomaly vanishes if

$$Q_\nu + Q_e + 3\,(Q_u + Q_d) = 0 , \qquad (3.46)$$

i.e. the quark contribution cancels the lepton contribution only for this special relation between charges. This relation means that the hydrogen atom has to be neutral!

It is very interesting that renormalizability of the SM takes place only if the charge of the proton is opposite to the charge of the electron.

We can proceed further and consider other anomalies. At this point we have to make some statement about $\nu_R$. Suppose first that it does not exist at all. In this case:

3) Cancellation of the (B, B, B) anomaly takes place only if

$$Q_e = -1 , \quad Q_\nu = 0 ; \qquad Q_u = \tfrac{2}{3} , \quad Q_d = -\tfrac{1}{3} \qquad (3.47)$$

(We suppose that QCD has $SU(3)_c$ symmetry.)

4) Cancellation of the (B $\to$ gluon + gluon) anomaly is automatic.

5) Cancellation of the (B $\to$ graviton + graviton) anomaly takes place only for the charge assignment of eq. (3.47). So we are able to fix the relative charges of leptons and quarks in this case.

If $\nu_R$ does exist, anomalies 3) - 5) vanish automatically for any charge of the neutrino.

If we suppose that the new generations are exact replicas of the old one (only the masses are different, but the charges are the same) then we come to the same conclusion for each generation. If we allow the charges to change from one generation to another, the restrictions on the quark and lepton charges become weaker.

In any case it is very interesting that renormalizability imposes restrictions on the properties of matter fields.
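The cancellation conditions listed above can be verified with exact arithmetic. The following sketch is an illustration, not the calculation of the text: it assumes one generation without $\nu_R$, three colors, hypercharges normalized by $Q = T_3 + Y$, and right-handed hypercharges fixed by $Q_R = Y_R$; the function name and the dictionary keys are hypothetical. Each coefficient is given only up to an overall group-theory factor, which does not affect whether it vanishes.

```python
from fractions import Fraction as F

N_COLORS = 3

def anomaly_coefficients(y_lepton, y_quark):
    """Anomaly coefficients (up to overall factors) for one generation.

    y_lepton, y_quark: hypercharges of the left-handed doublets.
    Right-handed singlets get Y = Q of the corresponding doublet member.
    """
    y_e_r = y_lepton - F(1, 2)
    y_u_r = y_quark + F(1, 2)
    y_d_r = y_quark - F(1, 2)
    # (hypercharge, number of states) for left and right fields
    left = [(y_lepton, 2), (y_quark, 2 * N_COLORS)]
    right = [(y_e_r, 1), (y_u_r, N_COLORS), (y_d_r, N_COLORS)]

    def chiral_sum(power):
        """Sum over left fields minus sum over right fields of Y^power."""
        return (sum(n * y ** power for y, n in left)
                - sum(n * y ** power for y, n in right))

    return {
        "BAA": y_lepton + N_COLORS * y_quark,        # only doublets couple to A
        "BBB": chiral_sum(3),
        "B-gluon-gluon": 2 * y_quark - y_u_r - y_d_r,  # quarks only
        "B-graviton-graviton": chiral_sum(1),
    }

if __name__ == "__main__":
    print(anomaly_coefficients(F(-1, 2), F(1, 6)))   # Standard Model values
    print(anomaly_coefficients(F(-1), F(1, 3)))      # a shifted assignment
```

For the Standard Model hypercharges all four coefficients vanish. The second call uses an assignment that still satisfies the (B, A, A) condition but fails the others, illustrating how the remaining anomalies fix the charges, while the gluon coefficient stays zero for any input.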

4 Loops in SM - the Window into New Physics. 

4.1 Z-physics at LEP and SLC. 

To test the predictions of the SM, huge "factories" of Z-bosons ($e^+e^-$ colliders) were constructed at CERN (LEP I) and at SLAC (SLC). Electrons and positrons in these colliders collide at a center-of-mass energy equal to the Z-boson mass. The reactions that are studied can be presented in the form

$$e^+e^- \to Z \to f\bar f$$

where

$$f\bar f = \begin{cases} \nu\bar\nu & \text{invisible modes} \\ l\bar l & \text{charged leptons} \\ q\bar q & \text{hadrons.} \end{cases}$$

About a dozen independent observables were measured with the fantastic precision of the order of $10^{-3}$ ($10^{-5}$ for the case of the Z-boson mass). The scale of the radiative corrections in the SM is of the order of the weak coupling constants: $\alpha_{W,Z}/\pi \sim 10^{-2} - 10^{-3}$. Therefore LEP-I and SLD data provide a precision test of the SM as a renormalizable field theory, i.e. with loops included.

The theoretical study of electroweak corrections in the SM started in the 1970s and was elaborated by a number of theoretical groups. The differences between the theoretical calculations of different groups are an order of magnitude smaller than the experimental uncertainties.

By comparing $\Gamma_{invisible}$ with theoretical predictions for decays into neutrinos a result of fundamental importance was established - the sequence of generations with a light neutrino is completed, with the number of generations

$$N_\nu = 3.$$

The fit of the most recent experimental data (summer 1999) has very good quality:

$$\chi^2/\mathrm{n.d.f.} = 15.0/14.$$

We conclude that the SM gives the perfect description of Z physics. New physics can not improve 
the fit of LEP and SLC data. Thus the Standard Model has been confirmed up to the loop 
corrections. 

What is more important is that the loop corrections can be used to gather data on the not yet 
discovered particle. For instance, even before t-quark was discovered at Tevatron, its mass was 
predicted by analyzing the loops and LEP-SLC data. The hunting for virtual top quark is a very 
bright example of the collaboration of the theory and the experiment. 

4.2 Decoupling of heavy flavors from Low-Energy Physics in QED and QCD.

It is interesting to understand why in the 1950s nobody worried about the contribution of the top quark (and other heavy flavors) to the magnetic moment of the electron, known with very high accuracy. The answer to this question is that for momenta $q \sim m_e$ the corrections due to the top quark are suppressed as a power of $(m_e^2/m_t^2)$, i.e. the contribution was negligible. In QED any heavy particles decouple from the low-energy observables.

Consider the contribution of the t-quark to QED observables. The only diagram with t-quarks in the loop is the self-energy of the photon

$$\Pi_{\mu\nu}(q) = i\langle 0|\{j_\mu(q)\, j_\nu(-q)\}|0\rangle \qquad (4.1)$$

where $j_\mu(q)$ is the electromagnetic current of the t-quark. The self-energy has dimension 2: $[\Pi_{\mu\nu}] = m^2$. So one can expect that there exist terms of the order of $\alpha m_t^2$.

This expectation is wrong in the case of conserved currents

$$q_\mu j_\mu(q) = 0 \qquad (4.2)$$

Indeed for a conserved current the self-energy operator should be transversal, $q_\mu\Pi_{\mu\nu} = 0$. Thus

$$\Pi_{\mu\nu}(q) = \left(g_{\mu\nu}q^2 - q_\mu q_\nu\right)\Pi(q^2) \qquad (4.3)$$

Equation (4.3) implies that the photon remains massless. The scalar function $\Pi(q^2)$ has dimension zero and the only possible contribution of the t-quark to $\Pi(q^2)$ can be written in the form





$$\Pi(q^2) \sim \alpha\ln\frac{\Lambda^2}{m_t^2 + q^2}$$

where $\Lambda$ is the cut-off. The self-energy keeps the memory of heavy flavors!

The crucial step is renormalization. Consider the example of Coulomb scattering from Chapter I more carefully. If we take into account the infinite chain of self-energy contributions to the photon propagator we get for the amplitude

$$T_{Coulomb} = \frac{e_0^2(\Lambda)}{q^2\left(1 + \Pi(q^2)\right)} \qquad (4.4)$$

At low $q^2$ we reproduce the Coulomb law

$$T = \frac{e^2_{phys}}{q^2} \qquad (4.5)$$

with

$$e^2_{phys} = \frac{e_0^2(\Lambda)}{1 + \Pi(0)} \qquad (4.6)$$

When we rewrite the amplitude (4.4) in terms of $e_{phys}$ we get

$$T = \frac{e^2_{phys}}{q^2\left[1 + \Pi(q^2) - \Pi(0)\right]} \qquad (4.7)$$

As a result:

1) the dependence on the cut-off $\Lambda$ disappears

$$\Delta\Pi = \Pi(q^2) - \Pi(0) \sim \alpha\ln\frac{m_t^2}{m_t^2 + q^2}$$

2) the contribution of the heavy flavor is suppressed as a power of $(q^2/m_t^2)$

$$\Delta\Pi \sim -\alpha\frac{q^2}{m_t^2} \to 0.$$

This is the so-called decoupling theorem. It works for theories with conserved vector currents.
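The two statements of the decoupling argument can be illustrated numerically with the schematic logarithm used above. The sketch below is a toy model: the coefficient of the logarithm is set to exactly $\alpha$, and the function names, the heavy mass of 175 GeV and the probe momentum of 10 GeV are illustrative assumptions.

```python
import math

ALPHA = 1.0 / 137.036

def pi_bare(q2, m2, cutoff2, alpha=ALPHA):
    """Schematic heavy-flavor self-energy: alpha * ln(Lambda^2 / (m^2 + q^2))."""
    return alpha * math.log(cutoff2 / (m2 + q2))

def delta_pi(q2, m2, cutoff2, alpha=ALPHA):
    """Renormalized combination Pi(q^2) - Pi(0) entering eq. (4.7)."""
    return pi_bare(q2, m2, cutoff2, alpha) - pi_bare(0.0, m2, cutoff2, alpha)

if __name__ == "__main__":
    m2 = 175.0 ** 2      # heavy quark mass squared, GeV^2 (assumed value)
    q2 = 10.0 ** 2       # momentum transfer squared, GeV^2 (assumed value)
    for cutoff in (1e4, 1e10, 1e19):
        print(f"Lambda = {cutoff:.0e} GeV: delta Pi = {delta_pi(q2, m2, cutoff ** 2):.6e}")
    print("leading power estimate:", -ALPHA * q2 / m2)
```

The printed values of $\Delta\Pi$ coincide for all cut-offs and are close to $-\alpha q^2/m_t^2$, while `pi_bare` itself grows logarithmically with $\Lambda$.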


4.3 Non-decoupling of chiral matter: Heavy Flavor contribution into electroweak observables.

In the Standard Model the left components of the t- and b-quarks belong to an $SU(2)_W$ doublet representation: $Q_L = \begin{pmatrix} t \\ b \end{pmatrix}_L$. Therefore for the case when $m_t \gg m_b$ and for small energies $E < m_t$ we have effectively an explicit violation of $SU(2)_W$ symmetry. For virtual momenta $\sim \Lambda \sim m_t$ the theory looks like the old non-renormalizable theory. It means that one-loop corrections diverge quadratically, $\delta_1 \sim \alpha_W\Lambda^2/m_Z^2$, and two-loop corrections diverge quartically, $\delta_2 \sim \alpha_W^2\Lambda^4/m_Z^4$. So we expect that the corrections to the low-energy observables due to the top contribution are of the order of

$$\delta_1 \sim \alpha_W t , \qquad \delta_2 \sim \alpha_W^2 t^2$$

where $t = m_t^2/m_Z^2$, i.e. the corrections are not suppressed, they grow with the top mass $m_t$. Heavy flavors are not decoupled from the low-energy observables for chiral matter. As a result the radiative corrections in the SM are sensitive to the top contribution. Hunting for the virtual top was very successful.

A comparison of the experimental data with the result of the theoretical calculation led to the prediction of the t-quark mass $m_t$:

$$m_t = 180(5)^{+\ldots}_{-\ldots}\ \mathrm{GeV},$$

where the number in parentheses is the uncertainty due to the uncertainties of the data. The central value corresponds to the assumption that $m_H = 300$ GeV, the upper and lower "shifts" correspond to $m_H = 1000$ GeV and 60 GeV, respectively.

The best fit of all observables gives

$$\langle m_t\rangle = 170.6 \pm 4.9\ \mathrm{GeV}$$

These numbers are in perfect agreement with the recent direct measurement of the top-quark mass by two collaborations at FNAL

$$m_t = 173.8 \pm 5.0\ \mathrm{GeV}.$$

The same strategy works for heavy unknown particles. Direct accelerator searches did not find 
yet any trace of New Physics. But loops with hypothetical new particles change the predictions 
of the SM. Thus the indirect way to get information on New Physics is to study the low-energy 
processes with very high accuracy. 

The simplest extension of the SM is the SM with new sequential generations of heavy quarks 
and leptons. Chiral matter does not decouple from low-energy observables even when the masses 
of particles become very large. If the masses of up and down components are very different we get 
effectively the violation of SU{2) symmetry and the corrections are large. They are similar to the 
top quark contribution. If the masses are heavy but degenerate the contribution into observables 
does not grow with mass but remains finite and nonzero. Accurate analysis demonstrates that one 
extra generation is already excluded by the current experimental data! 


4.4 Hunting for virtual Higgs. 

The direct search for the Higgs at accelerators still gives a negative result. The recent experimental lower bound for the Higgs mass is near 90 GeV. In this situation it seems reasonable to look for the virtual Higgs in loops.

Consider the limit of very large Higgs boson mass $m_H$. For $E < m_H$ we have an SU(2) symmetric theory of massive gauge bosons, i.e. effectively a non-renormalizable theory. Due to the gauge symmetry the leading divergence of the loop disappears. So the one-loop corrections diverge logarithmically

$$\delta_1 \sim \alpha_W\ln\frac{\Lambda^2}{m_Z^2} \sim \alpha_W\ln\frac{m_H^2}{m_Z^2} = \alpha_W\ln h$$

two-loop corrections diverge quadratically

$$\delta_2 \sim \alpha_W^2\frac{\Lambda^2}{m_Z^2} \sim \alpha_W^2 h$$

Here $h = m_H^2/m_Z^2$.

This is the famous Veltman screening theorem. The weak dependence of radiative corrections on $h$ results in a rather poor accuracy for $m_H$ derived from the precision data. The central value of $m_H$ from the fit is very unstable. Any tiny correction or any change of the parameters can shift it by an order of magnitude. The one sigma upper bound is more reliable. According to the recent fit

$$m_H = 70.8^{+\ldots}_{-\ldots}\ \mathrm{GeV}.$$

It seems that the fit of the precision data is not the best place to hunt for the Higgs boson.





4.5 Effective potential. Stability of the Universe and Bounds on $m_H$.

The Higgs potential in the SM

$$V_0(H) = -\tfrac{1}{2}\mu^2 H^2 + \tfrac{1}{4}\lambda H^4 \qquad (4.25)$$

has a minimum that corresponds to a nonzero v.e.v. of the field H: $\langle H\rangle_{vac} = \eta$. Loop corrections change the self-interactions of Higgs particles:


The effective potential with loop corrections was calculated by Coleman and Weinberg in 1973. In the one-loop approximation it looks like

$$V_1(H) - V_0(H) = \frac{1}{64\pi^2\eta^4}\left[m_H^4 + 6m_W^4 + 3m_Z^4 - 12m_t^4\right] H^4\ln\frac{H^2}{\eta^2} \qquad (4.26)$$

where we have neglected small contributions from fermions other than the t-quark. Note that due to Fermi-Dirac statistics the contribution of fermion loops has the opposite sign compared to the bosonic loops.

Corrections (4.26) become more important than the main classical potential (4.25) for very large field H.

In the one-loop approximation we get that the correction (4.26) has negative sign if the top-quark contribution outweighs the bosonic ones, i.e. if $m_H$ is sufficiently small compared with $m_t$. For this case the effective potential has no ground state (see Fig. 2). Thus even if our system was located first in the local minimum at $\langle H\rangle = \eta$ it will decay at $t \to \infty$ and the average value of the field H will run to infinity. We know that nothing like that has happened with our Universe, which is about $10^{10}$ years old. So the stability of the Universe imposes strong constraints on the masses of the top and Higgs particles.

To get more reliable results we have to improve a little the one-loop formula (eq. (4.26)). For large H the one-loop logarithmic corrections $\lambda\ln H$ and $\alpha_W\ln H$ are of the same order as the tree terms, and the two-loop double-logarithmic terms are of the order of the one-loop terms. So all these logarithmic terms should be taken into account. This technical problem is not very difficult - renormalization group techniques help to sum up such corrections. The result is

$$V(H) = -\tfrac{1}{2}\mu^2(t) H^2 + \tfrac{1}{4}\lambda(t) H^4 \qquad (4.27)$$


where $\lambda(t)$ and $\mu(t)$ are running parameters and $t = \ln H/\eta$. For small values of t (i.e. for small values of the field H) the running parameters do not run far away from their classical values and the effective potential is equal to the classical one with the accuracy of small radiative corrections. For large H we can forget about $\mu^2$ and the whole dynamics at large H is governed by the running coupling constant $\lambda(t)$. There are different contributions to the running of $\lambda(t)$ coming from the loops with top quarks, vector bosons and the Higgs boson itself. If the top quark contribution dominates, i.e. the Higgs coupling to top (i.e. the top mass) is large, $\lambda(t)$ changes sign and the vacuum becomes unstable. This is a reformulation of the phenomenon that we had at the one-loop level.

If the Higgs self-interaction dominates, i.e. the Higgs mass is large, then the evolution of $\lambda(t)$ is similar to the evolution of the coupling in the $H^4$ theory without other fields. It is known that in this case the behavior of $\lambda(t)$ is

$$\lambda(t) = \frac{\lambda(t_0)}{1 - 6\lambda(t_0)\ln\frac{H}{H_0}} \qquad (4.28)$$

and the running coupling goes to infinity at some finite value of H.



Figure 2:


This is the Landau pole in the running coupling constant. When the initial condition $\lambda(t_0)$ (i.e. the value of the Higgs mass) increases, the position of the Landau pole goes down. If we substitute this running coupling constant into eq. (4.27) we get that the effective potential runs to infinity at this value of H as well. Such singular behavior of the coupling constant is unacceptable from the physical point of view. Indeed for any finite value of the bare coupling constant (which is equal to the running coupling $\lambda(t)$ at the cut-off $\Lambda$) we get that the renormalized coupling constant (i.e. $\lambda(t)$ at low values of t) is equal to zero when the cut-off is removed. It means that at low energy we get a trivial free theory. This pathological theory seems to be unphysical.

There is a possibility to improve this bad behavior of $\lambda(t)$. If some new physics (i.e. new interactions and new particles) contributes to $\lambda(t)$ at a scale below or near $\Lambda$ the pole can disappear.

If we believe that there is no new physics up to some scale (or that the theory can be treated perturbatively up to this scale) we have to push the position of the Landau pole $\Lambda$ (calculated in the one-loop approximation, eq. (4.28)) to a higher scale. This imposes an upper bound on the value of the Higgs mass. So we have bounded $m_H$ both from above and from below. This remarkable line of reasoning was invented by Cabibbo et al. in 1979.

There are different choices for the parameter $\Lambda$. For example $\Lambda$ can be of the order of the Planck scale

$$\Lambda \sim \Lambda_{Pl} = 10^{19}\ \mathrm{GeV} ,$$

or of the order of the Grand Unification scale

$$\Lambda \sim \Lambda_{GUT} = 10^{15}\ \mathrm{GeV} ,$$

or of the scale of the energy of the accelerators of the next generation

$$\Lambda \sim 10^{3} - 10^{4}\ \mathrm{GeV} .$$

We have to keep in mind all these possibilities. It is evident that for the strongest assumption, that new physics does not appear up to the Planck scale, we should get the strongest upper bound for $m_H$.

To derive more quantitative results we have to solve the differential equation for the running coupling constant $\lambda(t)$. The renormalization of $\lambda(t)$ depends on the self-interaction coupling, on the gauge couplings and on the Yukawa coupling constants. So we have to solve the whole system of coupled differential equations. This can be done numerically with the help of a computer. The result of the calculation for $\Lambda = 10^{15}$ GeV is presented in Fig. 3.

Figure 3:

This is, so to say, the phase diagram in the plane of $m_t$ and $m_H$. The allowed region is located between two curves: the lower region corresponds to an unstable vacuum, and for the parameters in the upper region the Landau pole appears at a scale lower than $\Lambda = 10^{15}$ GeV. For the experimental value $m_t \simeq 175$ GeV the constraint on $m_H$ is very strong

$$170\ \mathrm{GeV} < m_H < 190\ \mathrm{GeV} \quad (\Lambda = 10^{15}\ \mathrm{GeV}),$$

For $\Lambda \sim 10^{5}$ GeV the upper bound is much weaker.
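The numerical solution of the coupled equations mentioned above can be sketched as follows. This is not the calculation behind Fig. 3: it uses the standard one-loop Standard Model beta functions in the normalization $V = \lambda (H^\dagger H)^2$ with $m_H^2 = 2\lambda\eta^2$ (which may differ from the normalization of eq. (4.25)), tree-level initial conditions at the scale $\eta$, the rounded couplings $g_1 = 0.34$, $g_2 = 0.66$ from the text, and an assumed strong coupling $\alpha_s = 0.118$. The function names, the fixed-step Runge-Kutta integrator and the threshold $\lambda > 4\pi$ used to flag the Landau pole are all illustrative choices.

```python
import math

LOOP = 16.0 * math.pi ** 2
ETA = 246.0  # GeV, Higgs v.e.v.

def beta(c):
    """One-loop beta functions d(coupling)/dt, t = ln(scale/eta)."""
    lam, yt, g1, g2, g3 = c
    d_lam = (24 * lam ** 2 + 12 * lam * yt ** 2 - 6 * yt ** 4
             - 3 * lam * (3 * g2 ** 2 + g1 ** 2)
             + 0.375 * (2 * g2 ** 4 + (g2 ** 2 + g1 ** 2) ** 2)) / LOOP
    d_yt = yt * (4.5 * yt ** 2 - 8 * g3 ** 2 - 2.25 * g2 ** 2
                 - (17.0 / 12.0) * g1 ** 2) / LOOP
    d_g1 = (41.0 / 6.0) * g1 ** 3 / LOOP
    d_g2 = -(19.0 / 6.0) * g2 ** 3 / LOOP
    d_g3 = -7.0 * g3 ** 3 / LOOP
    return (d_lam, d_yt, d_g1, d_g2, d_g3)

def rk4_step(c, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, f):
        return tuple(ci + f * ki for ci, ki in zip(c, k))
    k1 = beta(c)
    k2 = beta(shifted(k1, h / 2))
    k3 = beta(shifted(k2, h / 2))
    k4 = beta(shifted(k3, h))
    return tuple(ci + h * (a + 2 * b + 2 * d + e) / 6
                 for ci, a, b, d, e in zip(c, k1, k2, k3, k4))

def classify(m_higgs, m_top, cutoff, g1=0.34, g2=0.66, alpha_s=0.118,
             step=0.01, lam_max=4 * math.pi):
    """Run lambda from eta up to `cutoff` (GeV) and report what happens."""
    c = (m_higgs ** 2 / (2 * ETA ** 2),           # lambda at the scale eta
         math.sqrt(2.0) * m_top / ETA,            # top Yukawa coupling
         g1, g2, math.sqrt(4 * math.pi * alpha_s))
    t, t_max = 0.0, math.log(cutoff / ETA)
    while t < t_max:
        c = rk4_step(c, step)
        t += step
        if c[0] < 0.0:
            return "unstable vacuum"   # lambda changed sign below the cut-off
        if c[0] > lam_max:
            return "Landau pole"       # coupling blows up below the cut-off
    return "allowed"

if __name__ == "__main__":
    for m_h in (100.0, 300.0):
        print(m_h, classify(m_h, 175.0, 1e15))
```

A light Higgs drives $\lambda$ negative through the top Yukawa term, while a heavy Higgs hits the Landau pole well below the cut-off, which is the qualitative structure of the two boundary curves described above. The precise position of the boundaries depends on matching conditions and higher-order terms that this sketch does not include.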

5 Conclusions. 

Quantum Field Theory was developed as the fundamental Theory of Nature. Later on it was realized that renormalizable QFT describes only the low-energy fluctuations below the cut-off and does not pretend to describe physics beyond this cut-off. As we have learnt from the recent development of theory, it is not excluded that at short distances the fundamental theory is string theory. Thus it may happen that QFT is not the final theory of nature. Nevertheless QFT, as an effective theory that governs low-energy phenomena, will remain with us forever.

Acknowledgements 

I am grateful to my numerous friends and colleagues from whom I have got the main lessons 
in Quantum Field Theory. 








4. Broken Reflection Symmetries 


P.K. Kabir * 

Beams Physics Laboratory, University of Virginia, Charlottesville, VA USA 


Abstract 

Weak interactions do not possess the symmetries with respect to discrete transformations of space-inversion P, particle-antiparticle exchange C, and motion-reversal T, which characterize the strong interactions. Nonetheless, all interactions appear to be invariant with respect to the overall symmetry corresponding to the simultaneous action TCP of all these inversions. The discovery of the broken symmetries - which are yet to be understood - is reviewed, together with prospects for their future study.


1 Introduction 

More than forty years have (tlaixsttci since the sttirtling (iisc<)vi'ry[l,2], in exix'iiinent.s snggi-sted by 
Lee and Yang[3],that the phenomena of nuekuir Ixd-a-decay, and tin* <dos(*ly relattnl pror(‘.s.scH of |d 
and mu-meson decay, clearly distinguish between left and right. It was found that the mirror-image
of the distribution of decay electrons from a sample of polarized Cobalt-60 nuclei, or of polarized
mu-mesons, is distinctly different from the observed distribution. Previously, the assumption -
considered by many to be self-evident - that the laws of physics should not distinguish between
left and right in any way, had been shown by Wigner[4] to lead to the conservation of a
quantum-mechanical entity called parity - the eigenvalue of the space-inversion operator P - which
explained previously unexplained regularities of atomic, and later also nuclear, spectroscopy. The
unexpected discovery, that the invariance of physical laws under space-inversion is not a universal
property of all interactions, was followed by further surprises. On the basis of theoretical arguments
advanced by Lueders and Pauli[5,6], Lee, Oehme and Yang[7] could conclude that the large
space-asymmetries found in beta-decay and in pi-mu decays would require a simultaneous breakdown
of symmetry with respect to particle-antiparticle exchange C, which had been an article of faith
since the development of relativistic quantum theory by Dirac. This permits the aesthetically
appealing hypothesis[8] of CP-invariance, expressed elegantly by Landau through the statement
that physics seen through the mirror shows the physics of anti-matter. The failure of C-symmetry
in pi-mu decays was subsequently confirmed directly, when it was shown that pions of opposite
charges yield oppositely polarized muons. Lee, Oehme and Yang had already noted that equality
of masses and lifetimes for particles and antiparticles could not be invoked as evidence for
C-invariance because those are assured[7] by the more general requirement of TCP-invariance alone.
That analysis proved to be highly prescient because it was discovered that the attractive hypothesis
of CP-invariance, which accommodates P- and C-noninvariance while preserving complete symmetry
between matter and antimatter, also could not be exact. The TCP theorem, discussed by Lueders
and Pauli, then requires that T-invariance must also fail, a prediction which appears to have been
confirmed recently. The hypothesis of TCP-invariance should, of course, also be subjected to
experimental test. Further arguments to do so are based on recent speculations that the correct
description of fundamental particle interactions requires additional space-time dimensions beyond
the usual four of Minkowski space-time. If these additional dimensions are "rolled up" within a
small enough region at each point of the usual 4-dimensional space-time, their existence would not
be detected until one probed the corresponding scales, just as the finite size and internal structure
of atomic nuclei do not play any significant role in atomic and molecular phenomena. But these
additional dimensions could introduce apparent non-locality in the usual 4-dimensional description,
and thereby invalidate the premises of the Lueders-Pauli theorem. Refined tests of TCP-invariance
are possible by studying decays of neutral K-mesons, both singly and in correlated pairs as from
φ-decay. Thus far, there has been no indication of any significant deviation from TCP-symmetry.

*Email: pkk@virginia.edu


2 Space-Inversion 

The deviation from space-reflection symmetry in nuclear beta-decays was first demonstrated by Wu, 
Ambler, Hayward, Hoppes and Hudson [1] who found that the beta-particles ( negative electrons ) 
emitted by a polarized source of Cobalt-60 nuclei are distributed anisotropically with respect to the 
nuclear polarization, with a strong preference to be emitted in a direction opposite to the nuclear 
spins. A closely related effect, first predicted by Landau[8], which was also later confirmed, is that 
these electrons should have their spins preferentially oriented against their direction of motion. 
If beta-decay interactions were reflection-invariant, oppositely-spinning electrons should appear 
as frequently, and there should not be any net polarization of the emitted electrons. Therefore, 
the detection of longitudinal polarization of beta-decay electrons is a clear demonstration of the 
breakdown of reflection-symmetry in nuclear beta-decay. 

Even before experiments revealed that beta-decays show large departures from mirror-symmetry, 
seen also in the pi-mu decay sequence, it had been suggested [9, 8, 10] that these asymmetries might 
be linked to special properties of the neutrino, which is emitted in each of these processes. The 
electrically uncharged neutrino is similar to the electron in all other respects except that its mass is 
much smaller - so small in fact that the measured value is not yet distinguishable from zero. Its
existence was conjectured by Pauli in 1930, to account for some puzzling features of beta-radioactivity,
and incorporated by Fermi in his very successful phenomenological theory [11] of beta-decay. It 
is a property of massless spinning particles, and of such particles alone, that they can occur in 
states with a single spin orientation only: either directly along, or opposite to their direction of 
motion. By postulating that an emitted neutrino in beta-decay is always in one of these states[12]- 
which clearly has a handedness, depending on its spin-orientation - one would have a natural 
explanation of why beta-decay prefers one handedness over another. The observed departure from 
reflection-symmetry appeared to conform exactly to the hypothesis that the (anti)neutrinos emitted 
in beta-decay are purely right-handed. The corresponding neutrinos emitted in K-capture should 
then be purely left-handed. Although the neutrino itself had been directly detected only two years 
earlier, this prediction was strikingly confirmed in an experimental tour de force[13] within a year. 
Furthermore, the requirement that neutrinos are emitted only in such "chiral" states would assure
that neutrinos remained strictly massless, even allowing for virtual higher-order processes which, 
in Fermi’s theory, led formally to infinitely divergent mass-shifts. Application of the condition of 
masslessness to the process of muon decay, which involves the emission of a pair of neutrinos, also 
led to the correct prediction of the energy-distribution of the decay electrons. 

Despite these successes, attribution of reflection-noninvariance to the special properties of the 
neutrino was unlikely to provide a complete explanation. One would still have to explain the failure 
of P-conservation in processes without neutrinos, in particular the coexisting 2-pion and 3-pion 
decays of K-mesons, the original problem which led to the proposal of P-nonconservation.
Feinberg[14] had already noted that the successful predictions for muon decay were in fact "accidental"
because even the expectation of parity nonconservation in that process was not required unless 
one recognized the existence of two distinct neutrinos, a fact which was not invoked by any of the 
authors of Refs. [9, 8, 10]. We now know that the correct form of the Fermi Interaction is obtained by 
requiring each of the participating fermions to interact through the same chiral projection[15,16], 
and the neutrino does not have a privileged role. Indeed the current view is that neutrinos, which 
are believed to occur in at least three varieties, probably have small masses, which can be invoked 
to explain various outstanding problems [17]. 

The Fermi theory is now known to be the low-energy limit of a theory in which charged vector
mesons, with masses almost a hundred times greater than the proton mass, are exchanged
between charge-changing currents formed from chiral fermions. Why a particular chirality is chosen,
and why the participating fermion fields are certain specific superpositions of the observed
fermions ( with well-defined masses ) are important questions for which no satisfactory explanation
is available at present. The currently favoured description, that these are manifestations of
spontaneous symmetry-breaking, accommodates the observed parameters while renouncing any attempt
to explain them.


3 The Failed Hypothesis of CP Symmetry 

From the viewpoint of symmetry, the hypothesis of CP-invariance, advocated most forcefully by
Landau[8], offered an extremely attractive solution to the problem posed by the discovery that
beta-decay and other weak interactions distinguish between left and right. The difference between
the beta-decay distribution from a polarized Cobalt-60 source, and its mirror-image, makes no
distinction between left and right if there is exact CP-symmetry. Although beta-decay may not
appear to be mirror-symmetric, CP-invariance would restore mirror-symmetry in a larger sense. By
requiring that the mirror-image of any physical process should represent the course of the related
phenomenon where each particle is replaced by its corresponding anti-particle, the symmetry of
space would be completely restored and any observed difference would reflect only the reciprocal
difference in behaviour between particles and their corresponding antiparticles. It would not be
possible to give an absolute definition of left or right if one did not know whether one was dealing
with particles or antiparticles. Conversely, one would not be able to tell whether one was observing
a group of particles or their corresponding antiparticles, solely by observing their interactions. From
1957 to 1964, all deviations from reflection-symmetry observed in weak decay processes appeared
to conform to the broader requirement of CP-invariance. There was an additional theoretical
argument in favour of CP-symmetry. According to the theoretical expectation of TCP-invariance
[see next Section], CP-invariance would necessarily require T-invariance. Thus CP-symmetry
would not only guarantee the symmetry of space and the absolute equality of left and right,
and of matter and antimatter, it would also require the validity of exact T-invariance. Wigner
had shown[4] that many regularities - notably the degeneracy discovered by Kramers for energy
eigenstates of electrons in potential fields of arbitrary complexity - could be understood most simply
under this hypothesis. Furthermore, the discovery of parity non-conservation invalidated the usual
argument for the non-occurrence of static electric dipole moments for atoms and molecules, but
Landau showed[8] that T-invariance was sufficient to forbid their occurrence, whether parity was
conserved or not. Consequently, the failure thus far to detect electric dipole moments - in the case
of the neutron at a scale 10^[...] smaller than the known extension of its charge and magnetization
- could be considered as support for this hypothesis.

Therefore, it came as another great surprise when, in 1964, experiments revealed that CP could
not be sustained as an exact symmetry of Nature. To explain the discovery, we must first make
a brief digression about neutral K-mesons. Among the first "strange" particles to be discovered
was a neutral particle with a mass of about 500 MeV/c^2 and a lifetime of about [...] which
decayed into a pair of pi-mesons. To account for the slowness of its decay, Gell-Mann and Nishijima
independently proposed the notion of "strangeness", an additive property similar to electric charge,
appropriately assigned to the $K^0$ and other "strange" particles, which is conserved in the strong
interactions which produce K-mesons, but not in the weak interactions which lead to their decay.
In the Gell-Mann-Nishijima scheme, the antiparticle $\bar{K}^0$ must have strangeness opposite to that of
the $K^0$ and therefore be a physically distinct state; TCP-invariance would require that it be
degenerate with $K^0$. Since, by hypothesis, strangeness is not conserved in weak decay processes, the
decay of a $K^0$ would be accompanied by mixing with the $\bar{K}^0$ state: $K^0$ decay cannot be correctly
described without taking account of the degenerate $\bar{K}^0$ state with which it mixes.

Gell-Mann and Pais [18] showed that decays of neutral K-mesons are more conveniently described
in terms of the coherent superpositions ( particle symbols represent the corresponding
quantum states):

$$K_\pm = (K^0 \pm \bar{K}^0)/\sqrt{2} \qquad (1)$$



which are eigenstates of CP, with eigenvalues ±1, respectively, if CP is an exact symmetry.
Invariance of all interactions under CP-transformation would require CP to be conserved. Then the
CP-even eigenstate $K_+$ could decay only to CP-even eigenstates while $K_-$ would correspondingly
decay only to CP-odd eigenstates. Production of a $K^0$ in a nuclear collision should be viewed as
creation of the coherent superposition:

$$K^0 = (K_+ + K_-)/\sqrt{2}; \qquad (2)$$

$\bar{K}^0$ production would likewise correspond to creation of a similar superposition, in which the phase
of $K_-$ is reversed. Since a 2π s-state is necessarily a CP-even state, the 2π mode of
$K^0$'s originally observed must represent decays of the $K_+$ component; the $K_-$ component which,
according to Eq.(2), is produced with equal probability as $K_+$, can not decay via 2π modes and
would therefore presumably decay more slowly via 3-body modes. Confirmation of the occurrence
of such longer-lived neutral K-mesons, a spectacular prediction of quantum mechanics, with a mean
life of about $6 \times 10^{-8}$ sec decaying via 3-body decay channels, was one of the first triumphs of the
particle-mixture theory.
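
The algebra relating Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the phase convention CP|K^0> = |K^0-bar> (the text does not fix a convention) and represents states as two-component vectors in the (K^0, K^0-bar) basis. All function and variable names are illustrative.

```python
import math

# Basis (an assumption of this sketch): index 0 = K0, index 1 = K0-bar.
K0 = [1.0, 0.0]
K0BAR = [0.0, 1.0]
S = 1.0 / math.sqrt(2.0)


def cp(state):
    """Apply CP, assuming the convention CP|K0> = |K0bar> (swap components)."""
    return [state[1], state[0]]


def combine(a, x, b, y):
    """Return the linear combination a*x + b*y of two states."""
    return [a * x[i] + b * y[i] for i in range(2)]


def approx_equal(x, y, tol=1e-12):
    """Component-wise comparison of two states within a tolerance."""
    return all(abs(x[i] - y[i]) < tol for i in range(2))


# Eq. (1): K_plus and K_minus as superpositions of K0 and K0bar.
K_PLUS = combine(S, K0, S, K0BAR)
K_MINUS = combine(S, K0, -S, K0BAR)

if __name__ == "__main__":
    # K_plus is CP-even, K_minus is CP-odd.
    print(approx_equal(cp(K_PLUS), K_PLUS))
    print(approx_equal(cp(K_MINUS), [-c for c in K_MINUS]))
    # Eq. (2): K0 is the equal superposition of K_plus and K_minus.
    print(approx_equal(combine(S, K_PLUS, S, K_MINUS), K0))
```

Reversing the sign of the K_minus term in the last combination gives the K^0-bar state, which is the sign reversal referred to in the text.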

To the surprise of almost everyone, Christenson, Cronin, Fitch and Turlay discovered[19] that
about one in every 500 long-lived K-mesons also decays into a pair of π-mesons, contradicting
the theoretical expectation from the hypothesis of CP-invariance. Many interpretations, which
sought to explain this observation without giving up CP-symmetry, had to be abandoned after
it was found[20] that the 2π state arising from decay of the long-lived state $K_L$ is
quantum-mechanically identical with the one resulting from decay of the short-lived state $K_S$, because it
interferes constructively with it. This shows that the long-lived state $K_L$ contains a CP-even
admixture, as conjectured by the original authors, or that the decay permits a CP-odd initial state
to make a transition to a CP-even final state. In either case, CP-conservation must be abandoned.
The most direct and striking demonstration of the breakdown of CP-symmetry comes from a
comparison of the time-distribution of 2π decays from a sample of initial $K^0$'s with that from a
sample of initial $\bar{K}^0$'s. Eq.(2) shows that an initial $K^0$ state is an equal superposition of $K_+$ and
$K_-$ (which, in lowest approximation, may be identified with the short- and long-lived kaon states $K_S$
and $K_L$, respectively). Since the $K_S$ decays with a lifetime $\tau_S$ much shorter than that of $K_L$,
the amplitude of $K_+$ relative to $K_-$ will rapidly reduce to a level such that the relatively feeble
(smaller by a factor $|\eta|$) $K_L \to 2\pi$ decay amplitude becomes comparable to the $K_S$ contribution. This
should happen at $\tau \approx 2\tau_S\,|\log|\eta|| \approx 12\,\tau_S$ for $|\eta| = 2 \times 10^{-3}$. For such times, the replacement of
$K^0$ by $\bar{K}^0$, which reverses the relative sign between the two terms in Eq.(2), should result in a
dramatic change of the decay rate as constructive interference replaces destructive interference
and vice versa. Such a clearcut difference between the time-distributions of $K^0 \to 2\pi$ and
$\bar{K}^0 \to 2\pi$ decays is indeed observed.
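
The estimate of the crossover time, and the size of the resulting difference between the two samples, can be illustrated with a short numerical sketch. This is not taken from the original text. It assumes the standard two-exponential form of the 2π decay rate, with an interference term whose sign is + for an initial K^0 and - for an initial K^0-bar, and it uses illustrative round values for the lifetime ratio, the mass difference and the phase of η. Only |η| = 2 x 10^-3 and the factor of roughly 12 τ_S come from the text. Times are measured in units of τ_S.

```python
import math

# Illustrative parameters (assumptions of this sketch, in units of tau_S).
TAU_S = 1.0                      # short-lived kaon lifetime (unit of time)
TAU_L = 580.0                    # assumed long-lived lifetime in units of tau_S
DELTA_M = 0.47                   # assumed K_L - K_S mass difference times tau_S
ETA_ABS = 2.0e-3                 # |eta|, the value quoted in the text
ETA_PHASE = math.radians(43.5)   # assumed phase of eta


def crossover_time(eta_abs=ETA_ABS, tau_s=TAU_S):
    """Time at which exp(-t / (2 tau_S)) has dropped to |eta| (natural log)."""
    return 2.0 * tau_s * abs(math.log(eta_abs))


def two_pi_rate(t, sign):
    """Unnormalised 2-pi decay rate; sign=+1 for initial K0, -1 for K0-bar."""
    gamma_s, gamma_l = 1.0 / TAU_S, 1.0 / TAU_L
    short = math.exp(-gamma_s * t)
    long_ = ETA_ABS ** 2 * math.exp(-gamma_l * t)
    interference = (2.0 * ETA_ABS * math.exp(-0.5 * (gamma_s + gamma_l) * t)
                    * math.cos(DELTA_M * t - ETA_PHASE))
    return short + long_ + sign * interference


def asymmetry(t):
    """(R_K0 - R_K0bar) / (R_K0 + R_K0bar) for 2-pi decays at time t."""
    r0, rbar = two_pi_rate(t, +1), two_pi_rate(t, -1)
    return (r0 - rbar) / (r0 + rbar)


if __name__ == "__main__":
    print("crossover time / tau_S:", round(crossover_time(), 1))  # about 12.4
    for t in (0.0, 4.0, 8.0, 12.0, 16.0):
        print(t, round(asymmetry(t), 3))
```

With these assumed inputs the asymmetry is below one percent at t = 0 and grows to a large fraction of unity around the crossover time, which is the qualitative behaviour described above.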

Because of the degeneracy between $K^0$ and $\bar{K}^0$ states, an extremely weak CP-noninvariant
interaction suffices to provide the small CP-even admixture $\epsilon$ in the long-lived kaon state:

$$K_L \propto K_- + \epsilon K_+ \qquad (3)$$

which is required to account for the observed departures from CP-symmetry. Until very recently,
it appeared that the parameter $\epsilon$ ( whose phase, under reasonable assumptions, is fixed by other
measured parameters) was the only known measure of CP-noninvariance, which had not been seen
anywhere outside the neutral K-meson system. The KTeV group at Fermilab has now confirmed
an earlier report[21] from CERN that the ratio $\Gamma(\pi^+\pi^-)/\Gamma(\pi^0\pi^0)$ in $K_L$ decays is 1.7% higher than
the corresponding ratio for $K_S$ decays. If the admixture $\epsilon$ of the CP-even component $K_+$ in
the long-lived neutral kaon state $K_L$ were the only cause of the observed CP-noninvariance, the
ratio for $K_L$ decays should be the same as that for $K_S$. Thus, there must be a
"direct" contribution to 2π decays of neutral K-mesons, which also rules out explanations which
attribute the observed CP-asymmetry to external effects, such as the influence of hypothetical
long-range fields which affect $K^0$ and $\bar{K}^0$ differently, or interactions with ambient CP-asymmetric
media, such as a neutrino "sea". The detection of "direct" CP-noninvariant interactions greatly
encourages the search for CP-asymmetric effects outside the neutral kaon system. Such searches
have been conducted extensively since the original discovery but, in the absence of theoretical
guidance, these could not be more than gropings in the dark, and did not lead to any further
evidence for CP-noninvariance. As long as $\epsilon$ sufficed to describe the observed CP-noninvariance
in neutral kaon decays, no further CP-noninvariant effects could be predicted with any assurance.

Nonetheless, the discovery of new kinds of strangeness, now called charm and beauty, led to
the prediction and discovery of mesons bearing these attributes. Among these were more massive
homologues of the $K^0$ called $D^0$ and $B^0$ respectively, whose transformations and decays should be
similar to those of neutral kaons, and be described by a similar formalism. In particular, mixing of
$B^0$ with $\bar{B}^0$ appears to be well-established[22] and the currently-favoured parametrization of
weak-interaction currents suggests that relatively large CP-asymmetries could occur in B-decays. In the
Standard Model, which satisfactorily describes most high-energy phenomena, CP-noninvariance is
accommodated through mixing of the charged chiral currents which generate weak interactions. For
currents formed from three pairs of quark fields, Kobayashi and Maskawa found[23] that the most
general unitary mixing involves a complex phase angle, whose presence leads to CP-noninvariant
effects. Jarlskog[24] and others showed that, if this is the only source of CP-noninvariance, all
such observables are proportional to a single phase-invariant parameter J characteristic of the
mixing-matrix which describes the weak currents. In principle, measurement of the CP-violating
parameter $\epsilon$ in neutral K-meson decay should then fix J; all other CP-asymmetric effects in B and
other decays would then be predicted. Unfortunately, present inability to reliably calculate effects
of strong interactions - arising from Quantum Chromodynamics in the Standard Model - prevents
the full implementation of this programme. Within the limits of such uncertainties, there are
suggestions [22] of CP-nonconserving effects significantly larger than in kaon decays. Dedicated
B-factories - which are colliders designed to produce copious $B$-$\bar{B}$ pairs - have been constructed
to search for these processes, and results may be expected soon. A φ-factory called DAΦNE is
already operating at Frascati; $\phi \to K_L K_S$ decays provide a unique means to study pure $K_S$
samples, whose CP-violating decays have not yet been measured.
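
The statement that a three-generation unitary mixing matrix carries one complex phase, and that CP-noninvariant observables scale with a single invariant J, can be illustrated numerically. The sketch below is not from the original text. It assumes the commonly used three-angle, one-phase parametrization of the mixing matrix and takes J as the imaginary part of the quartet V_us V_cb V_ub* V_cs*. The numerical angles are arbitrary illustrative values, not measured parameters.

```python
import cmath
import math


def ckm_matrix(theta12, theta23, theta13, delta):
    """3x3 unitary mixing matrix in an assumed three-angle, one-phase form."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    e = cmath.exp(1j * delta)  # the single CP-violating phase factor
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]


def is_unitary(v, tol=1e-12):
    """Check that the rows of v are orthonormal."""
    for i in range(3):
        for j in range(3):
            dot = sum(complex(v[i][k]) * complex(v[j][k]).conjugate() for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


def jarlskog(v):
    """Phase-invariant J = Im(V_us V_cb V_ub* V_cs*)."""
    return (v[0][1] * v[1][2] * v[0][2].conjugate() * v[1][1].conjugate()).imag


def jarlskog_closed_form(theta12, theta23, theta13, delta):
    """Closed form of J in the same parametrization."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    return c12 * c23 * c13 ** 2 * s12 * s23 * s13 * math.sin(delta)


if __name__ == "__main__":
    angles = (0.22, 0.04, 0.004, 1.0)  # illustrative values only
    v = ckm_matrix(*angles)
    print(is_unitary(v), jarlskog(v), jarlskog_closed_form(*angles))
```

Setting the phase to zero makes J vanish in this sketch, in line with the statement that the complex phase is what leads to CP-noninvariant effects.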


4 TCP-Invariance 

Even before the discovery that weak interactions do not respect the reflection-symmetries which
characterize strong and electromagnetic interactions, several authors[25,5,6] had noted the
existence of a more general kind of reflection-symmetry which appears to have a stronger theoretical
foundation. Named "strong reflection" by Schwinger, it corresponds to simultaneous inversion
of space- and time-coordinates. But for the fact that the time-coordinate is singled out by its
opposite sign in the relativistic metric, such a transformation could be accomplished by a pure
rotation in the even-dimensional Minkowski space, and thus be an element of the class of proper
Lorentz transformations. By extending the concept of Lorentz-invariance to include invariance
with respect to Lorentz transformations for complex values of the relative velocity ( which
determines the rotation angle ), the requirement of Lorentz-invariance would include invariance under
strong reflection. Under strong reflection, energy (which transforms like the time-coordinate)
must change sign but, in relativity theory, negative energies correspond to negative inertia and
therefore make no sense. Following Dirac and Feynman, the problem is avoided by reinterpreting
these states of negative energy, found by applying strong reflection to particle-states of positive
energy, as corresponding positive-energy states of anti-particles. Thus, strong reflection can be given
an unambiguous meaning as the operation of CPT or TCP ( the order of the factors is irrelevant
since all the inversions are taken to act together ), viz. inversion of space- and time-coordinates
simultaneously with particle-antiparticle conjugation. Local Lorentz-invariant quantum field
theories can readily be constructed which do not have symmetry with respect to one or more of the
discrete transformations P, C, or T but all of them would necessarily be invariant with respect to
the combined operation of strong reflection Θ = TCP. The existence of anti-particles - predicted
by Dirac - is a consequence not of C-invariance ( which may or may not hold in particular cases )
of relativistic quantum field theory, as originally supposed, but of the more general requirement of
TCP-invariance.
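
The remark that full space-time inversion becomes a "rotation" once complex values of the boost parameter are admitted can be made concrete. The following sketch is not from the original text. It assumes the usual hyperbolic form of a boost in the (t, x) plane, evaluates it at the imaginary rapidity iπ, and combines it with an ordinary rotation by π in the (y, z) plane. The function names and the block decomposition are illustrative choices.

```python
import cmath
import math


def boost_tx(rapidity):
    """2x2 boost acting on (t, x); the rapidity may be complex."""
    ch, sh = cmath.cosh(rapidity), cmath.sinh(rapidity)
    return [[ch, sh], [sh, ch]]


def rotation_yz(angle):
    """2x2 ordinary rotation acting on (y, z)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def strong_reflection():
    """4x4 block matrix: boost by i*pi in (t, x) and rotation by pi in (y, z)."""
    b = boost_tx(1j * math.pi)
    r = rotation_yz(math.pi)
    m = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            m[i][j] = b[i][j]
            m[2 + i][2 + j] = r[i][j]
    return m


if __name__ == "__main__":
    # Each diagonal entry is -1 up to rounding: (t, x, y, z) -> (-t, -x, -y, -z).
    for row in strong_reflection():
        print([round(entry.real, 12) for entry in row])
```

For real rapidity the same boost matrix has unit determinant and can never reverse the sign of t, which is why the extension to complex parameters is needed to reach the inversion.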

In addition to the requirement that particles and antiparticles have equal masses and lifetimes,
TCP-invariance requires them to have identical electromagnetic properties, apart from obvious
changes of sign. Thus far, all measurements are consistent with these conditions. By far the most
stringent test comes from a comparison of $K^0$ and $\bar{K}^0$ masses, by applying a relation given by
Bell[26]. Any difference between the masses or decay-widths of $K^0$ and $\bar{K}^0$ would be reflected in
the complex mass-difference between $K_L$ and $K_S$ and the composition of the $K_L$ and $K_S$ states
through:

$$\Lambda - \bar{\Lambda} = (\Lambda_L - \Lambda_S)(\epsilon_S - \epsilon_L), \qquad (4)$$

if one neglects terms higher than quadratic in the $\epsilon$'s. If one does not assume TCP-invariance, $\epsilon_L$
replaces the $\epsilon$ in Eq.(3), while $\epsilon_S$ is defined by a corresponding equation for $K_S$. Using the best
available data, the Particle Data Group reports[27] a limit for the LHS which cannot exceed 10^[...]
times the kaon mass! This is by far the best test of TCP-invariance available at present.


5 Question of T-Invariance 

According to the hypothesis of TCP-invariance, any deviation from CP-symmetry must be accom- 
panied by a compensating departure from T-symmetry. Therefore, it is important both as a check 
of the TCP-theorem and on general principles, to search for departures from T-invariance which, 
in some way or other, must correspond to deviations from reciprocity. 

As long as all known departures from CP-symmetry are restricted to the neutral K-meson 
system, where they can be described[28] by the parameter $\epsilon$, it follows that the only T-noninvariant
effects which can be predicted with assurance are those dependent on the same parameter. In the
TCP-invariant description, $K^0 \leftrightarrow \bar{K}^0$ transitions do not satisfy reciprocity [29], and a corresponding
asymmetry has been found [30] between leptonic decays of $K^0$ and $\bar{K}^0$ with the expected sign and
magnitude. The observed effect does not constitute a direct test of reciprocity, thus one cannot 
assert unambiguously[31] that T-noninvariance has been directly demonstrated, but the cost of 
defending T-invariance would necessarily demand the sacrifice of TCP-symmetry. Fortunately, 
this question can be settled by direct experimental test, and an answer should be available soon. 


6 Conclusions 

Departures from symmetry under space-reflection P, and under particle-antiparticle exchange C, 
characterize the weak interactions. The hypothesis of CP-invariance, or symmetry under combined 
inversion, would restore the symmetry of space, and the equivalence of particles and antiparticles, 
just as Pasteur’s discovery of stereoisomerism restored the symmetry between right and left in an 
earlier era. The conclusive discovery of CP-noninvariance in neutral K-meson decays demolishes 
that possibility. The fact that the deviation from CP-symmetry is small, and observed thus far only 
in neutral kaon decays, may be related to the possibility that CP-noninvariant interactions require 
the participation of much more massive particles. Such particles could contribute only virtually, 
and relatively weakly by the rules of quantum theory, in the decay of lighter particles. Their effects 
are detectable in decays of neutral kaons because of the special circumstance of $K^0$-$\bar{K}^0$ degeneracy,
required by TCP-invariance. The corresponding deviation from T-invariance, demanded by
CP-noninvariance, appears to be confirmed. Larger CP-noninvariant effects may occur in decays of
the more massive B mesons. Detailed studies of neutral kaon mixing, both for individual particles 
and for kaon pairs, could provide improved tests of TCP. 

From a fundamental point of view, the broken mirror-symmetries, with respect to P, C and
T, remain completely mysterious. Thus far, there is no evidence of any deviation from symmetry
under the combined transformation TCP of "strong reflection", whose basis in relativistic quantum
field theory furnishes the only available explanation for the occurrence of antiparticles.



References 

[1] Wu, Ambler, Hayward, Hoppes and Hudson, Phys. Rev. 105, 1413 (1957).

[2] Garwin, Lederman and Weinreich, Phys. Rev. 105, 1415 (1957).

[3] T.D. Lee and C.N. Yang, Phys. Rev. 104, 256 (1956).

[4] E.P. Wigner, Group Theory and Applications to Quantum Mechanics, Academic Press, New
York, 1959.

[5] G. Lueders, Kgl. Dan. Vid. Selskab, Mat-Fys. Med. 28, No. 5 (1954).

[6] W. Pauli, in "Niels Bohr and the Development of Physics", Pergamon, London, 1955.

[7] Lee, Oehme and Yang, Phys. Rev. 106, 340 (1957).

[8] L.D. Landau, Nucl. Phys. 3, 127 (1957).

[9] A. Salam, Nuovo Cimento 5, 299 (1957).

[10] T.D. Lee and C.N. Yang, Phys. Rev. 105, 1671 (1957).

[11] E. Fermi, Zeits. f. Phys. 88, 161 (1934).

[12] Such a theory was proposed by Weyl in the early days of quantum theory, but was rejected
by Pauli precisely because it was not consistent with the requirement of reflection-invariance.

[13] Goldhaber, Grodzins and Sunyar, Phys. Rev. 109, 1015 (1958).

[14] G. Feinberg, private communication, 1957.

[15] E.C.G. Sudarshan and R.E. Marshak, Proc. Padua-Venice Conf. 1957.

[16] R.P. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958).

[17] Among these are the solar neutrino deficit, viz. the fact that the rate of neutrino reactions
observed on Earth, induced by neutrinos produced by nuclear reactions in the solar interior,
is about one-third of the expected rate, and the anomalous ratio of high-energy muons to
electrons produced deep underground, presumably by neutrinos from decay of mesons created
by cosmic-ray collisions with the atmosphere.

[18] M. Gell-Mann and A. Pais, Phys. Rev. 97, 1387 (1955).

[19] Christenson, Cronin, Fitch and Turlay, Phys. Rev. Lett. 13, 138 (1964).

[20] Fitch, Ross, Russ, and Vernon, Phys. Rev. Lett. 15, 73 (1965).

[21] NA31 Group, G.D. Barr et al., Phys. Lett. B317, 233 (1993).

[22] See Y. Nir and H. Quinn, Ann. Rev. Nucl. and Part. Sci. 42, 211 (1992).

[23] M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973).

[24] C. Jarlskog, Phys. Rev. Lett. 55, 1039 (1985).

[25] J. Schwinger, Phys. Rev. 82, 914 (1951).

[26] J.S. Bell, Proc. Oxford Intl. Conf. on El. Particles 1965, R.G. Moorhouse et al. eds., Rutherford
High Energy Laboratory, 1966.

[27] Particle Data Group, Eur. Phys. J. C3, 1 (1998).



[28] The additional parameter required to describe the effect reported in Ref. [21] is too small to
substantially affect the following conclusion.

[29] P.K. Kabir, Phys. Rev. D2, 540 (1970).

[30] CPLEAR Group, A. Angelopoulos et al., Phys. Lett. B444, 43 (1998).

[31] P.K. Kabir, Phys. Lett. B459, 335 (1999).



5. Dynamics Of Symmetry Breaking Out Of 
Equilibrium: From Condensed Matter To QCD 
And The Early Universe* 

D. Boyanovsky(a,b) and H.J. de Vega(b,a)

(a) Department of Physics and Astronomy,
University of Pittsburgh, Pittsburgh, PA 15260 USA
(b) LPTHE†
Université Pierre et Marie Curie (Paris VI) et Denis Diderot (Paris VII),
Tour 16, 1er. étage, 4, Place Jussieu 75252 Paris, Cedex 05, France


Abstract 

The dynamics of symmetry breaking during out of equilibrium phase transitions is a topic
of great importance in many disciplines, from condensed matter to particle physics and early
Universe cosmology with definite experimental and observational impact. In these notes we
provide a summary of the relevant aspects of the dynamics of symmetry breaking in many
different fields with emphasis on the experimental realizations. In condensed matter we address
the dynamics of phase ordering, the emergence of condensates, coarsening and dynamical
scaling. In QCD the possibility of disoriented chiral condensates of pions emerging during a
strongly out of equilibrium phase transition is discussed. Finally we elaborate on the dynamics
of phase ordering in phase transitions in the Early Universe, in particular the emergence
of condensates and scaling in FRW cosmologies. We mention some experimental efforts in
different fields that study these wide ranging phenomena and offer a quantitative theoretical
description both at the phenomenological level in condensed matter, introducing the scaling
hypothesis, as well as at a microscopic level in quantum field theories. The emergence of
semiclassical condensates and a dynamical length scale is shown in detail; in quantum field
theory this length scale is constrained by causality.

The large N limit provides a natural bridge to compare the solutions in the different
settings and to establish similarities and differences.

11.10.-z; 11.15.Pg; 11.30.Qc


1 Phase Ordering Dynamics: an interdisciplinary fascinating problem

The dynamics of non-equilibrium phase transitions and the ordering process that occurs until the
system reaches a broken symmetry equilibrium state play an important role in many different
areas. In condensed matter physics binary fluids, ferromagnets, superfluids, and liquid crystals, to
name a few, are examples of systems in which the dynamics of phase transitions out of equilibrium
are studied experimentally.

Experiments in these systems have provided a solid basis for the description of the dynamics
of phase ordering: in binary fluids or alloys upon a sudden temperature drop below the critical
temperature, the two fluids begin to separate, and regions of different fluid concentrations are separated
by domain walls. In superfluids, rapid cooling leads to a network of vortices and in liquid crystals
to many different topological defects.


* Email: BOYAN@vsml.cis.pitt.edu 
†Laboratoire Associé au CNRS UMR 7589.



Current and future measurements of Cosmic Microwave Background anisotropies as well as the
formation of large scale structures in the universe provide distinct evidence for phase transitions
during inflation and after [1]. At even lower energies, available with current and forthcoming
accelerators such as the Relativistic Heavy Ion Collider (RHIC) at Brookhaven and the Large Hadron
Collider (LHC) at CERN, the phase transitions predicted by the theory of strong interactions, Quantum
Chromodynamics (QCD), could occur out of equilibrium via the formation of coherent condensates
of low energy pions.

These phase transitions and the associated processes which often take place out of equilibrium 
will be an experimental telltale of the chiral phase transition of QCD [5]. 

Whereas the GUT phase transition took place when the Universe was about 10^[...] seconds
old and the temperature about [...], and the EW phase transition occurred when the Universe
was [...] seconds old and with a temperature [...], the QCD phase transition took place at
about 10^[...] seconds after the Big Bang, when the temperature was a mere 10^[...] K. This
temperature range will be probed at RHIC and LHC within the next very few years. The basic problem
of describing the process of phase ordering, the competition between different broken symmetry
states and the formation and evolution of condensates on the way towards reaching equilibrium
is common to all of these situations and fields. The tools, however, are necessarily very
different: whereas ferromagnets, binary fluids or alloys etc. can be described via a phenomenological
(stochastic) classical description, certainly in quantum field theory a microscopic formulation must
be provided. In these lectures we describe a program to bring ideas from condensed matter into
the realm of quantum field theory, to describe the non-equilibrium dynamics of symmetry breaking
and the process of phase separation and phase ordering on a range of time and spatial scales of
unprecedented resolution (for example in QCD the time scales < [...] seconds, spatial scales
< [...] meters, in cosmology the time scales are of order [...] seconds and spatial scales smaller
than [...] meters) that require a full quantum field theoretical description.

We begin the excursion into these timely fields by first providing a brief qualitative description
of the relevant setting and whenever possible the experimental situation associated with them in 
three main areas: Condensed Matter, Ultrarelativistic Heavy Ion Collisions and Early Universe 
Cosmology. This qualitative discussion will be followed by a more quantitative description of some 
of the main theoretical ideas, techniques and tools. 

1.1 Condensed Matter:

A description of phase transitions and critical phenomena in equilibrium begins by recognizing
an order parameter, which is a thermodynamic ensemble average of a macroscopic variable that
determines the different macroscopic states of the system. For example, in ferromagnets the order
parameter is the average magnetization, which vanishes above a critical temperature and is
non-zero below it; in superfluids it is the condensate density; in superconductors the density of
Cooper pairs, etc. [6]. Phase transitions in equilibrium are fairly well understood: the theory of
critical phenomena [6], combined with the renormalization group, provides a very successful
description of them and a very robust description of universality classes. Many systems that are
very different behave similarly near critical points; these universality classes are distinguished
by, for example, the dimensionality of the order parameter, the dimensionality of space, and the
symmetries of the underlying microscopic Hamiltonian. An important concept in critical phenomena
is the correlation length. Take for example a spin one-half ferromagnet: the microscopic
Hamiltonian has an up-down symmetry, since the energy remains the same if all spins are flipped.
Focus on a particular point of the sample where the spin is up. The correlation length is the
distance over which the spins are correlated, i.e. the distance from this up-spin over which the
neighboring spins are also up. As the critical temperature is approached from above this
correlation length grows, reaching a macroscopic size (diverging) at the critical temperature. As
the system is cooled below the critical temperature a phase transition occurs: there appears a net
overall magnetization and at low temperatures all spins are either up or down, so the up-down
symmetry is spontaneously broken [6]. This phase transition occurs in equilibrium when the
microscopic relaxation time scales
are shorter than the time scale of cooling the system, thus at all times the system is in local
thermodynamic equilibrium. At very high temperatures typically the disordered phase prevails, all
spins are oriented at random and the average magnetization vanishes. As the critical temperature
is reached regions of correlated spins appear and become of macroscopic size as the correlation
length diverges and the spin system begins to order. In this region the thermodynamic quantities
become insensitive to the short distance details such as crystalline lattices, lattice spacing and
the nature and strength of the interaction between spins, as the physics is determined by the
correlation of spins over large distances. Near the critical point, the short distance length scales
are irrelevant for macroscopic phenomena and long-wavelength physics is completely determined
by the correlation length $\xi(T)$. Macroscopic thermodynamic quantities and susceptibilities near
the critical temperature only depend on the length scale $\xi(T)$.

This is the basis of the static scaling hypothesis, which is confirmed experimentally in a wide
variety of systems and is theoretically supported by the renormalization group approach to critical
phenomena [6]. The static critical phenomena associated with second order phase transitions that
occur in local thermodynamic equilibrium are fairly well understood via the renormalization group
(and other alternative approaches) [6].

Consider the alternative scenario in which a ferromagnet is held at very high temperature in the
disordered phase and suddenly it is cooled below the critical temperature on time scales shorter
than those associated with relaxational phenomena. Now the spin system must evolve towards
the ordered phase far away from equilibrium. Unlike the case of static (local thermodynamic
equilibrium) critical phenomena, the case of out of equilibrium phase transitions requires a novel
set of ideas and tools to describe the dynamics of the process of phase ordering.

There is now a large body of theoretical and experimental work in phase ordering dynamics
in condensed matter systems [7]-[10]. Although ultimately the tools to study similar questions in
quantum mechanical many body systems will be different, the main physical features to describe are
basically the same: as the system cools down suddenly below the critical temperature, correlated
regions (of spins in a ferromagnet or of condensate in a Bose superfluid) begin to form. These
correlated regions are separated by ‘walls’ or other structures. Inside these regions an ordered
phase exists which eventually grows in time to become macroscopic in size. Before attempting to
describe the manner in which a given system orders after being cooled through a phase transition,
an understanding of the relevant time scales is required. Two important time scales determine if
the transition occurs in or out of equilibrium: the relaxation time of long wavelength fluctuations
(since these are the ones that order) $\tau_{rel}(k)$ and the inverse of the cooling rate
$t_{cool} = T(t)/\dot{T}(t)$. If $\tau_{rel}(k) \ll t_{cool}$ then these wavelengths are in local
thermodynamical equilibrium (LTE), but if $\tau_{rel}(k) \gg t_{cool}$ these wavelengths fall out of
LTE and freeze out; for these the phase transition occurs in a quenched manner. These modes do not
have time to adjust locally to the temperature change and for them the transition from a high
temperature phase to a low temperature one occurs instantaneously. This description was presented
by Zurek [11] analysing the emergence of defect networks after a quenched phase transition. Whereas
the short wavelength modes are rapidly thermalized (typically by collisions), the long-wavelength
modes with $k \ll 1/\xi(T)$, with $\xi(T)$ the correlation length (in the disordered phase), become
critically slowed down, i.e. their relaxation time becomes extremely long near the critical point.
As $T \to T_c^+$ the long wavelength modes relax very slowly, they fall out of LTE and any finite
cooling rate causes them to undergo a ‘quenched’ non-equilibrium phase transition. As the system is
quenched from $T > T_c$ (disordered phase) to $T \ll T_c$ (ordered phase) ordering does not occur
instantaneously. The length scale of the ordered regions grows in time (after some initial
transients) as the different broken symmetry phases compete to select the final equilibrium state.
A dynamical length scale $\xi(t)$ typically emerges which is interpreted as the size of the
correlated regions; this dynamical correlation length grows in time to become macroscopically
large [7, 8, 9, 10]. Just as in static critical phenomena, the emergence of this dynamical
correlation length leads to the dynamical scaling hypothesis, that the approach to equilibrium and
the kinetics of phase ordering is solely determined by this length scale.

Experiments in binary fluids, for example, study the growth of these correlated regions by light
scattering [12], much in the same manner as the onset of ferromagnetism is studied via neutron
scattering. The growth of the domains, characterized by the dynamical length scale, implies
that as a function of time the scattering of light becomes stronger for longer wavelengths, i.e.
smaller wave-vectors, until eventually at very long times a Bragg peak at zero momentum emerges,
signaling the macroscopic ordering of the system. This growth of domain structures during the
dynamical process of phase ordering is referred to as ‘coarsening’ [7, 8, 9, 10]. This mechanism, with
a clear experimental realization in condensed matter, is at the heart of the Kibble-Zurek
scenario of the dynamics of symmetry breaking in cosmological phase transitions [2, 11, 13]. In
this scenario a ‘network’ of defects emerges right after a phase transition that occurred strongly
out of equilibrium, with a density of about one defect per initial correlation length. This network
eventually evolves and in computer simulations a scaling regime is observed to emerge [2].

The dynamics of phase ordering has been studied in liquid crystals, whose symmetry group is
rather similar to that of particle physics models. The experiments produced non-equilibrium phase
transitions both by suddenly varying the pressure and the temperature (pressure and temperature
quenches) [14, 15] and confirmed, at least in a qualitative manner, the main features described by
this scenario of dynamics of symmetry breaking. More recently a new set of experiments has sought
to provide a more detailed picture of the dynamics of symmetry breaking phase transitions out of
equilibrium and to simulate in the laboratory what is thought to be the situation in cosmological
phase transitions. The original experiments focused on studying the dynamics of phase ordering
after a pressure quench in superfluid $^4$He [16] by measuring second sound (i.e. entropy
disturbances), which only propagates in the superfluid component (the broken symmetry phase). The
interpretation of the results of these experiments was obscured by turbulence induced during the
quench and by spurious phase separation due to imperfections of the walls. A later set of
experiments seems to lead to cleaner interpretations.

In these ingenious experiments [17] a small sample of superfluid $^3$He, whose order parameter
has a group structure very similar to some particle physics models, was heated locally by neutron
irradiation via the nuclear reaction

$$ n + {}^3\mathrm{He} \rightarrow {}^3\mathrm{H} + p + 764 \ \mathrm{keV} $$

The energy released heats a small portion of the liquid Helium into the normal state and rapid
diffusion of the quasiparticles cools this region back into the superfluid phase very rapidly, thus
providing a quench from a normal (disordered) phase into the superfluid (ordered) phase. The
resulting domain structure is then studied via NMR and a qualitative agreement with the picture
of the symmetry breaking dynamics seems to emerge from these experiments. Thus these beautiful
experiments in condensed matter provide a controlled experimental framework to test the concepts
associated with the dynamics of symmetry breaking.

These ideas of the emergence of correlated regions that grow in time and become macroscopic
during non-equilibrium phase transitions have recently been invoked as a potential signature of the
chiral phase transition in QCD, the theory of the strong interactions.

1.2 Chiral symmetry breaking in QCD and disoriented chiral condensates

Quantum Chromodynamics (QCD) is the theory of strong interactions, with the fundamental
degrees of freedom being the quarks and gluons. Quarks, however, are confined inside hadrons
and do not exist as individual, isolated particles in vacuum. Nevertheless there is now a wealth of
theoretical evidence, including very convincing lattice results, indicating that at temperatures
above $T \approx 150$ MeV quarks and gluons become free and form a quark-gluon plasma. The lattice
results are supported qualitatively and quantitatively by phenomenological models [18]. In fact
the evidence supports the picture of two phase transitions: the quark-gluon plasma or
confining-deconfining phase transition in which the quarks and gluons become confined into hadrons,
and the chiral phase transition that leads to the low energy description in terms of pions. The low
energy limit of QCD is dominated by the lightest up and down quarks $u, d$ with masses
$m_u \approx 5$ MeV; $m_d \approx 7-10$ MeV. These mass scales are much smaller than the natural
scale of QCD, $\Lambda_{QCD} \approx 100$ MeV, at which QCD becomes strongly coupled. In the limit
of vanishing up and down quark masses, the
QCD Hamiltonian possesses a global chiral symmetry corresponding to rotating independently the
right and left handed components of the spinors that describe the quark fields. This symmetry is
$SU(2)_L \otimes SU(2)_R$ and in the low energy world is spontaneously broken down to $SU(2)_{L+R}$, with
the charged and neutral pion isospin triplet being the (quasi) Goldstone bosons associated with
the breakdown of this symmetry. The small mass of the pions ($\approx 135$ MeV) on the hadronic scale
is a result of the small mass of the up and down quarks on the QCD scale, which breaks explicitly
chiral symmetry. This is the chiral phase transition. The lattice results seem to indicate that the
two transitions, deconfining and chiral symmetry breaking, are very close in temperature and may
actually happen at the same temperature [18].

Whereas the deconfining phase transition does not seem to be characterized by a natural order
parameter, the chiral transition is described by the non-vanishing of the chiral condensate
$\langle \bar{q} q \rangle$ with $q = (u, d)$ and $\langle \cdots \rangle$ referring to the vacuum expectation value or the thermodynamic
ensemble average at finite temperature. Although these transitions took place when the
Universe was at a temperature of 150 MeV, about $10^{-6}$ seconds after the Big Bang, the Relativistic
Heavy Ion Collider (RHIC) at Brookhaven, to start operation at the end of 1999, and the Large
Hadron Collider at CERN (around 2004) will probe these transitions by colliding heavy ions.

RHIC will accelerate and collide from protons up to 250 GeV and ions of up to the heaviest
nuclei, with collision energies of about 100 GeV per nucleon for Au nuclei. The phenomenon of
nuclear transparency observed in nucleon-nucleon collisions leads to the conclusion that about half
the energy of the collision is carried away by the nuclei and about half the energy is deposited
in the ‘central region’ of the collision. Most of the baryons are carried by the receding nuclei,
leaving this central region almost baryon free. Estimates of the energy density in this region give
$\epsilon \approx 3-5$ GeV/fm$^3$, corresponding to temperatures $T \approx 200$ MeV. Immediately after the collision,
hard scattering of quarks and gluons dominates the dynamics; the gluons have mean-free paths
estimated to be of order 0.5 fm and the quarks 1-2 fm (the difference is mainly due to color
factors), hence after a time of the order of about 1 fm/c the plasma is thermalized.

The next stage of the dynamics is described by Bjorken’s hydrodynamic picture [19]. When the
plasma has achieved local thermodynamic equilibrium and for wavelengths longer than the mean
free paths, the plasma can be described as a strongly coupled fluid and a hydrodynamic description
is suitable. The essential ingredients in a hydrodynamic description are: i) the fluid is described
by a local four velocity vector $u^\mu = \gamma(1, \vec{v})$; $u^\mu u_\mu = 1$, the energy momentum tensor is that of a
homogeneous and isotropic fluid

$$ T^{\mu\nu} = (\epsilon + P) \, u^\mu u^\nu - P \, g^{\mu\nu} $$

with $\epsilon$, $P$ the energy density and pressure respectively. The dynamics is then completely
determined by conservation laws: i) baryon number, ii) energy momentum, and by local thermodynamic
equilibrium relations. The resulting picture of this hydrodynamic evolution is that the plasma
expands and cools adiabatically and the temperature drops in time with the following law

$$ T(\tau) = T_0 \left( \frac{\tau_0}{\tau} \right)^{c_s^2} $$

with $c_s$ the adiabatic sound speed, $T_0 > 200$ MeV and $\tau_0 \approx 1$ fm/c. For a radiation dominated fluid
$c_s^2 = 1/3$.
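
As a rough numerical illustration of this cooling law, the following minimal sketch assumes the power-law form $T(\tau) = T_0 (\tau_0/\tau)^{c_s^2}$ with the illustrative values $T_0 = 200$ MeV, $\tau_0 = 1$ fm/c and $c_s^2 = 1/3$ quoted in the text, and asks when the plasma would cool to a critical temperature of 150 MeV. The function names are hypothetical and the numbers are order-of-magnitude inputs, not fitted values.

```python
def bjorken_temperature(tau, T0=200.0, tau0=1.0, cs2=1.0 / 3.0):
    """Temperature in MeV at proper time tau (fm/c): T = T0 * (tau0 / tau)**cs2."""
    return T0 * (tau0 / tau) ** cs2


def time_to_cool(T, T0=200.0, tau0=1.0, cs2=1.0 / 3.0):
    """Proper time in fm/c at which the plasma has cooled to temperature T (MeV)."""
    return tau0 * (T0 / T) ** (1.0 / cs2)


# Illustrative inputs: T0 = 200 MeV, tau0 = 1 fm/c, Tc = 150 MeV (assumed values).
tau_c = time_to_cool(150.0)
print(f"tau_c = {tau_c:.2f} fm/c")
```

With these inputs $\tau_c = (200/150)^3 \, \tau_0 = 64/27 \approx 2.4$ fm/c, so under these assumptions the cooling time is only a few fm/c.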

As the critical temperature for the chiral phase transition is reached from above the
long-wavelength fluctuations of the chiral order parameter are expected to become critically slowed
down. If this is the case the chiral phase transition can occur in a quenched manner and strongly out
of equilibrium. Under these circumstances, Wilczek and Rajagopal argued that large domains in
which the chiral order parameter is disoriented with respect to the vacuum could appear [20].
These domains are coherent pion condensates that form after the non-equilibrium phase transition
much in the same manner as the correlated domains in condensed matter systems. These pion
condensates decay: the neutral pions decay into two photons and the charged pions decay into
muons. The pions can then be reconstructed and therefore these disoriented chiral condensates
could lead to experimentally observable anomalies in the ratio of the number of neutral to charged
pions. In isospin symmetric condensates the probability for finding a ratio $R$ of neutral to total
(neutral plus charged) pions is $P(R) \propto \delta(R - 1/3)$ (for large number of pions), whereas a disoriented
chiral condensate leads to $P(R) \propto 1/\sqrt{R}$ [21].
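
The $1/\sqrt{R}$ law can be checked with a small Monte Carlo sketch. It rests on an assumption that is standard but not spelled out in the text: the condensate points in a direction distributed uniformly on the isospin sphere, and the neutral fraction is $R = \cos^2\theta$, with $\theta$ the angle to the neutral-pion axis. Under that assumption the normalized distribution is $P(R) = 1/(2\sqrt{R})$, so the probability of finding $R < x$ is $\sqrt{x}$. The function name and sample size are illustrative choices.

```python
import random


def sample_neutral_fraction(rng):
    """Neutral pion fraction R = cos^2(theta) for a random isospin orientation.

    A direction uniform on the sphere has cos(theta) uniform in [-1, 1].
    """
    cos_theta = rng.uniform(-1.0, 1.0)
    return cos_theta ** 2


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [sample_neutral_fraction(rng) for _ in range(50_000)]

# Fraction of events with almost no neutral pions; analytically sqrt(0.01) = 0.1.
frac_small = sum(r < 0.01 for r in samples) / len(samples)
# The mean is still 1/3, as in the isospin symmetric case.
mean_R = sum(samples) / len(samples)
print(frac_small, mean_R)
```

The mean neutral fraction stays at 1/3, but roughly one event in ten has $R < 0.01$, which would be exceedingly rare for a distribution sharply peaked at $R = 1/3$.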

The possibility of formation of disoriented chiral condensates had been previously conjectured
by Bjorken [21] as a potential explanation of CENTAURO events [22], cosmic ray events
with anomalous neutral to charged pion ratios. This possibility of a distinct signature associated
with the chiral phase transition sparked an intense theoretical effort [23, 24]. Several experimental
searches are trying to find evidence for these pion condensates or disoriented chiral condensates: at
CERN-SPS (experiment WA98 [25]) and at the Tevatron at Fermilab (Minimax experiment [26]); the
PHENIX detector at RHIC [27] at BNL can provide an event-by-event analysis of these potential
candidates, and the ALICE experiment scheduled at CERN-LHC includes the detector CASTOR [28]
that will be studying CENTAURO type events.

These disoriented chiral condensates are coherent pion domains and describe the same type of
phenomenon as the ferromagnetic domains in quenched ferromagnets or the superfluid domains in
He superfluids described previously. If these condensates are realized during a non-equilibrium
stage of the chiral phase transition, they could lead to important experimental probes of this
transition and hopefully will be amenable to detection at RHIC and LHC.

There is an important difference in the dynamics of the chiral phase transition in the Early 
Universe and at Ultrarelativistic Heavy Ion Colliders. In the Early Universe, the chiral phase 
transition occurred at a temperature of 150 MeV when the Universe was about $10^{-5}$ seconds old in
the radiation dominated era. The size of the Universe at that time was about 10 Km which is 
much larger than the mean-free path of quarks and gluons, $10^{-15}$ m, and the time scale for cooling
$T/\dot{T} \approx 10^{-6}$ secs is much longer than the relaxation time scale $\tau_{rel} \approx 10^{-22}$ secs. Therefore in the
Early Universe the confining and chiral phase transitions occurred in equilibrium. These time and
length scales must be compared to those in heavy ion collisions: the time scale for cooling from
hydrodynamic expansion is a few fm/c and the relaxation time scale near phase transitions could be
longer and comparable to the lifetime of the quark-gluon plasma. Furthermore, current numerical
estimates indicate that the region in which the QGP is formed is about 20 fm in size. Hence there is
a possibility that these phase transitions could be out of equilibrium in heavy ion collisions and
that novel phenomena associated with the process of phase ordering and the emergence of pion
condensates could be important experimental signatures of the chiral transition.

1.3 Early Universe Cosmology:

The COBE satellite mission revolutionized the field of Cosmology. The discovery of temperature 
fluctuations in the Cosmic Microwave Background (CMB) of 30 $\mu$K imprinted on a blackbody
spectrum at 2.7 K provides supporting evidence for the main ideas that seek to explain the small
inhomogeneities that gave rise to large scale structure formation[3, 4]. 

Current theoretical ideas maintain that the universe underwent a period of accelerated expansion
called inflation at an energy scale determined by Grand Unified Theories, $\approx 10^{16}$ GeV. During this
period the size of the universe grew by a factor allowing this inflationary scenario to solve the 
main difficulties of the standard Big Bang Cosmology [3, 4]. Small quantum fluctuations that were 
present during this epoch of inflation soon became causally disconnected and therefore unaffected 
by microphysical processes. These fluctuations came into causal contact again at a much later
stage of the cosmological evolution, when the Universe was basically dominated by matter. Small
fluctuations begin to grow under gravitational instability when they become causally connected
again but after the epoch of radiation-matter equality at a temperature of about 10 eV and redshift
$z \approx 10^4$ [3]. The COBE observations are sensitive to those fluctuations that have established causal
contact again after the epoch of recombination, about 300000 years after the Big Bang at a redshift
of about $z \approx 1100$. Therefore observations of the CMB allow one to obtain information on the spectrum
of primordial quantum fluctuations that were present shortly after the Big Bang.

The first scenario for inflation relied on a supercooled phase transition [3]. Recent detailed
studies of the dynamics of phase transitions in early universe cosmology [29] allow a reliable
calculation of the dynamics including backreaction effects on the metric and a self-consistent evolution
of classical gravity and quantum fields. This approach allows one to extract the power spectrum of
the primordial perturbations of the metric arising from the quantum fluctuations in the matter
fields. These fluctuations are directly related to those of the temperature of the CMB at the
scale of recombination and correspond to the Sachs-Wolfe plateau in the power spectrum measured
by COBE [3, 4]. It is found [29] that the growth of correlated domains after a supercooled phase
transition of second order (no metastability) favors a power spectrum with more power on long
wavelengths [29] as a consequence of the process of phase ordering. This is a consequence of the
instabilities associated with the early stages of the phase ordering dynamics. This power
spectrum is consistent with the results of COBE for the temperature anisotropies, provided that
the couplings of the matter field are fine tuned [29].

After inflation the universe underwent several phase transitions. As in all phase transitions,
fluctuations grow large and may have induced density fluctuations leaving imprints in the CMB
as well as acting as seeds of structure formation. How important these effects are for observable
quantities is still under investigation [30].

Most works about phase transitions after inflation made the assumption that defects govern such
transitions. There is a very important difference between the fluctuations in the inflationary and
the defects scenarios. In inflation, the quantum fluctuations become causally disconnected [3, 4] and
therefore their evolution is very simple until they become causally connected again, because these
fluctuations are not influenced by microphysical processes during the period of acausal evolution.
Contrary to this dynamics, defects are always causal and are constantly influenced by microphysical
processes. Their evolution must be followed dynamically from the time at which a network of
defects is formed, at a GUT scale, all the way up to the time scale at which they result in
the formation of large scale structure, several billion years later! Obviously this is an enormous
dynamical range; however, detailed computer simulations reveal that a scaling solution emerges (for
details see [2]) determined by a dynamical length scale. The results of numerical studies suggest that
this dynamical length scale is completely determined by the size of the causal horizon at a given
time (see the later section on Cosmology for details on causal horizons). However, recent works on
defects give a clear indication that current models of defects contradict CMB measurements [31].

The emergence of this length scale through the dynamical process of phase ordering is exactly
the same as has been previously discussed within the context of condensed matter systems.
Current ground based and balloon borne experiments along with large scale surveys and future
satellite missions will provide a flood of data that will support or falsify current theoretical ideas on
large scale structure formation and temperature anisotropies. Thus an important theoretical effort
goes into providing reliable predictions on the power spectrum of primordial quantum fluctuations.
It is a tantalizing possibility that these cosmological observations could provide definite evidence
for cosmological phase transitions.

1.4 ...Therefore 

We have seen in detail that the dynamics of phase ordering and evolution after non-equilibrium
phase transitions are of fundamental importance in a wide range of energies, from meV in Condensed
Matter, through GeV in the physics of the Quark Gluon Plasma and the Chiral Phase
Transition, all the way to GUT’s ($10^{16}$ GeV) in Early Universe Cosmology. An important technical
aspect in the study of these phenomena is their non-perturbative nature: in a rapid phase transition
(typically of second order without metastable states) small amplitude long-wavelength fluctuations
become unstable (this will be understood in detail below) and grow in time. The amplitude of these
fluctuations must grow until they begin to sample the broken symmetry states of thermodynamic
equilibrium.

Having discussed in some detail the importance of the dynamics of symmetry breaking phase
transitions out of equilibrium within important settings and their experimental study, we now
provide some of the technical aspects that help clarify the phenomena and their quantitative study.
We begin by describing a phenomenological approach to phase ordering kinetics in condensed
matter systems, highlighting the important ingredients and concepts. We then move on to furnish
a quantitative approach to the study of the non-equilibrium dynamics in quantum field theory to
compare some striking similarities to condensed matter and also to contrast some important and
relevant differences. The main point of delving into some technical details is to emphasize many
robust features of the dynamics of symmetry breaking and phase ordering:

• The early stages of phase ordering are determined by linear (spinodal) instabilities. Long- 
wavelength fluctuations become unstable and grow. 

• The emergence of a dynamical length scale. This scale represents the average size of the or- 
dered domains and grows in time, eventually at asymptotically long times becoming macro- 
scopically large. 

• Associated with this dynamical length scale there is dynamical scaling: asymptotically this
length scale determines the behavior of correlation functions.

• Coarsening: the growth in time of this correlation length means that the peak of
the power spectrum moves towards longer wavelengths, resulting in a sharp ‘Bragg peak’ at
asymptotically long times. This Bragg peak reflects the onset of condensates corresponding
to ordered regions of macroscopic size.

As we will see in detail, the phenomenological description in condensed matter systems is very 
different from the microscopic description in quantum field theory. Nevertheless we find that 
despite these important differences the above features are fairly robust and common to all of the 
situations studied. Only an excursion into the technical details can reveal in full force these very 
important and remarkable features. 

2 Phenomenology of phase ordering dynamics in Condensed 
Matter: 

The phenomenological description of phase ordering kinetics begins with a coarse grained local
free energy functional of a (coarse grained) local order parameter $M(\vec{r})$ [7, 8] which determines
the equilibrium states. In Ising-like systems this $M(\vec{r})$ is the local magnetization (averaged over
many lattice sites), in binary fluids or alloys it is the local concentration difference, in
superconductors it is the local gap, in superfluids it is the condensate fraction, etc. The typical
free energy is (phenomenologically) of the Landau-Ginzburg form:

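$$ F[M] = \int d^3\vec{r} \left\{ \frac{1}{2} [\nabla M(\vec{r})]^2 + V[M(\vec{r})] \right\} $$

$$ V[M] = \frac{1}{2} r(T) M^2 + \frac{\lambda}{4} M^4 \quad ; \quad r(T) = r_0 (T - T_c) \qquad (1) $$

Fig. 1 depicts $V[M]$ for $T > T_c$ and $T < T_c$. The equilibrium states for $T < T_c$ correspond to
the broken symmetry states with $M = \pm M_0(T)$ with

$$ M_0(T) = \begin{cases} 0 & \text{for } T > T_c \\ \sqrt{\frac{r_0}{\lambda}} \, (T_c - T)^{1/2} & \text{for } T < T_c \end{cases} \qquad (2) $$

Below the critical temperature the potential $V[M]$ features a non-convex region with
$\partial^2 V[M] / \partial M^2 < 0$ for

$$ -M_s(T) < M < M_s(T) \quad ; \quad M_s(T) = \sqrt{\frac{r_0}{3\lambda}} \, (T_c - T)^{1/2} \quad (T < T_c) \qquad (3) $$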

this region is called the spinodal region and corresponds to thermodynamically unstable states. 

The lines Ms{T) vs. T and Mo(T) vs. T [see eq.(2)] are known as the classical spinodal and 
coexistence lines respectively. Fig. 2 displays the classical spinodal and coexistence curves for the 
potential V[M] in (1). 
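
The coexistence and spinodal values can be checked numerically with a short sketch. It assumes the quartic Landau-Ginzburg form $V[M] = \frac{1}{2} r_0 (T - T_c) M^2 + \frac{\lambda}{4} M^4$, for which the minima sit at $M_0 = \sqrt{r_0 (T_c - T)/\lambda}$ and the curvature $\partial^2 V / \partial M^2 = r_0 (T - T_c) + 3 \lambda M^2$ changes sign at $M_s = M_0/\sqrt{3}$. The parameter values $T_c = r_0 = \lambda = 1$ and the function names are arbitrary illustrative choices.

```python
import math


def potential(M, T, Tc=1.0, r0=1.0, lam=1.0):
    """Landau-Ginzburg potential V[M] = r(T) M^2 / 2 + lam M^4 / 4, r(T) = r0 (T - Tc)."""
    return 0.5 * r0 * (T - Tc) * M ** 2 + 0.25 * lam * M ** 4


def curvature(M, T, Tc=1.0, r0=1.0, lam=1.0):
    """Second derivative of V with respect to M; negative inside the spinodal region."""
    return r0 * (T - Tc) + 3.0 * lam * M ** 2


def coexistence_M0(T, Tc=1.0, r0=1.0, lam=1.0):
    """Equilibrium order parameter: zero above Tc, sqrt(r0 (Tc - T) / lam) below."""
    return 0.0 if T >= Tc else math.sqrt(r0 * (Tc - T) / lam)


def spinodal_Ms(T, Tc=1.0, r0=1.0, lam=1.0):
    """Boundary of the unstable region, where the curvature of V vanishes."""
    return 0.0 if T >= Tc else math.sqrt(r0 * (Tc - T) / (3.0 * lam))


T = 0.5  # a temperature below Tc = 1 (illustrative)
M0, Ms = coexistence_M0(T), spinodal_Ms(T)
print(M0, Ms, curvature(0.0, T), curvature(M0, T))
```

States with $|M| < M_s$ have negative curvature (unstable), while those with $M_s < |M| < M_0$ have positive curvature but lie above the minimum, which is the mean-field metastable band between the two curves.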

The states between the spinodal and coexistence lines are metastable (in mean-field theory).
As the system is cooled below $T_c$ into the unstable region inside the spinodal, the equilibrium state
of the system is a coexistence of phases separated by domains and the concentration of phases is
determined by the Maxwell construction and the lever rule.

Question: How to describe the dynamics of the phase transition and the process of phase
separation?

Answer: A phenomenological but experimentally successful description, Time Dependent
Ginzburg-Landau theory (TDGL), where the basic ingredient is Langevin dynamics [7]-[10]


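$$ \frac{\partial M(\vec{r},t)}{\partial t} = -\Gamma[\vec{r}, M] \, \frac{\delta F[M]}{\delta M(\vec{r},t)} + \eta(\vec{r},t) \qquad (4) $$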


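with $\eta(\vec{r},t)$ a stochastic noise term, which is typically assumed to be white (uncorrelated) and
Gaussian and obeying the fluctuation-dissipation theorem:

$$ \langle \eta(\vec{r},t) \, \eta(\vec{r}\,',t') \rangle = 2 \, T \, \Gamma[\vec{r}] \, \delta^3(\vec{r} - \vec{r}\,') \, \delta(t - t') \quad ; \quad \langle \eta(\vec{r},t) \rangle = 0 \qquad (5) $$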

the averages $\langle \cdots \rangle$ are over the Gaussian distribution function of the noise. There are two important
cases to distinguish: NCOP: Non-conserved order parameter, with $\Gamma = \Gamma_0$ a constant independent
of space, time and order parameter, and which can be absorbed in a rescaling of time; COP:
Conserved order parameter with

$$ \Gamma[\vec{r}] = -\Gamma_0 \nabla^2_{\vec{r}} $$

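where $\Gamma_0$ could depend on the order parameter, but here we will restrict the discussion to the case
where it is a constant. In this latter case the average over the noise of the Langevin equation can
be written as a conservation law

$$ \frac{\partial M}{\partial t} = -\nabla \cdot \vec{J} + \eta \ \Rightarrow \ \frac{\partial}{\partial t} \left\langle \int d^3\vec{r} \, M(\vec{r},t) \right\rangle = 0 $$

$$ \vec{J} = \nabla_{\vec{r}} \left[ -\Gamma_0 \frac{\delta F[M]}{\delta M} \right] \equiv \nabla_{\vec{r}} \, \mu \qquad (6) $$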


where $\mu$ is recognized as the chemical potential. Examples of the NCOP are the magnetization
in ferromagnets, the gap in superconductors and the condensate density in superfluids (the total
particle number is conserved but not the condensate fraction); of the COP: the concentration
difference in binary fluids or alloys. For a quench from $T > T_c$ deep into the low temperature phase
$T \to 0$ the thermal fluctuations are suppressed after the quench and the noise term is irrelevant.
In this situation of experimental relevance of a deep quench the dynamics is now described by a
deterministic equation of motion,
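for NCOP:

$$ \frac{\partial M}{\partial t} = -\Gamma_0 \frac{\delta F[M]}{\delta M} \qquad (7) $$

for COP:

$$ \frac{\partial M}{\partial t} = \nabla^2 \left[ \Gamma_0 \frac{\delta F[M]}{\delta M} \right] \qquad (8) $$

which is known as the Cahn-Hilliard equation [7, 8]. In both cases the equations of motion are
purely diffusive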


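$$ \frac{dF}{dt} = \int d^3\vec{r} \, \frac{\delta F}{\delta M(\vec{r},t)} \frac{\partial M(\vec{r},t)}{\partial t} = \begin{cases} -\Gamma_0 \int d^3\vec{r} \left( \frac{\delta F}{\delta M} \right)^2 & \text{NCOP} \\ -\Gamma_0 \int d^3\vec{r} \left( \nabla \frac{\delta F}{\delta M} \right)^2 & \text{COP} \end{cases} \qquad (9) $$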


and in both cases $dF/dt < 0$. Thus, the free energy is always diminishing and there is no possibility
of increasing it. In particular, overbarrier thermal activation cannot be described in the absence
of thermal noise, which is clear since thermal activation is mediated by large thermal fluctuations. 
The fact that this phenomenological description is purely dissipative with an ever diminishing free 
energy is one of the fundamental differences with the quantum field theory description studied in 
the next sections. 
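The monotonic decrease of the free energy expressed by eq. (9) can be checked on a small lattice. The following is a minimal sketch and not the authors' calculation: it assumes a one-dimensional periodic lattice with unit spacing, a quartic potential $V[M] = \frac{r}{2}M^2 + \frac{\lambda}{4}M^4$ with the illustrative values $r=-1$, $\lambda=1$, $\Gamma_0=1$, an explicit Euler step, and hypothetical function names.

```python
import random


def free_energy(m, r=-1.0, lam=1.0):
    """Lattice Landau-Ginzburg free energy (unit spacing, periodic boundaries)."""
    n = len(m)
    total = 0.0
    for i in range(n):
        grad = m[(i + 1) % n] - m[i]
        total += 0.5 * grad * grad + 0.5 * r * m[i] ** 2 + 0.25 * lam * m[i] ** 4
    return total


def ncop_step(m, dt=0.05, gamma0=1.0, r=-1.0, lam=1.0):
    """One explicit Euler step of dM/dt = -Gamma_0 dF/dM with the noise dropped."""
    n = len(m)
    new = []
    for i in range(n):
        # m[i - 1] wraps around for i = 0, giving periodic boundaries
        lap = m[(i + 1) % n] - 2.0 * m[i] + m[i - 1]
        dF_dM = -lap + r * m[i] + lam * m[i] ** 3
        new.append(m[i] - dt * gamma0 * dF_dM)
    return new


def relax(n_sites=64, n_steps=400, seed=0):
    """Quench from a small random configuration and record F after every step."""
    rng = random.Random(seed)
    m = [0.1 * (2.0 * rng.random() - 1.0) for _ in range(n_sites)]
    energies = [free_energy(m)]
    for _ in range(n_steps):
        m = ncop_step(m)
        energies.append(free_energy(m))
    return energies


if __name__ == "__main__":
    history = relax()
    print("F(initial) =", history[0], " F(final) =", history[-1])
```

The recorded free energies never increase from one step to the next, which is the lattice counterpart of the statement that thermal activation over a barrier cannot occur without the noise term.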



Symmetry Breaking out of Equilibrium 




2.1 Critical slowing down in NCOP: 

Critical slowing down of long-wavelength fluctuations is built into the TDGL description. Consider the case of NCOP and linearize the TDGL equation above the critical temperature for small amplitude fluctuations near $M = 0$. Neglecting the noise term for the moment and taking the Fourier transform of the small amplitude fluctuations we find

$$\frac{d m_{\vec k}(t)}{dt} \approx -\Gamma_0\left[k^2 + r_0(T-T_c)\right] m_{\vec k}(t) \tag{10}$$

showing that long-wavelength small amplitude fluctuations relax to equilibrium $m_{\vec k} = 0$ on a time scale given by

$$\tau_k \propto \left[k^2 + r_0(T-T_c)\right]^{-1} \tag{11}$$

As $T \to T_c^+$ the long-wavelength modes are critically slowed down and relax to equilibrium on very
long time scales. Therefore a TDGL description leads to the conclusion that if the cooling rate is 
finite, the long-wavelength modes will fall out of LTE and become quenched. As the temperature 
falls below the critical, these modes will become unstable and will grow exponentially. 
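A short numerical illustration of eq. (11) makes the slowing down concrete. This is a sketch under stated assumptions: the proportionality constant is taken to be $1/\Gamma_0$, and the values $T_c = 1$, $r_0 = 1$, $\Gamma_0 = 1$ and the function name are illustrative choices, not taken from the text.

```python
def relaxation_time(k, temperature, t_c=1.0, r0=1.0, gamma0=1.0):
    """tau_k = 1 / (Gamma_0 [k^2 + r_0 (T - T_c)]), meaningful only for T > T_c."""
    return 1.0 / (gamma0 * (k * k + r0 * (temperature - t_c)))


if __name__ == "__main__":
    # Approach T_c from above and compare a homogeneous, a long and a short wavelength mode.
    for temperature in (2.0, 1.1, 1.01, 1.001):
        taus = [relaxation_time(k, temperature) for k in (0.0, 0.1, 1.0)]
        print(temperature, taus)
```

The $k = 0$ relaxation time grows without bound as $T \to T_c^+$, while a mode with $k$ of order one hardly notices the approach to the critical point.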


2.2 Linear instability analysis: 

Let us consider now the situation for T << Tc and neglect the thermal noise. The early time 
evolution after the quench is obtained by linearizing the TDGL equation around a homogeneous 
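mean field solution $M_0(t)$. Writing

$$M(\vec r,t) = M_0(t) + \frac{1}{\sqrt{\Omega}}\sum_{\vec k\neq 0} m_{\vec k}(t)\,e^{i\vec k\cdot\vec r} \tag{12}$$

where $\Omega$ is the volume of the system, and considering only the linear term in the fluctuations $m_{\vec k}(t)$, the linearized dynamics is the following. COP: for $M_0(t)$ the conservation gives

$$\frac{dM_0(t)}{dt} = 0$$

since $M_0$ is the volume integral of the order parameter [see eq.(6)] and for the fluctuations we obtain

$$\frac{dm_{\vec k}(t)}{dt} = \omega(k)\,m_{\vec k}(t) \;;\quad \omega(k) = -\Gamma_0\,k^2\left[k^2 + \left.\frac{\partial^2 V[M]}{\partial M^2}\right|_{M_0}\right] \tag{13}$$

In the spinodal region $\left.\frac{\partial^2 V[M]}{\partial M^2}\right|_{M_0} < 0$ there is a band of unstable wave vectors $k^2 < \left|\left.\frac{\partial^2 V[M]}{\partial M^2}\right|_{M_0}\right|$ for which the frequencies are positive and the fluctuations away from the mean field grow exponentially.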
NCOP: separate the $\vec k \neq 0$ from the $\vec k = 0$ in the linearized equation of motion:

$$\frac{dM_0(t)}{dt} = -\Gamma_0\left.\frac{\partial V[M]}{\partial M}\right|_{M_0(t)} \tag{14}$$

$$\frac{dm_{\vec k}(t)}{dt} = -\Gamma_0\left[k^2 + \left.\frac{\partial^2 V[M]}{\partial M^2}\right|_{M_0(t)}\right] m_{\vec k}(t) \tag{15}$$

Whereas the first equation (14) determines that $M_0(t)$ rolls down the potential hill towards the equilibrium solution, the second equation also displays the linear instabilities for the same band of wave vectors as in the COP in the spinodal region $|M_0(t)| < M_s(T)$ [see eq. (3)] for which the fluctuations grow exponentially in time. Thus in the linearized approximation for both NCOP and the COP the spinodal instabilities are manifest as exponentially growing fluctuations. These instabilities are the hallmark of the process of phase separation and are the early time indications of the formation and growth of correlated regions which will be understood in an exactly solvable
example below. 
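The growth rates in eqs. (13) and (15) can be tabulated directly. The sketch below assumes the quartic potential $V[M] = \frac{r}{2}M^2 + \frac{\lambda}{4}M^4$ with illustrative values $r = -1$, $\lambda = 1$, $\Gamma_0 = 1$, so that $\partial^2 V/\partial M^2 = r + 3\lambda M_0^2$; the function names are hypothetical.

```python
import math


def curvature(m0, r=-1.0, lam=1.0):
    """Second derivative of the assumed quartic potential at the mean field M_0."""
    return r + 3.0 * lam * m0 * m0


def omega_ncop(k, m0, gamma0=1.0, r=-1.0, lam=1.0):
    """Growth rate of eq. (15): positive values mean exponential growth."""
    return -gamma0 * (k * k + curvature(m0, r, lam))


def omega_cop(k, m0, gamma0=1.0, r=-1.0, lam=1.0):
    """Growth rate of eq. (13): the extra k^2 protects the conserved k = 0 mode."""
    return -gamma0 * k * k * (k * k + curvature(m0, r, lam))


def unstable_band_edge(m0, r=-1.0, lam=1.0):
    """Largest unstable wave number, or 0.0 outside the spinodal region."""
    c = curvature(m0, r, lam)
    return math.sqrt(-c) if c < 0.0 else 0.0


if __name__ == "__main__":
    for k in (0.0, 0.5, math.sqrt(0.5), 0.9, 1.1):
        print(k, omega_ncop(k, 0.0), omega_cop(k, 0.0))
```

With these values the unstable band at $M_0 = 0$ is $k < 1$ in both cases. In the NCOP case the fastest growing mode is $k = 0$, while in the COP case the $k = 0$ mode does not grow at all and the fastest growth occurs at $k^2 = 1/2$.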





Boyanovsky and de Vega


2.3 The scaling hypothesis: dynamical length scales for ordering 

The process of ordering is described by the system developing ordered regions or domains that are separated by walls or other types of defects. The experimental probe to study the domain structure and the emergence of long range correlations is the equal time pair correlation function

$$C(\vec r,t) = \langle M(\vec r,t)\,M(\vec 0,t)\rangle \tag{16}$$

where $\langle\cdots\rangle$ stands for the statistical ensemble average in the initial state (or average over the noise in the initial state before the quench) and will become clear(er) below. It is convenient to expand the order parameter in Fourier components

$$M(\vec r,t) = \frac{1}{\sqrt{\Omega}}\sum_{\vec k} m_{\vec k}(t)\,e^{i\vec k\cdot\vec r}$$

and to consider the spatial Fourier transform of the pair correlation function

$$S(\vec k,t) = \langle m_{\vec k}(t)\,m_{-\vec k}(t)\rangle \tag{17}$$


known as the structure factor or power spectrum which is experimentally measured by neutron (in ferromagnets) or light scattering (in binary fluids)[12]. The scaling hypothesis introduces a dynamical length scale $L(t)$ that describes the typical scale of a correlated region and proposes that

$$C(\vec r,t) = f\!\left(\frac{|\vec r|}{L(t)}\right) \;;\quad S(\vec k,t) = L^d(t)\,g(kL(t)) \tag{18}$$

where $d$ is the spatial dimensionality and $f$ and $g$ are scaling functions. Ultimately scaling is confirmed by experiments and numerical simulations and theoretically it emerges from a renormalization group approach to dynamical critical phenomena which provides a calculational framework to extract the scaling functions and the deviations from scaling behavior[7]. This scaling hypothesis describes the process of phase ordering as the formation of ordered 'domains' or correlated regions of typical spatial size $L(t)$. For NCOP typical growth laws are $L(t) \approx t^{1/2}$ (with some systems showing weak logarithmic corrections) and $L(t) \approx t^{1/3}$ for scalar and $\approx t^{1/4}$ for vector order parameter in the COP case[7, 9, 10].


2.4 An exactly solvable (and relevant) example: the Large N limit 

We consider the case where the order parameter has $N$ components and transforms as a vector under rotations in an N-dimensional Euclidean space, i.e. $\vec M(\vec r,t) = (M_1(\vec r,t), M_2(\vec r,t), \cdots, M_N(\vec r,t))$.

For $N = 1$ an example is the Ising model, for $N = 2$ superfluids or superconductors (where the components are the real and imaginary part of the condensate fraction or the complex gap respectively), $N = 3$ is the spin one Heisenberg antiferromagnet, etc. For $N = 1$ the topological defects are domain walls (topological in one spatial dimension), for $N = 2$ they are vortices in $d = 2$ and vortex lines in $d = 3$, for $N = d = 3$ the topological defects are monopoles or skyrmions which are possible excitations in Quantum Hall systems and also appear in nematic liquid crystals [2]. For $N \to \infty$ and fixed $d$ no topological defects exist. However the exact solution of the large N model gives insight and is in fairly good agreement with growth laws for fixed $N$ systems which had been studied experimentally and numerically [7, 10]. In cosmological space-times it has been implemented to study the collapse of texture-like configurations[1, 32, 33] (see later). In quantum field theory the non-equilibrium dynamics of phase transitions has been studied in Minkowski and cosmological space-times [24, 34, 35, 36, 37, 38]. The large N limit is an exactly solvable limit that serves as a testing ground for establishing the fundamental concepts and that can be systematically improved in a consistent $1/N$ expansion. It provides a consistent formulation which is non-perturbative, renormalizable and numerically implementable and has recently been invoked in novel studies of non-equilibrium dynamics in quantum spin glasses and disordered systems [39].

The exact solution for the dynamics in the large N limit, being available both in the condensed matter TDGL description of phase ordering kinetics and in Quantum Field Theory in Minkowski and cosmological space-times, allows us to compare directly the physics of phase ordering in these situations. Thus we begin by implementing this scheme in the NCOP case for the TDGL description.

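What is the $\langle\cdots\rangle$ in the equations of the previous section? Consider that before the quench the system is in equilibrium in the disordered phase at $T \gg T_c$ and with a very short correlation length ($\xi(T) \approx 1/T$). The ensemble average in this initial state is therefore

$$\langle M^i(\vec r,0)\,M^j(\vec r\,',0)\rangle = \Delta\,\delta^{ij}\,\delta^3(\vec r-\vec r\,') \;;\quad \langle M^i(\vec r,0)\rangle = 0 \tag{19}$$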

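where $\Delta$ specifies the initial correlation. Now consider a critical quench where the system is rapidly cooled through the phase transition to almost zero temperature but in the absence of explicit symmetry breaking fields (for example a magnetic field). The average of the order parameter will remain zero through the process of spinodal decomposition and phase ordering. During the initial stages, linear instabilities will grow exponentially with $m^i_{\vec k}(t) \approx m^i_{\vec k}(0)\,e^{\omega(k)t}$; $\omega(k) = \Gamma_0\left[r(0) - k^2\right]$ for $k^2 < r(0)$ (with $r(0) > 0$ the magnitude of the quadratic coefficient of the potential at zero temperature) and at early times

$$\langle m^i_{\vec k}(t)\,m^j_{-\vec k}(t)\rangle \approx \Delta\,\delta^{ij}\,e^{2\omega(k)t} \tag{20}$$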

hence fluctuations begin to grow exponentially and eventually will sample the broken symmetry states and the exponential growth must shut off. The large N limit is implemented by writing the potential term in the free energy as

$$V[\vec M] = -\frac{r(0)}{2}\,\vec M^2 + \frac{\lambda}{4N}\,(\vec M^2)^2 \tag{21}$$

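where $\lambda$ is kept finite in the large N limit. We will focus on the NCOP case with a quench to zero temperature and rescale the order parameter, time and space as

$$\vec M = \sqrt{\frac{N\,r(0)}{\lambda}}\;\vec\eta \;;\quad r(0)\,\Gamma_0\,t = \tau \;;\quad \sqrt{r(0)}\;\vec x = \vec z \tag{22}$$

after which the evolution equation for the NCOP case becomes

$$\frac{\partial\vec\eta}{\partial\tau} = \nabla^2\vec\eta + \vec\eta - (\vec\eta^{\,2})\,\vec\eta \tag{23}$$

where derivatives are now with respect to the rescaled variables. The large N limit is solved by implementing a Hartree-like factorization [7]

$$\vec\eta^{\,2} \to \langle\vec\eta^{\,2}\rangle = N\,\langle\eta_i^2\rangle \quad \text{(no sum over } i\text{)} \tag{24}$$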

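Then for each component the NCOP equation becomes

$$\frac{\partial\eta_i}{\partial\tau} = \left[\nabla^2 + \mathcal{M}^2(\tau)\right]\eta_i \tag{25}$$

$$\mathcal{M}^2(\tau) = 1 - \langle\vec\eta^{\,2}(\tau)\rangle \tag{26}$$

Eq. (26) is a self-consistent condition that must be solved simultaneously with the equation of motion for the components. Thus the large N approximation linearizes the problem at the expense of a self-consistent condition. The solution for each component is

$$\eta_{i,\vec k}(\tau) = \eta_{i,\vec k}(0)\,e^{-k^2\tau + b(\tau)} \;;\quad b(\tau) = \int_0^\tau \mathcal{M}^2(\tau')\,d\tau' \tag{27}$$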

Consider for a moment that the $\vec k = 0$ mode is slightly displaced at the initial time, then it will roll down the potential hill to a final equilibrium position for which $\mathcal{M}^2(\infty)\,\eta_i(\infty) = 0$ (so the time derivative vanishes in equilibrium). If $\eta_i(\infty) \neq 0$ is a broken symmetry minimum of the free energy, then $\mathcal{M}^2(\tau) \to 0$ when $\tau \to \infty$. This is the statement of Goldstone's theorem that guarantees that the perpendicular fluctuations are soft modes. This asymptotic limit allows the solution of the self-consistent condition

$$\frac{db(\tau)}{d\tau} = 1 - \langle\vec\eta^{\,2}(\tau)\rangle = 1 - \Delta\,e^{2b(\tau)}\int\frac{d^3k}{(2\pi)^3}\,e^{-2k^2\tau} = 1 - \frac{\Delta\,e^{2b(\tau)}}{(8\pi\tau)^{3/2}} \tag{28}$$

The vanishing of the right hand side in the asymptotic time regime leads to the self-consistent solution

$$b(\tau) \to \frac{3}{4}\ln\left[\frac{\tau}{\tau_0}\right] \;\Rightarrow\; \mathcal{M}^2(\tau) \to \frac{3}{4\tau} \tag{29}$$

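where $\tau_0$ is a constant related to $\Delta$. This self-consistent solution results in the following asymptotic behavior

$$\eta_{i,\vec k}(\tau) \to \eta_{i,\vec k}(0)\left(\frac{\tau}{\tau_0}\right)^{3/4} e^{-k^2\tau} \tag{30}$$

Introducing the dynamical length scale $L(\tau) = \tau^{1/2}$ it is straightforward to find the structure factor and the pair correlation function

$$S(\vec k,\tau) \propto L^3(\tau)\,e^{-2(kL(\tau))^2} \tag{31}$$

$$C(\vec r,\tau) \propto e^{-\frac{r^2}{8L^2(\tau)}} \;;\quad L(\tau) = \tau^{1/2} \tag{32}$$

This behavior should not be interpreted as diffusion, because of the $L^3(\tau)$ in eqn. (31) which is a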
result of the self-consistent condition. 
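The approach of the effective squared mass to $3/(4\tau)$ can be checked by integrating the self-consistent condition $db/d\tau = 1 - \Delta\,e^{2b}/(8\pi\tau)^{3/2}$ numerically. The sketch below is an illustration under assumptions that are not in the text: the integration starts at $\tau = 1$ with $b = 0$ (a stand-in for a short-distance cutoff), $\Delta = 0.01$, and a fourth-order Runge-Kutta step with hypothetical function names.

```python
import math


def rhs(tau, b, delta):
    """Right hand side of the self-consistent condition; this is also M^2(tau)."""
    return 1.0 - delta * math.exp(2.0 * b) / (8.0 * math.pi * tau) ** 1.5


def integrate_b(delta=0.01, tau_start=1.0, tau_end=100.0, dt=0.01):
    """Integrate db/dtau with RK4 and return (tau, b, M^2) at the final time."""
    steps = int(round((tau_end - tau_start) / dt))
    tau, b = tau_start, 0.0
    for _ in range(steps):
        k1 = rhs(tau, b, delta)
        k2 = rhs(tau + 0.5 * dt, b + 0.5 * dt * k1, delta)
        k3 = rhs(tau + 0.5 * dt, b + 0.5 * dt * k2, delta)
        k4 = rhs(tau + dt, b + dt * k3, delta)
        b += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        tau += dt
    return tau, b, rhs(tau, b, delta)


if __name__ == "__main__":
    tau, b, m2 = integrate_b()
    print("tau * M^2 =", tau * m2, "(asymptotic value 3/4)")
    print("b =", b, " log estimate =", 0.75 * math.log(8.0 * math.pi * tau) - 0.5 * math.log(0.01))
```

After an initial stage in which $b$ grows linearly (the exponential growth of the fluctuations), the solution locks onto the logarithmic form of $b(\tau)$ and the product $\tau\,\mathcal{M}^2(\tau)$ settles close to $3/4$.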

Important Features: 


• The 'effective squared mass' $\mathcal{M}^2(\tau) \to 0$: asymptotically there are massless excitations identified as Goldstone bosons.

• Since $\mathcal{M}^2(\tau) \to 0$ asymptotically, the self-consistent condition results in $\langle\vec M^2\rangle \to N\,r(0)/\lambda$, i.e. the fluctuations sample the broken symmetry states, which are equilibrium minima of the free energy. These fluctuations begin to grow exponentially at early times due to spinodal instabilities.

• A dynamical correlation length emerges $L(t) = t^{1/2}$ which determines the size of the correlated regions or 'domains'. A scaling solution emerges asymptotically with the natural scale determined by the size of the ordered regions. These regions grow with this law until they become macroscopically large. Although this is a result obtained in the large N limit, similar growth laws had been found for NCOP both analytically and numerically for $N = 1$ etc.[7]

• Coarsening: The expression for the structure factor (31) shows that at large times only the very small wavevectors contribute to $S(\vec k,t)$, however the self-consistency condition forces $\int d^3k\,S(\vec k,t) \to$ constant, thus asymptotically $S(\vec k,t)$ is peaked at wavevectors $k \approx L^{-1}(t)$ with an amplitude $L^3(t)$, thus becoming a delta function $S(\vec k,t) \to \delta^3(\vec k)$. The position of the peak in $S(\vec k,t)$ moving towards longer wavelength is the phenomenon of coarsening and is observed via light scattering. At long times a zero momentum condensate is formed[10] and a Bragg peak develops at zero momentum; this condensate however grows as a power of time and only becomes macroscopic at asymptotically large times. Coarsening is one of the experimental hallmarks of the process of phase ordering, revealed for example in light scattering [12] and is found numerically in many systems[7]. Thus the large N limit, although not being able to describe topological defects, offers a very good description of the ordering dynamics.
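The statement that the structure factor narrows and grows while its integral stays fixed can be illustrated with the scaling form of eq. (31). The sketch assumes the proportionality constant in (31) is one, uses a plain midpoint rule, and introduces hypothetical function names.

```python
import math


def structure_factor(k, length):
    """Scaling form of eq. (31) with unit prefactor (an assumption)."""
    return length ** 3 * math.exp(-2.0 * (k * length) ** 2)


def total_weight(length, n=4000):
    """Midpoint-rule estimate of the integral of S over all wave vectors in d = 3."""
    k_max = 10.0 / length  # the integrand is negligible beyond this point
    dk = k_max / n
    total = 0.0
    for i in range(n):
        k = (i + 0.5) * dk
        total += 4.0 * math.pi * k * k * structure_factor(k, length) * dk
    return total


if __name__ == "__main__":
    for length in (1.0, 5.0, 25.0):
        print(length, structure_factor(0.0, length), total_weight(length))
```

The peak height grows as $L^3$ and the width shrinks as $1/L$, while the integrated weight stays at $(\pi/2)^{3/2}$ for every $L$, which is how the delta function limit arises.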





3 Phase ordering in Quantum Field Theory I: Minkowski space-time


3.1 A quench in Q.F.T. 

Although the phenomenological Time Dependent Landau Ginzburg theory is a successful description of phase ordering kinetics in condensed matter systems, there is no first-principles derivation of these equations of motion from a microscopic theory. Whereas microscopic descriptions based on either classical or quantum Hamiltonians lead to time reversal invariant equations of motion, the TDGL equations are first order in the time derivative and therefore purely dissipative.

A first principles, microscopic description of a quantum theory must begin with the Heisenberg 
equations of motion for operators or the Schroedinger or quantum Liouville equations for the quan- 
tum states or density matrix that describes the system. In this section we provide an introduction 
to the treatment of strongly out of equilibrium situations, in particular that of a ‘quench’ in a 
quantum field theory system. 

This is the situation studied in[40] for the dynamics of formation and evolution of disoriented 
chiral condensates during the chiral phase transition. 

The dynamics is completely determined by the microscopic field theoretical Hamiltonian. For 
a simple scalar theory the Hamiltonian operator is given by 


$$H = \int d^3x\left\{\frac{1}{2}\Pi^2(\vec x,t) + \frac{1}{2}\left[\vec\nabla\Phi(\vec x,t)\right]^2 + V[\Phi(\vec x,t)]\right\} \tag{33}$$

where $\Phi$ is the quantum mechanical field and $\Pi$ its canonical momentum. We want to describe a quenched scenario where the initial state of the system for $t < 0$ is the ground state (or density matrix, see later) of a Hamiltonian for which the potential is convex for all values of the field, for example that of a harmonic oscillator, in which case the wave function(al) is a Gaussian centered at the origin. At $t = 0$ the potential is changed so that for $t > 0$ it allows for broken symmetry states. This can be achieved for example by the following form

$$V[\Phi] = \frac{m^2(t)}{2}\,\Phi^2 + \frac{\lambda}{4}\,\Phi^4 \tag{34}$$

$$m^2(t) = \begin{cases} +m_0^2 > 0 & \text{for } t < 0 \\ -m_0^2 < 0 & \text{for } t > 0 \end{cases} \tag{35}$$

thus the potential in Fig. 1 changes suddenly from $T > T_c$ to $T < T_c$. Although in Minkowski space-time this is an ad-hoc choice of a time dependent potential that mimics the quench[41], we will see in the next section that in a cosmological setting the mass term naturally depends on time through the temperature dependence and that it changes sign below the critical temperature as the Universe cools off. Most of the results obtained in Minkowski space-time will translate onto analogous results in a Friedmann-Robertson-Walker cosmology. Unlike the phenomenological (but successful) description of the dynamics in condensed matter systems, in a microscopic quantum theory the dynamics is completely determined by the Schrödinger equation for the time evolution of the wave function or alternatively the Liouville equation for the evolution of the density matrix in the case of mixed states. We will cast our study in terms of a density matrix in general; such a density matrix could describe pure or mixed states and obeys the quantum Liouville equation

$$i\,\frac{\partial\hat\rho(t)}{\partial t} = \left[H(t),\hat\rho(t)\right] \tag{36}$$

Question: How does the wave function(al) or the density matrix evolve after a quench? 


3.2 A simple quantum mechanical picture: 

In order to gain insight into the above question, let us consider a simple case of one quantum mechanical degree of freedom $q$ where the quench is described in terms of a harmonic oscillator with a time dependent frequency $\omega^2(t) = -\epsilon(t)\,\omega_0^2$; $\omega_0^2 > 0$ with $\epsilon(t)$ the sign function, so that $\omega^2(t < 0) > 0$; $\omega^2(t > 0) < 0$. Furthermore let us focus on the evolution of a pure state (the density matrix is simply the product of the wave function and its complex conjugate). Consider that at $t < 0$ the wave function corresponds to the ground state of the (upright) harmonic oscillator. For $t > 0$ the wave function obeys

$$i\,\frac{\partial\Psi[q,t]}{\partial t} = \left[-\frac{1}{2}\frac{d^2}{dq^2} - \frac{\omega_0^2}{2}\,q^2\right]\Psi[q,t] \tag{37}$$


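Since the initial wave function is a Gaussian and under time evolution with a quadratic Hamiltonian Gaussians remain Gaussians, the solution of this Schrödinger equation is given by

$$\Psi[q,t] = \mathcal{N}(t)\,e^{-\frac{A(t)}{2}q^2} \;;\quad \frac{d\ln\mathcal{N}(t)}{dt} = -\frac{i}{2}\,A(t) \tag{38}$$

$$i\,\frac{dA}{dt} = A^2 + \omega_0^2 \tag{39}$$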

Separating the real and imaginary parts of $A(t)$ it is straightforward to find that $|\mathcal{N}(t)|^2/\sqrt{\mathrm{Re}[A(t)]}$ (the norm of the Gaussian state) is constant, a consequence of unitary time evolution. Eq. (39) can be cast in a more familiar form by a simple substitution

$$A(t) = -i\,\frac{\dot\phi^*(t)}{\phi^*(t)} \;\Rightarrow\; \ddot\phi(t) - \omega_0^2\,\phi(t) = 0 \tag{40}$$

where the equation for $\phi$ was obtained by inserting the above expression for $A(t)$ in (39). The solution is $\phi(t) = a\,e^{\omega_0 t} + b\,e^{-\omega_0 t}$ featuring exponential growth. This is the quantum mechanical analog of the spinodal instabilities described in the previous section. The equal time two-point function is given by

$$\langle q^2\rangle(t) = \frac{1}{2\,\mathrm{Re}[A(t)]} = |\phi(t)|^2 \propto e^{2\omega_0 t} \tag{41}$$

The width of the Gaussian state increases in time (while the amplitude decreases to maintain a 
constant norm) and the quantum fluctuations grow exponentially. As the Gaussian wave function 
spreads out the probability for finding configurations with large amplitude of the coordinates 
increases. This is the quantum mechanical translation of the linear spinodal instabilities. When
the non-linear contributions to the quantum mechanical potential are included the single particle 
quantum mechanical wave function will simply develop two peaks and eventually re-collapse by 
focusing near the origin undergoing oscillatory motion between ‘collapses’ and ‘revivals’. In the 
case of a full quantum field theory there are infinitely many degrees of freedom and the energy 
is transferred between many modes. This simple quantum mechanical example paves the way for 
understanding in a simple manner the main features of a quench in the large N limit in quantum 
field theory, to which we now turn our attention. 
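The exponential spreading of the Gaussian can be reproduced by integrating the equation for the kernel, $i\,dA/dt = A^2 + \omega_0^2$, directly. The sketch below assumes $\omega_0 = 1$, the initial value $A(0) = \omega_0$ (ground state of the upright oscillator), a fourth-order Runge-Kutta step and hypothetical function names. For this initial condition the width can also be worked out in closed form, $\langle q^2\rangle(t) = \cosh(2\omega_0 t)/(2\omega_0)$, which serves as a check.

```python
import math


def riccati_rhs(a, omega0):
    """dA/dt obtained from i dA/dt = A^2 + omega0^2."""
    return -1j * (a * a + omega0 * omega0)


def width_after_quench(omega0=1.0, t_end=3.0, dt=1e-3):
    """Return <q^2>(t_end) = 1 / (2 Re A) for the inverted oscillator."""
    a = complex(omega0, 0.0)  # ground state of the upright oscillator at t = 0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = riccati_rhs(a, omega0)
        k2 = riccati_rhs(a + 0.5 * dt * k1, omega0)
        k3 = riccati_rhs(a + 0.5 * dt * k2, omega0)
        k4 = riccati_rhs(a + dt * k3, omega0)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return 1.0 / (2.0 * a.real)


if __name__ == "__main__":
    for t_end in (0.5, 1.0, 2.0, 3.0):
        print(t_end, width_after_quench(t_end=t_end), math.cosh(2.0 * t_end) / 2.0)
```

The real part of $A$ decays as $\omega_0/\cosh(2\omega_0 t)$ while its modulus stays fixed, so the state spreads exponentially without losing its norm.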


3.3 Back to the original question: Large N in Q.F.T. 

We now consider the large N limit of a full Q.F.T. in which

$$\vec\Phi(\vec x,t) = \left(\Phi_1(\vec x,t),\Phi_2(\vec x,t),\cdots,\Phi_N(\vec x,t)\right) \tag{42}$$

and similarly for the canonical momenta $\vec\Pi$. The Hamiltonian operator is of the form (33) with

$$V[\vec\Phi] = \frac{1}{2}\,m^2(t)\,\vec\Phi\cdot\vec\Phi + \frac{\lambda}{8N}\,(\vec\Phi\cdot\vec\Phi)^2 \tag{43}$$

with $m^2(t)$ given by (35). Let us focus on the case in which the initial state is pure and symmetric, i.e. $\langle\vec\Phi\rangle = 0$, with $\langle\cdots\rangle$ being the expectation value in this initial state. The more complicated case of a mixed state, described by a density matrix, is studied in detail in[35, 36, 37] and the main features are the same as those revealed by the simpler scenario of a pure state. The large N limit is implemented in a similar manner as in the TDGL example, via a Hartree-like factorization

$$(\vec\Phi\cdot\vec\Phi)^2 \to 2\,\langle\vec\Phi\cdot\vec\Phi\rangle\;\vec\Phi\cdot\vec\Phi \tag{44}$$

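where the expectation value is in the time evolved quantum state (in the Schrödinger picture) or in the initial state of the Heisenberg operators (in the Heisenberg picture). Via this factorization the Hamiltonian becomes quadratic at the expense of a self-consistent condition as will be seen below. It is convenient to introduce the spatial Fourier transform of the fields as

$$\vec\Phi(\vec x,t) = \frac{1}{\sqrt{\Omega}}\sum_{\vec k}\vec\Phi_{\vec k}(t)\,e^{i\vec k\cdot\vec x} \tag{45}$$

with $\Omega$ the spatial volume, and a similar expansion for the canonical momentum $\vec\Pi(\vec x,t)$. The Hamiltonian becomes

$$H = \sum_{\vec k}\left\{\frac{1}{2}\,\vec\Pi_{\vec k}\cdot\vec\Pi_{-\vec k} + \frac{1}{2}\,W_k^2(t)\,\vec\Phi_{\vec k}\cdot\vec\Phi_{-\vec k}\right\} \tag{46}$$

$$W_k^2(t) = m^2(t) + k^2 + \frac{\lambda}{2N}\,\langle\vec\Phi^2\rangle(t) \tag{47}$$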

The problem now has decoupled in a set of infinitely many harmonic oscillators, that are only 
coupled through the self-consistent condition in the frequencies (47). To induce a quench, the time 
dependent mass term has the form proposed in eq. (35). 

Just as in the simple quantum mechanical case, we consider the initial state to be a Gaussian 
centered at the origin in field space, which is the ground state of the (upright) harmonic oscillators 
for t <0. Since a Gaussian is always a Gaussian under time evolution with a quadratic Hamiltonian, 
we propose the wave function(al) that describes the (pure) quantum mechanical state to be given 
by 

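$$\Psi[\vec\Phi,t] = \prod_{\vec k}\left\{\mathcal{N}_k(t)\,\exp\left[-\frac{A_k(t)}{2}\,\vec\Phi_{\vec k}\cdot\vec\Phi_{-\vec k}\right]\right\} \;;\quad A_k(t=0) = W_k(t<0) \tag{48}$$

Time evolution of this wavefunction(al) is determined by the Schrödinger equation: in the Schrödinger representation the canonical momentum becomes a differential (functional) operator, $\vec\Pi_{\vec k} \to -i\,\delta/\delta\vec\Phi_{-\vec k}$, and the Schrödinger equation becomes a functional differential equation. Comparing the powers of $\vec\Phi_{\vec k}$ in this differential equation, one obtains the following evolution equations for $\mathcal{N}_k(t)$ and $A_k(t)$

$$i\,\frac{d\ln\mathcal{N}_k(t)}{dt} = \frac{N}{2}\,A_k(t) \tag{49}$$

$$i\,\frac{dA_k(t)}{dt} = A_k^2(t) - W_k^2(t) \tag{50}$$

As in the single particle case, the norm of the state, which ties $|\mathcal{N}_k(t)|$ to $\mathrm{Re}[A_k(t)]$, is constant as a consequence of unitary time evolution. The non-linear equation for the kernel $A_k(t)$ can be simplified just as in the single particle case by writing

$$A_k(t) = -i\,\frac{\dot\phi_k^*(t)}{\phi_k^*(t)} \;\Rightarrow\; \ddot\phi_k(t) + W_k^2(t)\,\phi_k(t) = 0 \tag{51}$$

and taking the expectation value of $\vec\Phi_{\vec k}\cdot\vec\Phi_{-\vec k}$ in this state we obtain

$$\langle\vec\Phi_{\vec k}\cdot\vec\Phi_{-\vec k}\rangle(t) = N\,|\phi_k(t)|^2 \tag{52}$$

Hence we find a self-consistent condition much like the one obtained in the large N limit for TDGL. The equations for the mode functions and the self-consistent condition for $t > 0$ are therefore given by

$$\ddot\phi_k(t) + \left[k^2 + \mathcal{M}^2(t)\right]\phi_k(t) = 0 \tag{53}$$

$$\mathcal{M}^2(t) = -m_0^2 + \frac{\lambda}{2}\int\frac{d^3k}{(2\pi)^3}\,|\phi_k(t)|^2 \tag{54}$$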



where the integral in the self-consistent term in (54) is simply $\langle\vec\Phi^2\rangle/N$. There are two fundamental differences between the quantum dynamics determined by the equations of motion and the classical dissipative dynamics of the TDGL phenomenological description given in section 2:

• The equations of motion and the self-consistency condition, equations (53)-(54), lead immediately to the conservation of energy[34, 35].

• The evolution equations are time reversal invariant.

These properties must be contrasted to the purely dissipative evolution dictated by the TDGL equations as is clear from eq. (9). Consider a very weakly coupled theory $\lambda \ll 1$ and very early times, then the self-consistent term can be neglected and we see that for $k^2 < m_0^2$ the modes grow exponentially. This instability again is the manifestation of spinodal growth[42, 43, 44, 35, 36]. Since the mode functions grow exponentially, fairly soon, at a time scale $t_s \approx m_0^{-1}\ln(1/\lambda)$, the self-consistent term begins to cancel the negative mass squared and $\mathcal{M}^2(t)$ becomes smaller. We find numerically that this effective mass vanishes asymptotically, as shown in Fig. 3.

3.4 Emergence of condensates and classicality: 

The physical mechanism here is similar to that in the classical TDGL, but in terms of quantum fluctuations. The quantum fluctuations with wave vectors inside the spinodally unstable band grow exponentially; these make the $\langle\vec\Phi^2\rangle$ self-consistent field grow non-perturbatively large until $\langle\vec\Phi^2\rangle/N \approx m_0^2/\lambda$, when the self-consistent (mean) field begins to be of the same order as $m_0^2$ (the tree level mass term). At this point the fluctuations become non-perturbatively large and sample field configurations near the equilibrium minima of the potential. The spinodal instabilities are shutting off since the effective squared mass $\mathcal{M}^2(t)$ is vanishing.

When $\mathcal{M}^2(t)$ vanishes, the equations for the mode functions become those of a free massless field, with solutions of the form $\phi_k(t) = A_k\,e^{ikt} + B_k\,e^{-ikt}$ whereas for the $k = 0$ mode the solution must be of the form $\phi_0(t) = a + b\,t$ with $a, b \neq 0$ since the Wronskian of the mode function and its complex conjugate is a constant. This in turn determines that the low $k$ (long wavelength) behavior of the mode functions is given by

$$\phi_k(t) = a\,\cos kt + b\,\frac{\sin kt}{k} \tag{55}$$

This behavior at long wavelength has a remarkable consequence: at very long time the power spectrum $|\phi_k(t)|^2$, which is the equivalent of $S(k,t)$ for TDGL (see eq. (17)), is dominated by the small $k$-region, in particular $k \approx 1/t$, with an amplitude that grows quadratically with time. Then the structure factor $S(k,t) = |\phi_k(t)|^2$ features a peak that moves towards longer wavelengths at longer times and whose amplitude grows with time in such a way that asymptotically $\int_0^\infty k^2\,S(k,t)\,dk/2\pi^2 \to 2m_0^2/\lambda$ and the integral is dominated by a very small region in $k$ that gets narrower at longer times. This is the equivalent of coarsening in the TDGL solution in the large N limit, where the asymptotic time regime was dominated by the formation of a long-wavelength condensate. Fig. 4 shows the power spectrum at two (large) times displaying clearly the phenomenon of coarsening and the formation of a non-perturbative condensate.

The pair correlation function can now be calculated using this power spectrum [36]

$$C(\vec r,t) = \frac{1}{2\pi^2 r}\int_0^\infty k\,\sin(kr)\,|\phi_k(t)|^2\,dk \tag{56}$$

At long times and distances the integral is dominated by the very long wavelength modes, in particular by the term $\propto \sin[kt]/k$ of (55), hence the integral can be done analytically and we find

$$C(\vec r,t) = \frac{A}{r}\,\Theta(2t - r) \tag{57}$$

with $A$ a constant. This is a remarkable result: the correlation falls off as $1/r$ inside domains that grow at the speed of light. This correlation function is shown in Fig. 5 at several different (large) times. This correlation function is of the scaling form: introducing the dynamical length scale $L(t) = t$ it is clear that[36]

$$C(\vec r,t) \propto L^{-1}(t)\,f(r/L(t)) \;;\quad f(s) = \frac{\Theta(2-s)}{s} \tag{58}$$
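The step from (56) to (57) can be made explicit. As a sketch, assume that at late times only the term of (55) proportional to $\sin(kt)/k$ matters, so that $|\phi_k(t)|^2 \approx |b|^2\sin^2(kt)/k^2$, and use the standard result $\int_0^\infty \sin(\alpha k)/k\;dk = (\pi/2)\,\mathrm{sign}(\alpha)$; all overall constants are absorbed in $A$:

```latex
\begin{aligned}
\frac{1}{r}\int_0^\infty k\,\sin(kr)\,\frac{\sin^2(kt)}{k^2}\,dk
  &= \frac{1}{2r}\int_0^\infty \frac{\sin(kr)}{k}\,\bigl[1-\cos(2kt)\bigr]\,dk \\
  &= \frac{1}{2r}\left[\frac{\pi}{2}
     - \frac{1}{2}\int_0^\infty \frac{\sin\bigl(k(r+2t)\bigr)+\sin\bigl(k(r-2t)\bigr)}{k}\,dk\right] \\
  &= \frac{1}{2r}\left[\frac{\pi}{2}-\frac{\pi}{4}\bigl(1+\mathrm{sign}(r-2t)\bigr)\right]
   = \frac{\pi}{4r}\,\Theta(2t-r).
\end{aligned}
```

The sharp edge at $r = 2t$ arises because the two oscillating factors $\sin(kr)$ and $\cos(2kt)$ cancel exactly once $r$ exceeds $2t$.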

We interpret these 'domains' as being a non-perturbative condensate of Goldstone bosons, with a non-perturbatively large number of them $\propto 1/\lambda$, such that the mean square root fluctuation of the field samples the (non-perturbative) equilibrium minima of the potential. In particular an important conclusion of this analysis is that the long-wavelength modes acquire very large amplitudes and their phases vary slowly as a function of time (for $k \ll 1/t$); therefore these fluctuations, which began their evolution as being quantum mechanical, now have become classical.


3.5 Coherent Structures 

At this point our analysis begs this question. To understand the answer it is convenient to backtrack the analysis to the beginning. The initial quantum state is given by the wave-function(al) (48), thus the most probable field configurations found in this ensemble are those whose spatial Fourier transforms are given by

$$|\vec\Phi_{\vec k}| \sim \frac{1}{\sqrt{W_k(t<0)}} \tag{59}$$

(restoring $\hbar$ would multiply by $\sqrt{\hbar}$). Then typical long-wavelength field configurations that are represented in the quantum ensemble described by this initial wave-function(al) are of rather small amplitude. The initial correlations are also rather short ranged, on scales of order $m_0^{-1}$. Under time evolution the probability distribution is given by

$$\mathcal{P}[\vec\Phi,t] = |\Psi[\vec\Phi,t]|^2 = \prod_{\vec k}|\mathcal{N}_k(t)|^2\,\exp\left[-\frac{\vec\Phi_{\vec k}\cdot\vec\Phi_{-\vec k}}{2\,|\phi_k(t)|^2}\right] \tag{60}$$

At times longer than the regime dominated by the exponential growth of the spinodally unstable modes, the power spectrum $|\phi_k(t)|^2$ obtains the largest support for long wavelengths $k \ll m_0$ and with amplitudes $\approx m_0^2/\lambda$. Therefore field configurations with typical spatial Fourier transform $|\phi_k(t)|$ are very likely to be found in the ensemble. These field configurations are primarily made of long-wavelength modes and their amplitudes are non-perturbatively large, of the order of the amplitude of the fields in the broken symmetry minima. A typical such configuration can be written as

$$\vec\Phi(\vec x,t)_{typical} \approx \sum_{\vec k}|\phi_k(t)|\,\cos\left[\vec k\cdot\vec x + \delta_{\vec k}\right] \tag{61}$$

where the phases $\delta_{\vec k}$ are randomly distributed with a Gaussian probability distribution since the density matrix is Gaussian in this approximation. We note that a particular choice of these phases leads to a realization of a likely configuration in the ensemble that breaks translational invariance. In fact translations can be absorbed by a change in the phases, thus averaging over these random phases restores translational invariance. Since the quantum state (or density matrix) is translationally invariant, a particular spatial profile for a field configuration corresponds to a particular representative of the ensemble. Combining all of the above results together we can present the following consistent interpretation of the ordering process and the formation of coherent non-perturbative structures during the dynamics of symmetry breaking in the large N limit[36]:

• The early time evolution occurs via the exponential growth of spinodally unstable long wavelength modes. This unstable growth leads to a rapid growth of fluctuations $\langle\vec\Phi^2\rangle(t)$ which in turn increases the self-consistent contribution and tends to cancel the negative mass squared. The effective mass of the excitations $-m_0^2 + \frac{\lambda}{2N}\langle\vec\Phi^2\rangle(t) \to 0$ and the asymptotic excitations are Goldstone bosons.





• At times larger than the spinodal time $t_s \approx m_0^{-1}\ln(1/\lambda)$, the effective mass vanishes and the power spectrum or structure factor $S(k,t) = |\phi_k(t)|^2$ displays the features of coarsening: a peak that moves towards longer wavelengths and increases in amplitude, resulting in a long-wavelength condensate at asymptotically long times.

• For large time a dynamical correlation length emerges $L(t) = t$ and at long distances the pair correlation function is of the scaling form $C(r,t) \propto L^{-1}(t)\,f(r/L(t))$. The length scale $L(t)$ determines the size of the correlated regions and determines that these regions grow at
the speed of light. Inside these regions there is a non-perturbative condensate of Goldstone 
bosons with a typical amplitude of the order of the value of the homogeneous field at the 
equilibrium broken symmetry minima. 

The similarity between these results and those of the more phenomenological TDGL description in condensed matter systems is rather striking. The features that are determined by the structure of the quantum field theory are[36]: i) the scaling variable $s = r/t$ with equal powers of distance and time is a consequence of the Lorentz invariance of the underlying theory, ii) the fact that the pair correlation function vanishes for $r > 2t$ is manifestly a consequence of causality. An analysis of the correlations and defect density during the spinodal time scale has been performed in[45] and related recent studies had been performed in[46].


4 Phase ordering in Quantum Field Theory II: FRW Cosmology

4.1 Cosmology 101 (the basics): 

On large scales > 100 Mpc the Universe appears to be homogeneous and isotropic as revealed by the isotropy and homogeneity of the cosmic microwave background and some of the recent large scale surveys [1]. The cosmological principle leads to a simple form of the metric of space time, the Friedmann-Robertson-Walker (FRW) metric in terms of a scale factor that determines the Hubble flow and the curvature of spatial sections. Observations seem to favor a flat Universe for which the space time metric is rather simple:

$$ds^2 = dt^2 - a^2(t)\,d\vec x^{\,2} \tag{62}$$

The time and spatial variables $t, \vec x$ in the above metric are called comoving time and spatial distance respectively and have the interpretation of being the time and distances measured by an observer locally at rest with respect to the Hubble flow. At this point we must note that physical distances are given by $\vec l_{phys}(t) = a(t)\,\vec x$. An important concept is that of causal (particle) horizons: events that cannot be connected by a light signal are causally disconnected. Since light travels on null geodesics $ds^2 = 0$ the maximum physical distance that can be reached by a light signal at time $t$ is given by

$$d_H(t) = a(t)\int_0^t\frac{dt'}{a(t')} \tag{63}$$

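It will prove convenient to change coordinates to conformal time by defining a conformal time 
variable 

$$\eta = \int_0^t \frac{dt'}{a(t')}\,, \qquad C(\eta) \equiv a(t(\eta)) \qquad (64)$$

in terms of which the causal horizon is simply given by $d_H(\eta) = C(\eta)\, \eta$ and physical distances as 
$\vec{x}_{phys} = C(\eta)\, \vec{x}$. In these coordinates the metric, $ds^2 = C^2(\eta)\,(d\eta^2 - d\vec{x}^{\,2})$, is of the same form as that of Minkowski space time up to the overall factor $C^2(\eta)$. For energies well 
below the Planck scale $M_{Pl} \approx 10^{19}$ GeV gravitation is well described by classical General Relativity 
and the Einstein equations: 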


$$R^{\mu\nu} - \frac{1}{2}\, g^{\mu\nu} R = \frac{8\pi}{M_{Pl}^2}\, T^{\mu\nu} \qquad (65)$$





where we have been cavalier and set $c = 1$ (as well as $\hbar = 1$). $R^{\mu\nu}$ is the Ricci tensor, $R$ the Ricci 
scalar and $T^{\mu\nu}$ the matter field energy momentum tensor. The above equation is classical, but one 
seeks to understand the dynamics of the Early Universe in terms of a quantum field theory that 
describes particle physics. Thus the question: what exactly is the energy momentum tensor? In 
Einstein's equations it is a classical object, but in QFT it is an operator. The answer to this question 
is: gravity is classical, fields are quantum mechanical, but $T^{\mu\nu} \rightarrow \langle T^{\mu\nu} \rangle$, i.e. it is the expectation 
value of a quantum mechanical operator in a quantum mechanical state. This quantum mechanical 
state, either pure or mixed, is described by a wave function(al) or a density matrix whose time 
evolution is dictated by the quantum equations of motion: the Schrödinger equation for the wave 
functions or the quantum Liouville equation for a density matrix. Consistency with the postulate of 
homogeneity and isotropy requires that the expectation value of the energy momentum tensor must 
have the fluid form, which in the rest frame of the fluid is $\langle T^{\mu\nu} \rangle = \mathrm{diag}(\rho, p, p, p)$ 
with $\rho$ the energy density and $p$ the pressure. The time and spatial components of Einstein's 
equations lead to the Friedmann equations 


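$$\frac{\dot{a}^2(t)}{a^2(t)} = \frac{8\pi}{3 M_{Pl}^2}\, \rho(t) \qquad (66)$$

$$2\, \frac{\ddot{a}(t)}{a(t)} + \frac{\dot{a}^2(t)}{a^2(t)} = -\frac{8\pi}{M_{Pl}^2}\, p(t) \qquad (67)$$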


Combining these two equations one arrives at a simple and intuitive equation which is reminiscent 
of the first law of thermodynamics: 


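$$\frac{d}{dt}\left(\rho\, a^3(t)\right) = -p\, \frac{d a^3(t)}{dt} \;\Rightarrow\; \dot{\rho} + 3\, \frac{\dot{a}}{a}\, (\rho + p) = 0 \qquad (68)$$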

The alternative form shown on the right hand side of (68) is the covariant conservation of energy. 
Since the physical volume of space is $V_0\, a^3(t)$ (with $V_0$ the comoving volume) the above equation 
is recognized as $dU = -p\, dV$, which is the first law of thermodynamics for adiabatic processes. To 
close the set of equations and obtain the dynamics we need an equation of state $p = p(\rho)$: two 
very relevant cases are i) radiation dominated (RD) with $p = \rho/3$ and ii) matter dominated (MD) with 
$p = 0$ (dust) Universes. In our study we will focus on the RD case. The equation of state for RD 
is that for blackbody radiation, for which the entropy is $S = C\, V\, T^3$ (with $C$ a constant). Since 
$V(t) = V_0\, a^3(t)$ is the physical volume, the equation (68), which dictates adiabatic (isentropic) 
expansion, leads to a time dependence of the temperature: $T(t) = T_0/a(t)$. Now the cooling is 
done by the expansion of the Universe and a phase transition will occur when the Universe cools 
below the critical temperature for a given theory. For the GUT transition $T_c \approx 10^{16}$ GeV $\approx 10^{29}$ K, 
for the EW transition $T_c \approx 100$ GeV $\approx 10^{15}$ K. Returning now to the large N study of the 
dynamics of phase transitions, we can include the effect of cooling by the expansion of the Universe 
by replacing the time dependent mass term $m^2(t)$ in (43) by 


$$m^2(t) = m_0^2 \left[ \frac{T^2(t)}{T_c^2} - 1 \right] \qquad (69)$$


This form is consistent with the Landau-Ginzburg description including the time dependence of 
the temperature via the isentropic expansion of the Universe, but perhaps more importantly it 
can be proven in a detailed manner from the self-consistent renormalization of the mass in an 
expanding Universe[34]. Thus the large N limit in a RD FRW cosmology will be studied by using 
the potential (43) but with the time dependent mass given by (69). 
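
The statements of this subsection can be checked numerically. The following is only a minimal sketch under stated assumptions: units in which $8\pi/(3M_{Pl}^2) = 1$, initial data $a(0) = \rho(0) = 1$, a linear equation of state $p = w\rho$, and a hypothetical helper name `evolve_frw`; none of these choices come from the text. It integrates the Friedmann equation (66) together with the conservation law (68) and compares with the closed forms that follow from them, $\rho \propto a^{-4}$ and $a = \sqrt{1 + 2t}$ for RD, $\rho \propto a^{-3}$ and $a = (1 + 3t/2)^{2/3}$ for MD.

```python
import math


def evolve_frw(w, t_end=100.0, steps=20_000):
    """RK4 integration of eq. (66) together with the conservation law (68).

    Assumptions (not from the text): units with 8*pi/(3*M_Pl^2) = 1,
    initial data a(0) = rho(0) = 1, and equation of state p = w * rho.
    Returns the scale factor and energy density at time t_end.
    """
    def rhs(a, rho):
        hubble = math.sqrt(rho)                      # (66): (a_dot / a)^2 = rho
        return a * hubble, -3.0 * hubble * (1.0 + w) * rho  # (68) with p = w * rho

    dt = t_end / steps
    a, rho = 1.0, 1.0
    for _ in range(steps):
        k1 = rhs(a, rho)
        k2 = rhs(a + 0.5 * dt * k1[0], rho + 0.5 * dt * k1[1])
        k3 = rhs(a + 0.5 * dt * k2[0], rho + 0.5 * dt * k2[1])
        k4 = rhs(a + dt * k3[0], rho + dt * k3[1])
        a += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        rho += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return a, rho


if __name__ == "__main__":
    for label, w in (("RD", 1.0 / 3.0), ("MD", 0.0)):
        a, rho = evolve_frw(w)
        # rho * a^(3(1+w)) should stay equal to its initial value 1
        print(label, "a(t_end) =", a, " rho * a^(3(1+w)) =", rho * a ** (3.0 * (1.0 + w)))

    # For blackbody radiation rho ~ T^4, so T/T_0 = rho^(1/4) should equal 1/a
    a, rho = evolve_frw(1.0 / 3.0)
    print("RD: T/T_0 =", rho ** 0.25, " 1/a =", 1.0 / a)
```

In this sketch the temperature is read off from $\rho \propto T^4$, so the agreement of $\rho^{1/4}$ with $1/a$ is the numerical counterpart of $T(t) = T_0/a(t)$.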


4.2 Large N in Radiation (RD) and Matter (MD) dominated FRW Cosmology 

The large N limit is again implemented via the Hartree-like factorization (44). Performing the 
spatial Fourier transforms of the fields and their canonical momenta and including the proper scale 
factors, the Hamiltonian now becomes [34] 





H'lW = ^ +>"■'«) 

where now the expectation value is in terms of a density matrix $\rho[\Phi(\cdot), \tilde{\Phi}(\cdot); t]$, since we are considering 
the case of a thermal ensemble as the initial state. 

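We propose the following Gaussian ansatz for the functional density matrix elements in the 
Schrödinger representation [34] 

$$\rho[\Phi, \tilde{\Phi}, t] = \prod_{\vec{k}} \mathcal{N}_k(t)\, \exp\left\{ -\left[ \frac{A_k(t)}{2}\, \vec{\Phi}_{\vec{k}} \cdot \vec{\Phi}_{-\vec{k}} + \frac{A_k^*(t)}{2}\, \vec{\tilde{\Phi}}_{\vec{k}} \cdot \vec{\tilde{\Phi}}_{-\vec{k}} + B_k(t)\, \vec{\Phi}_{\vec{k}} \cdot \vec{\tilde{\Phi}}_{-\vec{k}} \right] \right\} \qquad (72)$$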

This form of the density matrix is dictated by the hermiticity condition $\rho^{\dagger}[\Phi, \tilde{\Phi}, t] = \rho^*[\tilde{\Phi}, \Phi, t]$; as 
a result of this, $B_k(t)$ is real. The kernel $B_k(t)$ determines the amount of mixing in the density 
matrix, since if $B_k = 0$ the density matrix corresponds to a pure state because it is a wave 
functional times its complex conjugate. The kernels $A_k(0)$, $B_k(0)$ are chosen such that the 
initial density matrix is thermal with a temperature $T_i > T_c$ [34]. Following the same steps as in 
Minkowski space time, the time evolution of this density matrix can be found in terms of a set of 
mode functions $\phi_k(t)$ that obey the following equations of motion and self-consistency condition 


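$$\ddot{\phi}_k(t) + 3\, \frac{\dot{a}}{a}\, \dot{\phi}_k(t) + \left[ \frac{k^2}{a^2(t)} + M^2(t) \right] \phi_k(t) = 0 \qquad (73)$$

$$M^2(t) = m_0^2 \left[ \frac{T_i^2}{T_c^2\, a^2(t)} - 1 \right] + \frac{\lambda}{2} \int \frac{d^3k}{(2\pi)^3}\, |\phi_k(t)|^2\, \coth\left[ \frac{W_k}{2 T_i} \right] \qquad (74)$$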


These equations can be cast in a more familiar form by changing coordinates to conformal time 
(see eq. (64)) and (conformally) rescaling the mode functions, $\phi_k(t) = f_k(\eta)/C(\eta)$, to obtain the 
following equations for the conformal time mode functions $f_k(\eta)$ in a FRW cosmology 


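$$f_k''(\eta) + \left[ k^2 + C^2(\eta)\, M^2(\eta) \right] f_k(\eta) = 0 \qquad (75)$$

$$M^2(\eta) = m_0^2 \left[ \frac{T_i^2}{T_c^2\, C^2(\eta)} - 1 \right] + \frac{\lambda}{2} \int \frac{d^3k}{(2\pi)^3}\, \frac{|f_k(\eta)|^2}{C^2(\eta)}\, \coth\left[ \frac{W_k}{2 T_i} \right] - \frac{C''(\eta)}{C^3(\eta)} \qquad (76)$$

where primes now refer to derivatives with respect to conformal time. For RD and MD FRW 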



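$$C(\eta) = 1 + \frac{\eta}{2} \quad \text{for RD} \qquad (77)$$

$$C(\eta) = \left( 1 + \frac{\eta}{4} \right)^2 \quad \text{for MD} \qquad (78)$$

$$C''(\eta) = 0 \ \ \text{for RD}\,; \qquad C''(\eta) = \frac{1}{8} \ \ \text{for MD} \qquad (79)$$

(in units of $m_0^{-1}$, which is the only dimensionful variable). The above equations of motion now 
have a form analogous to those in the case of Minkowski space-time. 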

As the temperature falls below the critical value the effective squared mass term becomes negative 
and spinodal instabilities trigger the process of phase ordering. As a result the quantum 
fluctuations quantified by $\langle \Phi^2 \rangle$ grow exponentially. These spinodal instabilities make the self-consistent 
field grow at early times and tend to overcome the negative sign of the squared mass, 
eventually reaching an asymptotic regime in which the total effective mass $M^2(\eta)$ vanishes. 

Again this behavior determines that the fluctuations are sampling the equilibrium broken syin- 
metry minima of the initial potential, i.e. (1^) 

Although, just as in Minkowski space-time, the effective mass vanishes asymptotically, the non-equilibrium 
evolution is rather different. We find numerically [38] that asymptotically the effective 
mass term behaves as 

$$C^2(\eta)\, M^2(\eta) \;\stackrel{\eta \to \infty}{\longrightarrow}\; -\frac{15}{4 \eta^2} \quad \text{for RD} \qquad (80)$$

$$C^2(\eta)\, M^2(\eta) \;\stackrel{\eta \to \infty}{\longrightarrow}\; -\frac{35}{4 \eta^2} \quad \text{for MD} \qquad (81)$$

Fig. 6 displays $C^2(\eta) M^2(\eta)$ as a function of conformal time for the case of $T_i/T_c = 1.1$ with 
$T_c \propto m_0/\sqrt{\lambda}$ [34, 44] for RD. 

We see that at very early time the mass is positive, reflecting the fact that the initial state is in 
equilibrium at an initial temperature larger than the critical. As time evolves the temperature is 
red-shifted and cools, and at some point the phase transition occurs, when the mass vanishes and 
becomes negative. 

Figure 7 displays 2 a^(^^)(^) vs. 77 in units of ttIq ^ for ^ = 3, ^ = 10"® for an R.D. Universe. 
Clearly at large times the non-equilibrium fluctuations probe the broken symmetry states. 

This particular asymptotic behavior of the mass determines that the mode functions $f_k(\eta)$ grow 
as $\eta^{5/2}$ for RD and as $\eta^{7/2}$ for MD for $k < 1/\eta$, and oscillate in the form $e^{\pm i k \eta}$ for $k > 1/\eta$. This behavior 
is confirmed numerically [38]. We find both analytically and numerically that asymptotically the 
mode functions are of the scaling form 

$$f_k(\eta) = A\, \eta^{5/2}\, \frac{J_2(k\eta)}{(k\eta)^2} \quad \text{for RD} \qquad (82)$$

$$f_k(\eta) = B\, \eta^{7/2}\, \frac{J_3(k\eta)}{(k\eta)^3} \quad \text{for MD} \qquad (83)$$

where $A$ and $B$ are numerical constants and $J_{2,3}(x)$ are Bessel functions. 
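
The link between the asymptotic mass term and the Bessel scaling forms can be verified directly. The sketch below assumes that the scaling forms read $f_k(\eta) \propto \eta^{\nu + 1/2} J_\nu(k\eta)/(k\eta)^\nu$ with $\nu = 2$ (RD) and $\nu = 3$ (MD), and that the asymptotic mode equation is $f_k'' + [k^2 - (\nu^2 - 1/4)/\eta^2] f_k = 0$, so that $\nu^2 - 1/4$ equals $15/4$ and $35/4$ respectively. The function names, the unit amplitude and the integral representation used for $J_\nu$ are illustrative choices, not taken from the text.

```python
import math


def bessel_j(n, x, samples=2000):
    """Integer-order Bessel function from the integral representation
    J_n(x) = (1/pi) * integral_0^pi cos(n*tau - x*sin(tau)) dtau,
    evaluated with the trapezoidal rule."""
    h = math.pi / samples
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoint values at tau = 0 and tau = pi
    for i in range(1, samples):
        tau = i * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def mode_function(k, eta, nu):
    """Assumed scaling form eta^(nu + 1/2) * J_nu(k*eta) / (k*eta)^nu, amplitude set to 1."""
    x = k * eta
    return eta ** (nu + 0.5) * bessel_j(nu, x) / x ** nu


def ode_residual(k, eta, nu, h=1e-3):
    """Relative residual of f'' + [k^2 - (nu^2 - 1/4)/eta^2] f = 0,
    with f'' approximated by a central finite difference."""
    f0 = mode_function(k, eta, nu)
    f_second = (mode_function(k, eta + h, nu) - 2.0 * f0
                + mode_function(k, eta - h, nu)) / h ** 2
    mass_term = -(nu ** 2 - 0.25) / eta ** 2  # -15/(4 eta^2) for nu = 2, -35/(4 eta^2) for nu = 3
    return abs(f_second + (k ** 2 + mass_term) * f0) / abs(f0)


if __name__ == "__main__":
    for label, nu in (("RD", 2), ("MD", 3)):
        print(label, "relative ODE residual:", ode_residual(1.0, 3.0, nu))
        # Long-wavelength limit: J_nu(x)/x^nu -> 1/(2^nu * nu!), so f grows like eta^(nu + 1/2)
        ratio = mode_function(0.05, 2.0, nu) / (2.0 ** (nu + 0.5) / (2 ** nu * math.factorial(nu)))
        print(label, "small-k ratio to eta^(nu+1/2)/(2^nu nu!):", ratio)
```

The small-$k$ check reflects the power-law growth of the long-wavelength modes, while for $k\eta \gg 1$ the Bessel functions oscillate, as described above.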

Figure 8 displays $\eta^{-5} |f_k(\eta)|^2$ as a function of the scaling variable $k\eta$, revealing the scaling 
behavior in RD; a similar behavior emerges for MD [38]. 

It is remarkable that this is exactly the same scaling solution found in the classical non-linear 
sigma model in the large N limit and that describes the collapse of textures[32], and also within 
the context of TDGL equations in the large N limit applied to cosmology [33]. 

The growth of the long-wavelength modes and the oscillatory behavior of the short wavelength 
modes again results in that the peak of the structure factor $S(k, \eta)$, which is determined by $|f_k(\eta)|^2$, 
moves towards longer wavelengths and the maximum amplitude increases. This is the equivalent 
of coarsening and the onset of a condensate. 

Although quantitatively different from Minkowski space time, the qualitative features are similar. 
Asymptotically the non-equilibrium dynamics results in the formation of a non-perturbative 
condensate of long-wavelength Goldstone bosons. We can now compute the pair correlation function 
$C(r, \eta)$ from the mode functions that solve (75) and find that it is cut off by causality at 
$r = 2\eta$. The correlation function computed with the mode functions in the asymptotic regime 
agrees perfectly with that computed from the asymptotic form given by (82). The correlation 
function is depicted in Fig. 9 for two different (conformal) times. 

The scaling form of the pair correlation function is 

$$C(r, \eta) \propto \eta^2\, \chi_{RD}(r/2\eta) \ \ \text{for RD}\,; \qquad C(r, \eta) \propto \eta^4\, \chi_{MD}(r/2\eta) \ \ \text{for MD},$$

where $\chi_{RD}(x)$ and $\chi_{MD}(x)$ are hump-shaped functions as shown in fig. 9. 

Clearly a dynamical length scale $L(\eta) = \eta$ emerges as a consequence of causality, much in the 
same manner as in Minkowski space time. The physical dynamical correlation length is therefore 
given by $\xi_{phys}(\eta) = C(\eta)\, L(\eta) = d_H(t)$, that is, the correlated domains grow again at the speed 
of light and their size is given by the causal horizon. The interpretation of this phenomenon 
is that within one causal horizon there is one correlated domain, inside which the root mean 
square fluctuation of the field is approximately the value of the equilibrium minima of the tree level 
potential; this is clearly consistent with Kibble's original observation [2]. Inside this domain there 
is a non-perturbative condensate of Goldstone bosons [38]. 

Thus we have seen that scaling, coarsening and the onset of condensates 
during the non-equilibrium dynamics are universal features of the process of 
phase ordering. The non-perturbative large N limit has allowed a clear comparison between the 
phenomenological description in condensed matter based on the TDGL, and the microscopic quantum 
field theoretical description in Minkowski and FRW space-times. 


5 Conclusions and looking ahead 

In these lectures we have discussed the multidisciplinary nature of the problem of phase ordering 
kinetics and non-equilibrium aspects of symmetry breaking. Main ideas from condensed matter 
were discussed and presented in a simple but hopefully illuminating framework and applied to 
the rather different realm of phase transitions in quantum field theory as needed to understand 
cosmology and particle physics. In particular we have emphasized robust features of the process 
of phase ordering kinetics: early stages dominated by spinodal instabilities and the growth of 
correlated regions, the emergence of a dynamical correlation length that determines the size of the 
correlated regions as a function of time, and dynamical scaling at long times. The phenomenon of 
coarsening is a result of this scaling behavior and is reflected in that the peak in the power spectrum 
moves towards longer wavelengths, and at asymptotically long times results in a 'Bragg peak' that 
signals the onset of macroscopic ordered phases and condensates. The study of condensed matter 
systems was in terms of the phenomenologically successful Time Dependent Landau Ginzburg theory, 
which is purely dissipative and for which there is no first principles derivation from a microscopic 
theory in general. 

We then passed on to the study of the dynamical evolution out of equilibrium in quantum 
field theories both in Minkowski and FRW space-times by providing a consistent non-perturbative 
framework to study the time evolution of an initially prepared density matrix. 

The large N approximation has provided a bridge that allows one to cross from one field to another 
and borrow many of the ideas that had been tested both theoretically and experimentally in condensed 
matter physics. There are, however, major differences between the condensed matter and 
particle physics-cosmology applications that require a very careful treatment of the quantum field 
theory that cannot be replaced by simple arguments. The large N approximation in field theory 
provides a robust, consistent non-perturbative framework that allows the study of phase ordering 
kinetics and dynamics of symmetry breaking in a controlled and consistently implementable framework: 
it is renormalizable, respects all symmetries and can be improved in a well defined manner. 
This scheme extracts cleanly the non-perturbative behavior and the quantum to classical transition, 
and allows one to quantify in a well defined manner the emergence of classical stochastic behavior arising 
from non-perturbative physics. The emergence of scaling and a dynamical correlation length, 
coarsening and the onset of non-perturbative condensates are robust features of the dynamics, and 
the Kibble-Zurek scenario describes fairly well the general features of the dynamics, although the 
details require careful study, both analytically and numerically. 

We have emphasized that this study has very definite potential experimental implications. In 
QCD, if the chiral phase transition occurs out of equilibrium in ultrarelativistic heavy ion collisions, 
this leads to the possibility of formation of disoriented chiral condensates that are described in the 
same manner as ordering domains in condensed matter. These condensates have a very distinct 
hallmark in that they lead to a very different ratio of neutral to charged pions; this property can 
be measured on an event by event basis with the detectors at the forthcoming heavy ion colliders. 

In cosmology the process of formation of ordered regions that grow after a rapid phase transition, 
the emergence of scaling and a dynamical length scale and coarsening of these domains lead to a 
definite prediction of a ‘red’ power spectrum on scales that have re-entered the causal horizon right 
after recombination. These are the scales that contribute to the temperature anisotropies measured 
by COBE and the forthcoming cosmological experiments. Therefore the study of the dynamics 
of symmetry breaking out of equilibrium in quantum field theory directly bears on experimental 
possibilities in a wide range of energies both in accelerator and cosmological experiments and is 
therefore an endeavour that must be pursued vigorously. 

Of course this is just the beginning; we expect a wealth of important phenomena to be revealed 
beyond the large N limit, such as the approach to equilibrium, the emergence of other time scales 
associated with a hydrodynamic description of the evolution at late times and a more careful 
understanding of the reheating process and its influence on cosmological observables. Although 
within very few years the wealth of observational data will provide a more clear picture of the 
cosmological fluctuations, it is clear that the program that pursues a fundamental understanding 
of the underlying physical mechanisms will continue seeking to provide a consistent microscopic 
description of the dynamics of particle physics and cosmological phase transitions. 


6 Acknowledgements: 

D. B. thanks T. Kibble, W. Zurek and R. Durrer for illuminating conversations, the N.S.F. for 
partial support through grant awards PHY-9605186 and INT-9815064, and LPTHE (University of 
Paris VI and VII) for warm hospitality. H. J. de Vega thanks the Dept. of Physics at the Univ. of 
Pittsburgh for hospitality. We thank NATO for partial support. 


References 

[1] For a comprehensive review of the status of theory and experiment see: Proceedings of the 
'D. Chalonge' School in Astrofundamental Physics at Erice, edited by N. Sanchez and A. 
Zichichi, 1996 World Scientific publisher and 1997, Kluwer Academic publishers. In particular 
the contributions by G. Smoot, A. N. Lasenby and A. Szalay. 

[2] T. W. B. Kibble, J. Phys. A 9, 1387 (1976). M. B. Hindmarsh and T.W.B. Kibble, Rep. Prog. 
Phys. 58:477 (1995). 

A. Vilenkin and E.P.S. Shellard, ‘Cosmic Strings and other Topological Defects’, Cambridge 
Monographs on Math. Phys. (Cambridge Univ. Press, 1994). 

[3] For thorough reviews of standard and inflationary cosmology see: E. W. Kolb and M. S. 
Turner, The Early Universe (Addison Wesley, Redwood City, C.A. 1990). A. Linde, Particle 
Physics and Inflationary Cosmology (Harwood Academic Pub. Switzerland, 1990). R. 
Brandenberger, Rev. of Mod. Phys. 57, 1 (1985); Int. J. Mod. Phys. A2, 77 (1987). 

[4] For more recent reviews see: M. S. Turner, astro-ph/9703197; astro-ph/9703196; 
astro-ph/9703174; astro-ph/9703161; astro-ph/9704062. 

[5] See for example: K. Rajagopal in 'Quark Gluon Plasma 2', (Ed. R. C. Hwa, World Scientific, 
1995). 

[6] For a thorough discussion of phase transitions see: N. Goldenfeld, 'Lectures on Phase Transitions 
and the Renormalization Group', (Addison-Wesley, 1992). 

[7] A. J. Bray, Adv. Phys. 43, 357 (1994). 

[8] J. S. Langer in 'Solids far from Equilibrium', Ed. C. Godreche, (Cambridge Univ. Press 1992); 
J. S. Langer in 'Far from Equilibrium Phase Transitions', Ed. L. Garrido, (Springer-Verlag, 
1988); J. S. Langer in 'Fluctuations, Instabilities and Phase Transitions', Ed. T. Riste, Nato 
Advanced Study Institute, Geilo Norway, 1975 (Plenum, 1975). 

[9] G. Mazenko in 'Far from Equilibrium Phase Transitions', Ed. L. Garrido, (Springer-Verlag, 
1988). 

[10] C. Castellano and M. Zannetti, cond-mat/9807242; C. Castellano, F. Corberi and M. Zannetti, 
Phys. Rev. E56, 4973 (1997); F. Corberi, A. Coniglio and M. Zannetti, Phys. Rev. E51, 5469 
(1995). 

[11] W. H. Zurek, Nature 317, 505 (1985); Acta Physica Polonica B24, 1301 (1993); Phys. Rep. 
276, 4 (1996). 





[12] W. I. Goldburg and J. S. Huang, in ‘Fluctuations, Instabilities and Phase Transitions’, Ed. 
T. Riste, Nato Advanced Study Institute, Geilo Norway, 1975 (Plenum, 1975); J. S, Huang, 
W. I. Goldburg and M. R. Moldover, Phys. Rev. Lett. 34, 639 (1975). 

[13] For a nicely written recent review on the dynamics of phase transition see: A. Gill, ‘Contem- 
porary Physics’, vol. 39, number 1, pages 13-47 (1998). 

[14] I. Chuang, R. Durrer, N. Turok and B. Yurke, Phys. Rev. Lett. 66, 2472 (1990). 

[15] M. Bowick, L. Chandar, E. Schiff and A. Srivastava, Science 263, 943 (1994). 

[16] P. C. Hendry, N. S. Lawson, R. A. M. Lee, P. V. E. McClintock and C. D. H. Williams, 
Nature, 368, 315 (1994). 

[17] V.M.H. Ruutu et al., cond-mat/9512117. Y. M. Bunkov and O. D. Timofeevskaya, 
cond-mat/9706004. 

[18] For recent reviews on the QCD phase transitions and aspects of relativistic heavy ion collisions 
see for example: J. W. Harris and B. Muller, Annu. Rev. Nucl. Part. Sci. 46, 71 (1996). B. 
Muller in Particle Production in Highly Excited Matter, Eds. H.H. Gutbrod and J. Rafelski, 
NATO ASI series B, vol. 303 (1993). B. Muller, The Physics of the Quark Gluon Plasma, 
Lecture Notes in Physics, Vol. 225 (Springer-Verlag, Berlin, Heidelberg, 1985); K. Rajagopal 
in 'Quark-Gluon Plasma 2', Ed. by R. C. Hwa (World Scientific, Singapore) (1995); H. Meyer-Ortmanns, 
Rev. of Mod. Phys. 68, 473 (1996). C-Y Wong, 'Introduction to High-Energy Heavy 
Ion Collisions', (World Scientific, 1994). 

[19] J. D. Bjorken, Phys. Rev. D27, 140 (1982). 

[20] K. Rajagopal and F. Wilczek, Nucl. Phys. B399, 395 (1993); K. Rajagopal and F. Wilczek, 
Nucl. Phys. B404, 577 (1993). 

[21] A. A. Anselm and M. G. Ryskin, Phys. Lett. B266, (1991) 482; J. D. Bjorken, K. L. Kowalski 
and C. C. Taylor, SLAC Report No. SLAC-PUB-6109 (unpublished); J.-P. Blaizot and A. 
Krzywicki, Phys. Rev. D46, (1992) 246; J. D. Bjorken, Int. J. Mod. Phys. A7, (1992) 4189; 
J. D. Bjorken, Acta Physica Polonica B23, (1992) 561; K. L. Kowalski and C. C. Taylor, 
'Disoriented Chiral Condensate: A White Paper for the Full Acceptance Detector' CWRU 
report 92- hep-ph/9211282 (unpublished); J. D. Bjorken, K.L. Kowalski and C. C. Taylor, 
'Baked Alaska', Proceedings of Les Rencontres de Physique del Valle d'Aoste, La Thuile 
(1993); (SLAC PUB 6109). G. Amelino-Camelia, J. D. Bjorken, S. E. Larsson, Phys. Rev. D56 
(1997) 6942; J. D. Bjorken, Acta Phys. Polon. B28 (1997) 2773; A. Anselm, Phys. Lett. B217, 
169 (1989). 

[22] L. T. Baradzei et al., Nucl. Phys. B370, (1992) 365. 

[23] S. Gavin, A. Gocksch and R. D. Pisarski, Phys. Rev. Lett. 72, 2143 (1994); S. Gavin and 
B. Muller, Phys. Lett. B329, 486 (1994); Z. Huang and X.-N. Wang, Phys. Rev. D49, 4335 
(1994); Z. Huang, M. Suzuki and X.-N. Wang, Phys. Rev. D50, 2277 (1994); Z. Huang and 
M. Suzuki, Phys. Rev. D53, 891 (1996); M. Asakawa, Z. Huang and X.-N. Wang, Phys. Rev. 
Lett. 74, 3126 (1995); J. Randrup, Nucl. Phys. A616 (1997) 531; J. Randrup, Phys. Rev. Lett. 
77 (1996) 1226. 

[24] D. Boyanovsky, H. J. de Vega and R. Holman, Phys. Rev. D51, (1995) 734; F. Cooper, Y. 
Kluger, E. Mottola and J. P. Paz. Phys. Rev. D51, (1995) 2377 . Y. Kluger, F. Cooper, E. 
Mottola, J. P. Paz and A. Kovner, Nucl. Phys. A590,(1995) 581. 

[25] WA98 Collaboration, (M. M. Aggarwal et al.) Phys. Lett. B420, (1998) 169. 

[26] J. Streets, hep-ex/9608012; T. C. Brooks et. al. Phys. Rev. D55, (1997), 5667; M. E. Convery, 
hep-ex/9801020. 





[27] See the RHIC project page with detailed description of the physics capabilities of STAR and 
PHENIX at http://www.rhic.bnl.gov 

[28] See the Castor project page at the Alice web page, 
http://www1.cern.ch/ALICE/projects.html. 

[29] D. Boyanovsky, D. Cormier, H. J. de Vega, R. Holman and S. P. Kumar, Phys. Rev. D57, 2166 
(1998); D. Boyanovsky, D. Cormier, H. J. de Vega and R. Holman, Phys. Rev. D55 (1997) 3373. 

[30] D. Boyanovsky, H. J. de Vega and R. Holman, hep-ph/9903534 and in preparation. 

[31] C. L. Bennett, M. S. Turner and M. White, Physics Today, 50 NOV 32, (1997) and references 
therein. 

[32] N. Turok and D. N. Spergel, Phys. Rev. Lett. 66, 3093 (1991); D. N. Spergel, N. Turok, W. 
H. Press and B. S. Ryden, Phys. Rev. D43, 1038 (1991). 

[33] J. A. N. Filipe and A. J. Bray, Phys. Rev. E50, 2523 (1994); J. A. N. Filipe, (Ph. D. Thesis, 
1994, unpublished). 

[34] D. Boyanovsky, H. J. de Vega and R. Holman, Phys. Rev. D49, 2769 (1994); D. Boyanovsky, 
D. Cormier, H. J. de Vega, R. Holman and S. Prem Kumar, Phys. Rev. D57, 2166 (1998) 
(and references therein). 

[35] D. Boyanovsky, H. J. de Vega, R. Holman, D.-S. Lee and A. Singh, Phys. Rev. D51, 4419 
(1995). D. Boyanovsky, H. J. de Vega and R. Holman, Proceedings of the Second Paris Cos- 
mology Colloquium, Observatoire de Paris, June 1994, pp. 127-215, H. J. de Vega and N. 
Sanchez, Editors (World Scientific, 1995); Advances in Astrofundamental Physics, Erice Chalonge 
School, N. Sanchez and A. Zichichi Editors, (World Scientific, 1995). D. Boyanovsky, 
H. J. de Vega, R. Holman and J. Salgado, Phys. Rev. D54, 7570 (1996); D. Boyanovsky, D. 
Cormier, H. J. de Vega, R. Holman, A. Singh, M. Srednicki; Phys. Rev. D56 (1997) 1939. 
D. Boyanovsky, H. J. de Vega and R. Holman, Vth. Erice Chalonge School, Current Topics 
in Astrofundamental Physics, N. Sanchez and A. Zichichi Editors, World Scientific, 1996, p. 
183-270. D. Boyanovsky, M. D’Attanasio, H. J. de Vega, R. Holman and D. S. Lee, Phys. 
Rev. D52, 6805 (1995). D. Boyanovsky, H. J. de Vega, R. Holman and J. Salgado, Phys. Rev. 
D57, 7388 (1998). 

[36] D. Boyanovsky, H. J. de Vega, R. Holman and J. Salgado, hep-ph/9811273, to appear in Phys. 
Rev. D. 

[37] F. Cooper, S. Habib, Y. Kluger, E. Mottola, Phys.Rev. D55 (1997), 6471. F. Cooper, S. Habib, 
Y. Kluger, E. Mottola, J. P. Paz, P. R. Anderson, Phys. Rev. D50, 2848 (1994). F. Cooper, 
Y. Kluger, E. Mottola, J. P. Paz, Phys. Rev. D51, 2377 (1995); F. Cooper and E. Mottola, 
Mod. Phys. Lett. A 2, 635 (1987); F. Cooper and E. Mottola, Phys. Rev. D36, 3114 (1987); 
F. Cooper, S.-Y. Pi and P. N. Stancioff, Phys. Rev. D34, 3831 (1986). 

[38] D. Boyanovsky and H. J. de Vega, in preparation. 

[39] L. F. Cugliandolo and D. S. Dean, J. Phys. A28, 4213 (1995); ibid. L453 (1995); L. F. Cugliandolo, 
J. Kurchan and G. Parisi, J. Physique (France) 4, 1641 (1994). 

[40] See D. Boyanovsky, R. Holman and H. J. de Vega in [24], and the first reference in [37]. 

[41] Relaxing the assumption of an instantaneous quench and allowing for a time dependence of the 
cooling mechanism has been recently studied by M. Bowick and A Momen, hep-ph/9803284. 

[42] E. J. Weinberg and A. Wu, Phys. Rev. D36, 2474 (1987); A. Guth and S.-Y. Pi, Phys. Rev. 
D32, 1899 (1985). 





[43] D. Boyanovsky and H. J. de Vega, Phys. Rev. D47, 2343 (1993); D. Boyanovsky Phys. Rev. 
E48, 767 (1993). 

[44] D. Boyanovsky, D.-S. Lee and A. Singh, Phys. Rev. D48, 800 (1993). 

[45] G. Karra and R. J. Rivers, Phys. Lett. B414 (1997), 28; R. J. Rivers, 3rd. Colloque Cosmologie, 
Observatoire de Paris, June 1995, p. 341 in the Proceedings edited by H. J. de Vega and 
N. Sanchez, World Scientific. A. J. Gill and R. J. Rivers, Phys. Rev. D51 (1995), 6949; G. J. 
Cheetham, E. J. Copeland, T. S. Evans, R. J. Rivers, Phys. Rev. D47 (1993), 5316. 

[46] 'Defect Formation and Critical Dynamics in the Early Universe', G. J. Stephens, E. A. 
Calzetta, B. L. Hu, S. A. Ramsey, gr-qc/9808059 (1998). 'Counting Defects in an Instantaneous 
Quench', D. Ibaceta and E. Calzetta, hep-ph/9810301 (1998). 






Figure 3: $M^2(t)$ vs. $t$. 

Figure 4: $g\, |\phi_k(\tau)|^2$ vs. $q = k/|m_0|$ for $\tau = 100, 200$. 

Figure 5: $g\, r\, C(r, \tau)$ vs. $r$ for $\tau = 100, 200, 300$. 

Figure 6: $C^2(\eta)\, M^2(\eta)$ vs. $\eta$ (conformal time in units of $m_0^{-1}$). R.D. Universe. 

Figure 7: fluctuations $\langle \Phi^2 \rangle(\eta)$ vs. $\eta$ (conformal time in units of $m_0^{-1}$). R.D. Universe. 

Figure 9: $C(r, \eta)$ vs. $r$ for $\eta = 250, 400$ (in units of $m_0^{-1}$). R.D. FRW Universe. 









ORSAY LECTURES ON CONFINEMENT (I) 


by 

Vladimir N. Gribov* 

L. D. Landau Institute for Theoretical Physics 
Acad. of Sciences of the USSR, Leninsky pr. 53, 117 924 Moscow, Russia 

and 


KFKI Research Institute for Particle and Nuclear Physics 
of the Hungarian Academy of Sciences 
H - 1525 Budapest 114, P.O.B. 49, Hungary 

and 

Laboratoire de Physique Theorique et Hautes Energies** 
Universite de Paris XI, batiment 211, 91405 Orsay Cedex, France 


LPTHE Orsay 92/60 
June 1993 


* Supported in part by Landau Institute-ENS Departement de Physique 
exchange program 

** Laboratoire associe au Centre National de la Recherche Scientifique 

(Courtesy of Y. Dokshitzer, Ewerz and J. Nyiri) 





1. The theory of supercharged nucleus 

This talk is just introductory. Literally it is not about confinement, but 
it is important since it is closely connected to the theory of quark confine- 
ment. In fact, this theory deals with two different problems. The first one is, 
how to confine particles, i.e. the problem of producing forces strong enough 
to prevent quarks from separating, which will be discussed later. But there 
is also another, very severe problem. 

We know that quarks are light, almost massless particles. The question 
is how to bind massless particles in a volume which is much smaller than 
the Compton wavelength. Usually this is not easy to do, because the wave 
function is decreasing exponentially like $e^{-\sqrt{m^2 - E^2}\, r}$, which in the best case 
can be $e^{-mr}$. But if the mass is very small, the state is very broad, and it is 
in complete contradiction with what we know about hadrons. 

The mechanism of supercharged nuclei, which we will discuss in the 
present talk, seems to be a unique possibility to bind a particle in a small 
region in space. The theory of the supercharged nucleus is very old. It was 
initiated in the forties by the work of Pomeranchuk and Smorodinsky [1], 
and it became very well developed, with no unsolved questions left. Since 
we, however, want to apply this theory to quarks, we will talk about it in a 
way slightly different from what people are used to, combining the picture of 
the Dirac sea in an external field and the language of Feynman's Green function. 

The problem of the supercharged nucleus is the following. If there is a 
nucleus $N_Z$ with a charge $Z$, and if $Z$ is larger than a critical value $Z_c$, 

$$Z > Z_c$$

(theoretically $Z_c = 137$, practically it is around $Z_c \sim 180$), then this nucleus 
will decay to an atom with a charge $Z - 1$ and a positron: 

$$N_Z \rightarrow A_{Z-1} + e^+ .$$

This atom can be stable or unstable. If it is unstable, it can decay again, 
up to a situation when the total charge of the atomic state $A_{Z-n}$ becomes 
sufficiently small. This is a peculiar thing: it means that the nucleus behaves 
like a resonance. It does not exist freely in nature, but it exists inside the 
atom. In this sense it is analogous to what we know about the quarks. This, 
of course, is not confinement, but in some respects it is not so different. 
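
The "theoretical" value $Z_c = 137$ can be illustrated with the textbook ground-state energy of the Dirac equation for a point-like Coulomb source, $E_{1s} = m\sqrt{1 - (Z\alpha)^2}$, which ceases to be real once $Z\alpha > 1$. The sketch below is only an illustration under that point-nucleus assumption; it does not reproduce the finite-size analysis behind the practical estimate $Z_c \sim 180$, and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant (approximate value)


def ground_state_energy(z, electron_mass=1.0):
    """1s energy of the Dirac equation for a point charge Z (assumption: point nucleus).

    Returns m * sqrt(1 - (Z*alpha)^2), or None when Z*alpha > 1 and the
    formula no longer yields a real energy.
    """
    x = z * ALPHA
    if x > 1.0:
        return None
    return electron_mass * math.sqrt(1.0 - x * x)


def critical_charge_point_nucleus():
    """Largest integer Z for which the point-nucleus formula still gives a real energy."""
    return math.floor(1.0 / ALPHA)


if __name__ == "__main__":
    for z in (1, 50, 100, 130, 137, 138):
        print("Z =", z, " E_1s / m =", ground_state_energy(z))
    print("point-nucleus critical charge:", critical_charge_point_nucleus())
```

In this approximation the level reaches zero energy at $Z\alpha = 1$ rather than diving to $-m$; following the level down to $-m$ requires a nucleus of finite size.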

Indeed, the nucleus has a baryon number B and a lepton number zero. 
But this state with lepton number 0 does not exist, there are only states for 
which the lepton number equals unity. In this sense, it is a confinement of 
states with zero lepton number. 

Theoretically the described problem is very well defined. If we want to 
understand the new type of atomic states, we have to consider the interaction 
of the electron with the nucleus, taking into account all possible interactions. 
In this case there are some simplifications. First, 

$$\alpha \ll 1 , \qquad (1)$$

but 

$$\alpha Z \sim 1 . \qquad (2)$$

The third important simplification is that the ratio of the electron mass $m_e$ 
and the nucleon mass $m_N$ is much smaller than unity: 

$$\frac{m_e}{m_N} \ll 1 . \qquad (3)$$

Because of these conditions, the problem can be solved exactly. First, due 
to (1), we can neglect all the corrections to electron propagation, since they 
would be of the order of a, and leave those diagrams which contain lines 
connecting electron and nucleus lines and radiative correction to the nucleus 
line. But also, because of (3), the recoil of the nucleus due to photon emission 
will be very small, and instead of writing this interaction and taking into 
account all the sequences, including recoil, we can write that this is equal to 
the electronic Green function in the external field of the nucleus, multiplied 
by the nuclear Green function with radiative corrections 




[diagram: electron line and nucleus line connected by photon exchanges] $= G_e(A)\, G_N \qquad (4)$





Here $G_e(A)$ is the Green function of the electron in the external field; the 
Green function of the nucleus $G_N$, which corresponds to the diagram a 





contains a lot of radiative corrections. We have to include loops like b since 
they are of the order of aZ, and diagrams c being Fortunately, all this 
turns out to be very simple. The reason is, that all the photons in these 
diagrams cause no recoil for the nucleus. It means, that we can write the 
following : 


(5)


where everything can be factorized. One factor is the symbolic expression
for the electrostatic energy of the nucleus, and the other gives us the
mass renormalization for the nucleus due to electron vacuum polarization.
And thus we have an exact description for $G_N$. Further, the first step is, of
course, to calculate the electron Green function in an external field, which
also defines the nucleus mass renormalization entering the exponent in (5).





This means that with the knowledge of the Green function of the electron
in the Coulomb field, we can calculate everything.

The equation for the electron Green function is 

$\hat\nabla G_e = -i\delta$ , (6)

$\gamma_\mu(\partial_\mu - iA_\mu)\,G_e = -i\delta$ . (7)

Since the potential does not depend on the time, we can always write 

Geix2,Xi) = J ( 8 ) 

and, because of that, we have 

7 ^( 5 ;^ =0 , ( 9 ) 

or, in a more convenient way, 

$[\boldsymbol{\alpha}\cdot\mathbf{p} + m\gamma_0 - (\omega - A)]\,\psi_\omega = 0$ . (10)


Before discussing the solution of the equation, let us consider the well-known 
features of the spectrum of this system. 


[Figure: the complex $\omega$ plane with the points $-m$ and $m$ marked]






There are cuts going from $m$ to $\infty$ and from $-m$ to $-\infty$, corresponding to
the continuous spectrum, and sequences of poles corresponding to Coulomb
levels.

Now, how does the critical phenomenon come in? If we increase Z,
the pole which corresponds to the atom in the ground state moves to the
left, passing zero without any troubles (for a finite-size nucleus), and reaches
the point $-m$ at the value $Z = Z_c$. With the further growth of $Z > Z_c$ the
position of the level $\omega_0$ goes into the complex plane and the state becomes
unstable. However, at this point we come to a paradox: this $\omega_0$ is supposed
to be the energy of an atomic state, i.e. the energy of the atomic state
becomes complex, contradicting the physics we have expected. Indeed, we
thought that the nucleus was unstable and would decay into a stable atom
and a positron.
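
The downward motion of the ground-state pole with growing Z can be illustrated with the textbook point-charge result. This is an assumption made only for illustration, since the discussion above concerns a finite-size nucleus: for a point nucleus the lowest Dirac level is $\omega_0 = m\sqrt{1 - (Z\alpha)^2}$, which falls from $m$ towards zero as $Z\alpha \to 1$ and cannot be continued beyond that without a nuclear radius. The function name and the numerical value of $\alpha$ below are illustrative choices.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (approximate value)


def point_charge_ground_level(Z, m=1.0, alpha=ALPHA):
    """Lowest Dirac level for a point nucleus, in units of the electron mass m.

    Valid only for Z*alpha < 1; beyond that a finite nuclear radius is
    required, which this sketch does not model.
    """
    za = Z * alpha
    if za >= 1.0:
        raise ValueError("point-charge formula breaks down for Z*alpha >= 1")
    return m * math.sqrt(1.0 - za * za)


# The level moves monotonically to the left as Z grows.
for Z in (1, 50, 100, 137):
    print(Z, point_charge_ground_level(Z))
```

With a finite-size nucleus the level continues through zero and reaches $-m$ at $Z = Z_c$, which is the regime the lecture is concerned with.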

This means that the described simple solution has to be essentially
changed, since, apparently, the Green function $G_e(A)$ of the electron does
not reflect the whole physics, and the features of $G_N$ in (4) turn out to be
important. We will show, in fact, that there is a cancellation between $G_e(A)$
and $G_N$.

Let us consider now what is happening from the point of view of the
Dirac sea. From $m$ to $\infty$ and from $-m$ to $-\infty$ there is a continuous spectrum
of electron states in the Coulomb field. Also, as we said before, there is a
resonance state with complex energy $\omega_0 + i\Gamma/2$, where $\omega_0 < -m$. What is
the physics of this resonance?






The Dirac equation can be written in the form

$(T + A - E)\,\psi = 0$ , (11)

where T is the kinetic energy and A the potential energy. This kinetic energy
can have two signs, $T = \pm\sqrt{p^2 + m^2}$, and therefore the total energy can be
quasi-classically either

$\sqrt{p^2 + m^2} + A(r) = E$ (12)

or

$-\sqrt{p^2 + m^2} + A(r) = E$ . (13)

Obviously, (12) can be fulfilled only close to the origin. In this case the
potential is negative, the square root is positive, and the total energy is
negative. Classically, the electron will be stopped at a return point $r_1$, where
the energy E is the sum of the potential and the mass m:

$m + A(r_1) = E$ . (14)





This means that in this region there is a normal wave function which
oscillates at $r < r_1$. After that it has to decrease exponentially because of
the absence of classical trajectories. For the equation (13) the situation is
the opposite: it can exist only at large distances, since the potential at large
distances is small, and the energy is negative; consequently, there will be
another return point $r_2$, where

$-m + A(r_2) = E$ . (15)

Again, there will be a plane wave, i.e. normal oscillation at $r > r_2$ with
a subsequent exponential decrease due to the absence of classical motion
between the points $r_1$ and $r_2$.
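
The two return points can be made concrete for a pure Coulomb potential. The sketch below assumes $A(r) = -Z\alpha/r$ (the finite nuclear size is ignored) and solves $m + A(r_1) = E$ and $-m + A(r_2) = E$ directly; the function name and the sample numbers are hypothetical and serve only as an illustration.

```python
def turning_points(E, Z_alpha, m=1.0):
    """Classical return points for A(r) = -Z_alpha / r and energy E < -m.

    r1 solves  m + A(r1) = E  (inner region, positive kinetic branch),
    r2 solves -m + A(r2) = E  (outer region, negative kinetic branch).
    """
    if E >= -m:
        raise ValueError("the outer return point exists only for E < -m")
    r1 = Z_alpha / (m - E)
    r2 = Z_alpha / (-m - E)
    return r1, r2


# Illustrative values in units of the electron mass.
r1, r2 = turning_points(E=-1.5, Z_alpha=1.2)
print(r1, r2)  # the classically forbidden region is r1 < r < r2
```

Since $m - E > -m - E > 0$ for $E < -m$, one always finds $r_1 < r_2$, so the forbidden region between the two oscillating regions is never empty.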



The situation is somewhat similar to that with a barrier in the non-relativistic 
problem ; 






This barrier-type behaviour can be easily seen if one writes a second order
equation instead of the first order one. Indeed, let us re-write the Dirac
equation in the form

$(\nabla^2 + m^2)\,\psi = 0$ . (16)

Here

$\nabla^2 = (\partial_\mu - iA_\mu)^2 = -\frac{1}{r}\,\partial_r^2\, r - \omega^2 + 2\omega A - A^2$ . (17)

In this equation the effective potential energy is $2\omega A - A^2$. If A is negative,
the first term $2\omega A$ corresponds to attraction at large distances, and the
second term $-A^2$ to repulsion at short distances.

It is natural to expect, that this negative energy state is just a resonance. 
But : in the previous discussion we forgot about the Pauli principle. We have 
been talking only about one state and not about the Dirac sea. However, 
Dirac sea means, that all these normal negative levels have to be occupied. 





In this case, if our particle will try to go out and pass through this barrier, 
there will be no place for it. When we find, that the Green function has 
a complex singularity, this reflects the fact, that we did not say up to now, 
what type of Green function we are using. 

Imagine, that the Dirac sea is not filled up. In this case it would be 
natural for our state to have complex energy. On the other hand, we are 
used to the fact, that the Feynman Green function reflects the Pauli principle 
in the correct way, and therefore, if we are looking for the Feynman Green 
function, we expect to find the proper answer. 

Suppose now, that we will calculate the Feynman Green function. Still, 
we said that there is a general solution for the Green function with a complex 
pole. The question is, how this happens. 

The answer is essentially very simple. Indeed, we are looking for a Green 
function which is a sum of diagrams of the type 



(18) 


If the Feynman Green function contained only positive energy propagating
in positive time, all the energy denominators in (18),

$\frac{1}{\varepsilon - \omega}$ ,

would be real at negative total energy. But the Feynman Green function
contains also negative energy, propagating in negative time. It corresponds
to the so-called Z-diagram






Perturbatively this diagram also contains only positive energy intermediate
states. However, if the incoming and outgoing lines correspond to
negative energy bound states, then the three-particle intermediate state
containing two negative energy particles and one positive energy particle can
have negative total energy:

$\frac{1}{\varepsilon - \sqrt{m^2 + p^2} - 2\varepsilon} = -\frac{1}{\varepsilon + \sqrt{m^2 + p^2}}$ .

This means, that the Feynman Green function in this case obtains an imag- 
inary position of the pole in contradiction with the Pauli principle. (This 
contradiction is obvious from the second of the Z-diagrams, since we have 
there two particles in the same state at the same time). It is also clear, that 
this diagram reflects not the decay of a state, but that of the vacuum. 

Let us see how the Green function of the nucleus will recover the Pauli
principle. Consider the diagram

The Green function of the electron is in this case 

= y' • (19) 

In order to calculate it, we have to write the following contour of integration 
(for Z < Zc) 



With Z growing, the pole will move (as indicated by the dotted line) to 
the complex plane, 








As we see, the Green function will contain a growing exponent which will be 
defined by the imaginary part of the pole. At large t we will be left with an 
unstable pole contribution 

G.(t,ry) = J ~ Mr) ~ . 

Obviously, the total of the diagrams can not contradict the Pauli principle, 
and due to this fact there has to be a cancellation between the electron 
Green function and the Green function of the nucleus. The latter contains 
contributions of the type 



and has singularities connected with atomic states 



This means that the Green function of the nucleus will contain this loop,
which, definitely, has an imaginary part, because the nucleus becomes heavier
than the atom and the positron. It gives us a contribution which describes
the instability and results in an exponent corresponding to the decay of
the nucleus:

$e^{-\Gamma t - iMt}$ .

Hence, the real exponents will cancel and the product $G_e G_N$ will be
proportional to $e^{-i(\omega_0 + M)t}$, which corresponds to the propagation of
an atomic state with real energy $\omega_0 + M$.


So, indeed, we have shown, that in this case the bound state is stable not only 
due to the existence of binding forces, but also because of the Pauli principle, 
which reflects the antisymmetry between our electron and the electron in the 
vacuum. In the electron-nucleus scattering amplitude (4) asymptotically (i.e.
at large t) only the part



survives in the sense 



because the Green function of the nucleus decays always exponentially, and 
it has to be compensated by the increase of the electron Green function. This 
is possible only, if 

r(t2 ~ 1 “ ^i) 1 5^2 ^ ^ • 

Let us see now, how we have lost the Pauli principle in discussing Feyn- 
man Green functions. This can be understood immediately. Our aim is to 
derive the Feynman prescription from the Dirac picture. In the latter we 
have negative levels filled up, and we look for the propagation of additional 
particles. 






How can we derive Feynman rules from this? It will be, indeed, very simple.
In this picture our Green function $G$ between the points $x_1$ and $x_2$
has, of course, to be retarded:



The real propagation is second order in external field : 



( 20 ) 





We start with the usual calculation, but we have to include the Dirac sea. 
The external field A is acting not only on the particles, but also on the Dirac 
sea. This means that we have to consider our particle and a particle from
the Dirac sea;



will be a possible diagram. However, we can write also 



and consider the diagrams 



( 21 ) 


and 



(22)


Averaging over all particles in the Dirac sea, the diagram (21) leads to 



which corresponds to the first order contribution to vacuum polarization and 
is in fact zero. The diagram (22) gives 



and we get 



(23)


summing all negative energy levels. Adding the diagram with Gr (20) to 
that with G- (23), we obtain the Feynman diagram. By these two diagrams, 
however, symmetry is not reflected. In order to be symmetrical, we have to 
add 




which means that the external field is interacting twice with the vacuum. 
But averaging this, we get 




i.e. we have to add a diagram which corresponds to vacuum polarization in 
external field. In other words, the Pauli principle is taken into account by 
adding vacuum polarization to the normal Feynman propagator. 





It is, of course, always necessary to consider vacuum polarization, but 
sometimes it is relevant, sometimes not. In the case of weak coupling the
vacuum polarization was real and, when added to the Feynman propagator, 
did not change the picture essentially. But in our case, in the case of strong 
coupling, the vacuum polarization becomes complex and because of this, it 
defines everything. 

In conclusion, let us make it clear, which region of distances between 
the nucleus and the electron is important in our calculation. 

As is shown in [2], the solution of the Dirac equation (10) is

$\psi_\omega(r) \sim \frac{1}{r}\cos\left[\sqrt{(Z\alpha)^2 - 1}\,\ln\frac{r}{r_0} + \delta\right]$ (24)


in the region

$r_0 \ll r \ll \frac{1}{|\omega|}$ ,

where $r_0$ is the radius of the nucleus. If $|\omega| r_0 \ll 1$, the function will
oscillate and at $r_0 \to 0$ this corresponds to "falling into the center". If $r_0$ is
finite, the "falling into the center" does not occur. However, the existence of
oscillations is a sign which indicates that the levels passed through the point
$\omega = -m$.

The number of oscillations $n$ in the region $r_0 < r < \frac{1}{m}$, which determines
the number of levels passing through, is defined by the condition

$\sqrt{(Z\alpha)^2 - 1}\,\ln\frac{1}{m r_0} + \delta = n\pi$ . (25)

At n = 1 formula (25) provides the condition for the charge to be supercritical
(with $r_0$ and $m$ finite) and shows that the region where the supercritical
phenomenon occurs is

$r_0 \ll r < \frac{1}{m}$ .
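
Condition (25), read as $\sqrt{(Z\alpha)^2 - 1}\,\ln\frac{1}{m r_0} + \delta = n\pi$, can be turned into a rough numerical estimate. The sketch below is only an illustration: the phase $\delta$ is not specified in the text, so it is set to zero by default as an explicit assumption, and the function names and the sample value of $m r_0$ are hypothetical.

```python
import math


def oscillation_count(Z_alpha, m_r0, delta=0.0):
    """Left-hand side of (25) divided by pi.

    Its integer part estimates how many levels have passed omega = -m.
    Returns 0.0 for Z_alpha <= 1, where the square root is not real and
    there are no oscillations. Requires 0 < m_r0 < 1.
    """
    if Z_alpha <= 1.0:
        return 0.0
    phase = math.sqrt(Z_alpha ** 2 - 1.0) * math.log(1.0 / m_r0) + delta
    return phase / math.pi


def critical_Z_alpha(m_r0, n=1, delta=0.0):
    """Value of Z*alpha at which the n-th level reaches omega = -m, from (25)."""
    log_term = math.log(1.0 / m_r0)
    return math.sqrt(1.0 + ((n * math.pi - delta) / log_term) ** 2)


# Illustrative nuclear radius: m * r0 = 0.01 (an assumed number).
za_c = critical_Z_alpha(0.01)
print(za_c, oscillation_count(za_c, 0.01))
```

The estimate shows the qualitative point of (25): the smaller $m r_0$ is, the closer the critical $Z\alpha$ lies to unity, and for $r_0 \to 0$ infinitely many levels pass through.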


Acknowledgements 

The text of these lectures was written by Julia Nyiri. She substantially 
expanded and edited the original notes she took during the lectures. I am
very grateful to Ph. Boucaud, A. Dudas and J. Mourad who formatted the
text and the figures. Also, I would like to thank the Lab. de Physique 
Theorique et Hautes Energies, and especially A. Capella, M. Fontannaz, A. 
Krzywicki, D. Schiff and Tran Than Van for their hospitality, and the warm 
and inspiring atmosphere during my stay at Orsay. 

References 


[1] I. Pomeranchuk, Ya. Smorodinsky, Journ. Fiz. USSR 9 (1945) 97. 

[2] Ya. Zeldovich, V. Popov, Uspekhi Fiz. Nauk 105 (1971) 4.



Orsay Lectures On Confinement (II)


Vladimir N. Gribov* 

L. D. Landau Institute for Theoretical Physics, Acad. of Sciences of the USSR,
Leninsky pr. 53, 117 924 Moscow, Russia
and 

KFKI Research Institute for Particle and Nuclear Physics 
of the Hungarian Academy of Sciences, 

H-1525 Budapest 114, P.O.B. 49, Hungary 
and 

Laboratoire de Physique Theorique et Hautes Energies†

Universite de Paris XI, batiment 211, 91405 Orsay Cedex, France 


LPTHE Orsay 94-20 
February 1994 


*Supported in part by Landau Institute-ENS Département de Physique exchange program
†Laboratoire associé au Centre National de la Recherche Scientifique





1 The confinement of the heavy quark 

In the previous talk [1] we have considered some aspects of the theory of supercharged nuclei, and 
came to the conclusion, that the superbound atoms are stable mainly due to the Pauli principle. 

Before going to the heavy quark, let us discuss briefly, how the problem of the supercharged 
nucleus can be handled practically. One possibility to formalize it is the following. We have to 
calculate the Green function of the Dirac equation in external field : 

$\hat\nabla G(x, x') = \frac{1}{i}\,\delta(x - x')$ , (1)

where 

V = 7m • (^) 

The initial external field is that of the considered nucleus. However, in the presence of a charge Z
the stationary states correspond to atomic states with a charge Z - N, where N is large enough
to fulfil $Z - N < Z_{cr}$, i.e. there have to be N electrons rotating around the nucleus. But if so, we
will have to take into account that the field which acts on each electron is also changing.

The ground state in our case is an atomic type state. This means that although we don't know
what our field is, we can find it. The gradient squared of the zeroth component $A_0 = A$ of the
potential has to be equal to

$\nabla^2 A_0 = e_0^2\,(\rho_{nucl} + \rho_{el})$ , (3)

where $\rho_{el} = \langle\bar\psi\gamma_0\psi\rangle$.

The charge density will depend on the potential, and the equation becomes a Thomas-Fermi
type equation for the potential existing in this system. If we find a wave function satisfying the
equation we will know A, the Green function and $\rho$. This means that the formal way of solution
is just to solve a self-consistent, Thomas-Fermi type equation for the effective atomic potential.
Let us consider the total charge Q,

$Q = \int \rho(r)\,d^3r$ ; (4)

the density is the average value

$\rho = \langle\psi^+(r)\psi(r)\rangle = \sum_{\omega_n < 0} \psi_n^+(r)\psi_n(r)$ , (5)

which in the Dirac picture is the sum over all negative energy levels. The total charge Q will be the
sum over all energy levels $\omega_n$. The sum is divergent, and we have to make a cut-off and
subtract the bare particles. This, however, will not be enough. Indeed, if we just subtract the
value of the charge of the free vacuum, then at $Z \neq 0$ the charge of the vacuum (i.e. of the nucleus)
will not be equal to Z and will continue to diverge logarithmically. We have to subtract the value at
small Z in a way which gives zero for the total charge of the electron vacuum (this corresponds to
the correct renormalization of the charge of the nucleus).

The above procedure does not reflect literally the subtraction of the vacuum charge without 
external field. This becomes especially obvious, if one makes use of the Levinson theorem which 
connects the number of states of a particle in external field in a given energy interval with the 
phase of scattering of this particle in the same field. The number N of additional states in the
energy interval between $E_1$ and $E_2$ is defined by the difference of phase shifts:

$N = \frac{1}{\pi}\,[\delta(E_2) - \delta(E_1)]$ . (6)

Because of this, the number of new states in the Dirac vacuum equals

$N = \frac{1}{\pi}\,[\delta(-m) - \delta(-\infty)]$ . (7)

For $Z < Z_c$ this number should be zero. However, for the Dirac equation in external field this
condition is not fulfilled, because $\delta(-\infty) \neq 0$, and in order to obtain the proper definition of the
vacuum charge we have to subtract a quantity which, generally speaking, depends on Z (namely $\delta(-\infty)$).
This subtraction means, in fact, that we have to change the interaction with the external field
when $E \to -\infty$ so that $\delta(-\infty) = 0$ ($\delta(-m)$ can always be considered to be zero if the field is
small). If, however, Z is increased and becomes more than critical, then, as we saw, the levels
will pass the point $E = -m$ and move to the complex plane. It is easy to check that every time
this happens the phase $\delta(-m)$ changes by $\pi$, and the move of n levels into the complex plane
changes the number of states by n, so that the charge of the vacuum (i.e. of the atom) becomes
Z - n. The value of $\rho_e$ which enters the equation for the self-consistent field is to be defined by
the contribution of levels which passed through $E = -m$.
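
The bookkeeping described here (phase shifts counted in units of $\pi$, then subtracted from Z) is simple enough to write down explicitly. The following is a minimal sketch under the normalisation stated in the text, $\delta(-\infty) = 0$; the function names and the sample value Z = 180 are hypothetical and chosen only for illustration.

```python
import math


def additional_states(delta_upper, delta_lower):
    """Levinson-type count: extra states between two energies, from the phase shifts."""
    return (delta_upper - delta_lower) / math.pi


def vacuum_charge(Z, n_levels):
    """Charge of the vacuum (i.e. of the atom) after n_levels crossed E = -m."""
    return Z - n_levels


# With delta(-inf) normalised to zero, each level crossing E = -m adds pi
# to delta(-m). Three crossings are assumed here as an example.
delta_minus_m = 3 * math.pi
n = round(additional_states(delta_minus_m, 0.0))
print(n, vacuum_charge(180, n))
```

For a subcritical field the same count gives zero, since $\delta(-m)$ can then be taken to vanish.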

We discussed in detail this concrete structure in order to refer to it in the following, talking about
the heavy quark in the vacuum of the light quark. We shall suppose that due to gluonic vacuum
polarization the effective coupling $\alpha(r)$, which at small r has the usual perturbative behaviour,
reaches a constant value at $r \gg r_0$ ($r_0 \sim 1/\lambda$), and this constant value will be more than unity.
(Without this ansatz, allowing that $\alpha$ continues to grow, things are more complicated, but nothing
essentially changes.)

[Figure 1: The supposed behaviour of the effective coupling $\alpha(r)$.]

It is natural to expect such a behaviour of the charge in QCD. But even an Abelian theory can
reveal such a behaviour of charge, if it originates from a non-Abelian theory via spontaneous colour
symmetry breaking. In this case the charge will increase, and as a result, 6-7 gluons acquire masses.
After that there remain one or two massless objects - "photons" - and the behaviour of charge will
be exactly as discussed, because after the symmetry breaking we have = ~ (heavy gluon). We 
can ask now : what happens in the vacuum of light quarks under these circumstances ? Outside 
the region where a is growing, we will have an Abelian theory and we can consider the quark states 
in the normal way. 




If $\alpha$ becomes more than critical, the corresponding level goes to the Dirac sea, which, consequently,
will have to be filled up and we will find an atomic type state. If this is so, the charge density will
be exactly the same as in the case of a supercharged nucleus. The $\lambda_{QCD}$ is analogous to the radius
of the nucleus. There is, however, an important difference between QED and this case. Indeed, in
QED we consider a nucleus with a charge Z in the centre, and we put one or two electrons around
it to organize an atom. In QCD we have a heavy quark with a very small intrinsic charge.



Due to vacuum polarization the charge becomes large inside $1/\lambda$. The system creates an empty
level, and we have to fill it. This means that we have to add an intrinsic charge, equal to that
of the heavy quark, but with the opposite sign (outside the region of $1/\lambda$). So we have here two
intrinsic charges with the total charge zero, and of course the vacuum polarization can never change
the total charge. In other words, in QCD the supercritical atom would be a meson-type state with
zero colour charge. Our task is to show that these are not only words: we have to include the
mechanism of vacuum polarization formally. In order to do so, let us discuss the problem in a
language similar to QED, neglecting non-Abelian fluctuations of the colour field A, the average
field of heavy and light quarks inside the meson. In this case $A_0$ is defined by the equation

$\nabla^2 A_0 = e_0^2\,(\rho_h - \rho_l)$ , (8)

where $\rho_h$ is the density of the heavy quark, and $\rho_l$ that of the light quark. Again, we can write
the equation for the Green function

$\hat\nabla G(x, x') = \frac{1}{i}\,\delta(x - x')$ . (9)

But now the problem is how to calculate this. What does, e.g., $\rho_{heavy}$ mean? We cannot use the
expression $\rho_h = e_0\,\delta(\mathbf{r})$, since it does not include the bosonic vacuum polarization, which has to
be taken into account. It is well known that in principle vacuum polarization means summing a
diagram like







with gluonic loops inside. The problem is, however, how to write this in normal space-time lan- 
guage. There is a good way to sum this diagram by writing an equation for currents. Knowing the 
external current and wishing to calculate the total current, we have to solve the equation 

Ux) = f;\x) + ^ls'{ix- x'f) Md^x' , ( 10 ) 

which corresponds to the summation of the diagram. We will not derive this equation, because it 
is almost obvious for the static charge which we are interested in. 

Let us consider a static charge, not depending on time. In this case 

p{r) = pT^t) + ^J y:^P(r') , |r - r'l > e , (11) 

where $1/\varepsilon$ is the ultraviolet cut-off, $b = \frac{11}{3}n_c - \frac{2}{3}n_f$.

The solution of this equation is very simple and leads to the usual expression for charge
renormalization in QCD. In order to see this, let us introduce the quantity $Q(r)$:

$Q(r) = \int_0^r \rho(r')\,4\pi r'^2\,dr'$ . (12)

The logarithmic derivative of $Q(r)$, taken with respect to $\xi = \ln r$, is

$\partial_\xi Q(r) = r\,\frac{dQ}{dr} = 4\pi r^3 \rho(r)$ . (13)

For Q{r) we can write the equation 

a<«(r) = dtQ.„{r) + ■ (M) 

The integration in the right-hand side of this expression contains two logarithmic regions:
$|\mathbf{r} - \mathbf{r}'| \ll r$ and $r \gg r'$. In the first region, $\partial_\xi Q(r')$ can be substituted by $\partial_\xi Q(r)$; integrating over
the second region, $r'$ in the denominator $|\mathbf{r} - \mathbf{r}'|^3$ can be neglected. As a result, we obtain

asOM = 8s Q„,(r) + g jf' e(r) + w , 

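which is equivalent to

$Q(r)\left[1 - \frac{b\alpha_0}{2\pi}\ln\frac{r}{\varepsilon}\right] = Q_{ext}(r)$ (15)

or

$Q(r) = \frac{Q_{ext}(r)}{1 - \frac{b\alpha_0}{2\pi}\ln\frac{r}{\varepsilon}}$ . (16)

For a point-like charge

$Q_{ext}(r) = 1$

and we have

$A(r) = \frac{\alpha(r)}{r} = \alpha(r)\,A_{ext}$ . (17)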

r 

The concrete expression

$\alpha(r) = \frac{\alpha_0}{1 - \frac{b\alpha_0}{2\pi}\ln\frac{r}{\varepsilon}}$ (18)

obtained from perturbation theory has, of course, an unphysical singularity. For a point-like charge 
(18) has to be substituted by an expression corresponding to the behaviour of a(r) as shown in 
Fig. 1. However, for the distributed charge the relation between the external field and the field 
which takes into account the polarization is non-local in coordinate space. The correct expression 
for the relation between the external field and the observable field is local in the momentum space : 
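
The contrast between the perturbative coupling with its unphysical singularity and the saturating behaviour of Fig. 1 can be sketched numerically. The code below assumes the standard one-loop form $\alpha(r) = \alpha_0 / \left(1 - \frac{b\alpha_0}{2\pi}\ln\frac{r}{\varepsilon}\right)$ with $b = \frac{11}{3}n_c - \frac{2}{3}n_f$; the saturation rule in `alpha_frozen` is a crude illustrative model rather than a prescription taken from the lectures, and all names and numerical values are hypothetical.

```python
import math


def beta_coefficient(n_colors=3, n_flavors=3):
    """One-loop coefficient b = (11/3) n_c - (2/3) n_f."""
    return 11.0 * n_colors / 3.0 - 2.0 * n_flavors / 3.0


def alpha_perturbative(r, alpha0, eps, b):
    """One-loop coupling at distance r, with alpha(eps) = alpha0."""
    denom = 1.0 - b * alpha0 / (2.0 * math.pi) * math.log(r / eps)
    if denom <= 0.0:
        raise ValueError("r lies at or beyond the unphysical singularity")
    return alpha0 / denom


def singular_radius(alpha0, eps, b):
    """Distance at which the one-loop denominator vanishes."""
    return eps * math.exp(2.0 * math.pi / (b * alpha0))


def alpha_frozen(r, alpha0, eps, b, alpha_max):
    """Illustrative model of Fig. 1: perturbative growth capped at alpha_max."""
    denom = 1.0 - b * alpha0 / (2.0 * math.pi) * math.log(r / eps)
    if denom <= 0.0:
        return alpha_max
    return min(alpha0 / denom, alpha_max)


b = beta_coefficient()
for r in (1.0, 10.0, 100.0, 1.0e4):
    print(r, alpha_frozen(r, alpha0=0.1, eps=1.0, b=b, alpha_max=1.5))
```

The cap `alpha_max` is taken larger than unity here only because the text supposes that the constant value reached at large distances exceeds one.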


$A(q) = \alpha(q^2)\,A_{ext}(q)$ , (19)

which leads in the coordinate space to an expression of the following type:

$A(r) = \int K(r - r')\,A_{ext}(r')\,d^3r'$ , (20)

where $K(r)$ is the Fourier transform of $\alpha(q^2)$. Similarly,

$\rho(r) = \int K(r - r')\,\rho_{ext}(r')\,d^3r'$ . (21)

If we now suppose that $\alpha(q)$ as a function of $1/q$ behaves according to Fig. 1, and at large q values
it is defined by perturbation theory, then the equation (8) has to be understood as an equation for
$A_{ext}$:

$\nabla^2 A_{ext}(r) = \delta(r) - \bar\psi_l(r)\gamma_0\psi_l(r)$ , (22)

where $\psi_l(r)$ is the solution of the Dirac equation

$(H + A)\,\psi = E\psi$

for the light antiquark in a superbound state in the field $A(r)$ defined by equation (20). The
solution of this problem gives the energy and the features of the meson $q_h\bar q_l$ with zero total charge.
Due to (21),

$Q = \int K(r)\,d^3r\; Q_{ext} = \alpha(q = 0)\,Q_{ext}$ , (23)

with $Q = 0$ if $Q_{ext} = 0$.

We have just proved that because of the big charge which appears through vacuum polarization,
in the case of QCD the "atomic" bound state will, indeed, be a meson. The heavy quark will decay
into a $q_h\bar q_l$-meson and a light quark:

$q_h \to M(q_h\bar q_l) + q_l$ . (24)

In the next lecture we shall consider light quark states. So far there is one important thing to 
stress, namely if we don’t include essential interactions between light quarks in the vacuum, we 
come to a reasonable conclusion for the case of the heavy quark but, as we will just see, to an 
unreasonable one about the light quark. Indeed, let us try to extend the considered procedure to 





the latter case. Suppose that there is a light quark moving, and a potential acts on its vacuum.
What will we see classically? Since the Coulomb field is a vector field, it is shrinking, but the
total integral remains the same. Because of this, we will find immediately that there is a bound
state in this potential even for fast-moving particles. This, however, means that we have here
an unstable state, which has to be filled, and as a consequence the light quark will decay into a
meson and a light quark again:

$q_l \to M + q_l$ ,

which, of course, contradicts the energy conservation, unless the appearing meson is of negative 
energy. In order to have a self-consistent picture, we have to suppose that the light quark in 
the vacuum will interact so strongly that there have to be negative energy levels and the whole 
vacuum has to be rearranged. So from the picture we described we come quite naturally to light 
quark interactions. We will see that these interactions are, indeed, very strong and lead to the
confinement of light quarks, which will take place at relatively small $\alpha$ values; this means that
the overall corrections for vacuum polarization will not be large.

This is for the future. What we have to add now, is, that even in the language which was 
accepted so far, with no strong interactions between particles in the vacuum, the problem in real 
QCD which is non-Abelian is more complicated. In QED we have one charge, and all the electrons
in the vacuum interact with this charge, independently from each other. In QCD this can take 
place only, if the field of the heavy quark is an Abelian one. It is highly probable, that this is, 
indeed, the case, when the field of the heavy quark becomes large as a result of gluonic vacuum 
polarization. 


References 

[1] V.N. Gribov, Orsay Lectures on confinement (I): The theory of supercharged nucleus, LPTHE 
ORSAY 92/60 (1993). 



Orsay Lectures On Confinement (III) 


V. N. Gribov 

L.D. Landau Institute for Theoretical Physics, Moscow 
and 

KFKI Research Institute for Particle and Nuclear Physics, Budapest 

and 

Laboratoire de Physique Theorique et Haute Energies 
Universite de Paris XI, batiment 211, 91405 Orsay Cedex, France 


LPT-Orsay-99-37 

hep-ph/9905285 





Light quark confinement¹ ²


We have described the confinement of heavy quarks in an analogy with the theory of the
supercharged nucleus [1,2]. Let us now suppose again that $\alpha$ is behaving like



Making this assumption, we are neglecting gluon-gluon interactions and the existence of gluons
as real particles. Our aim is to see what can arise from the discussion of light quarks only. We
introduce $\lambda$ corresponding to $\alpha_c$ and consider quark masses $m_0 \ll \lambda$. The interactions of light
quarks (for which $m_0 \ll \lambda$) will be discussed in a rather simplified way. We will take into account
all possible interactions



where the gluon propagator (considered as an effective photon), corresponding to the dotted line 

is 

Qf 


Further, we look for a model which enables us to see, what happens to the fermions if there is an 
interaction between them, as indicated above. The question is, how the bound states or the Green 
function behave in such a case.

Let us consider the energy of two quarks, u and d, for example. Without interaction there will 
be positive energy states with E > 2m and negative energy states with E < —2m: 


[Figure: energy levels $E(ud)$ versus $\alpha$, with continuum thresholds at $2m$ and $-2m$]

¹This is the third lecture on quark confinement given by V.N. Gribov in 1992 in Orsay. An extensive discussion
of the consequences of all this for the structure of the Green function can be found in [5,6], the two last papers
concluding his 20 years long study of the problem of quark confinement in QCD.

²The text was prepared for publication by Yu. Dokshitzer, B. Metsch and J. Nyiri on the basis of a tape recording
and notes taken during the lecture.





Introducing the interaction, for small $\alpha$ we will find that there are some bound states near $2m$ and
$-2m$.



So far, we consider the usual Dirac vacuum: the negative energy states are occupied, and the
positive ones are empty. Increasing the coupling, i.e. increasing $\alpha$, we could expect that the
magnitude of the energy is decreasing and the levels corresponding to the bound states will come
closer and closer to zero.



With a further increase of the coupling up to a critical value, one possibility for the levels will be 
just to approach the zero line and never cross. There is, however, also a possibility of crossing. We 
will see that the first case corresponds to normal spontaneous symmetry breaking. But, if the levels 
cross, and especially in the most clear case, when they pass the lines 2m and -2m, respectively, 
everything will change and we arrive at very different phenomena: 



Indeed, now the original vacuum, which corresponds to the case when level 2 is empty and level
1 is occupied, is absolutely unstable. We have to fill the new negative energy state and leave the
positive energy level empty. But by filling this new state, we get an excitation, a meson-type state
with a mass $\mu$. For free quarks this would mean that the quark with negative energy decays into a
negative energy meson (filling the negative energy levels) and creates a positive energy quark. As
a result, the Dirac picture in which all negative energy levels are filled up and all positive energy
levels are empty, is destroyed. But if so, a positive energy quark also decays into a positive energy
meson and a quark with negative energy. This means that both decays

$q^- \to M^- + q^+$

$q^+ \to M^+ + q^-$

are possible, and both $q^-$ and $q^+$ are unstable.

The question is now, how to deal with the bound state problem. Of course, we could just 
start to calculate the bound states, considering the interactions without corrections to the Green 
function. However, one has to take into account that the fermion-fermion interaction changes the 
effective mass of the quarks and this in its turn will change the bound states considerably. 



which makes the problem more complicated. We thus will have to consider bound states and the 
Green functions on equal footing. 

Up to now, the only approach which deals with this problem and is self-consistent is the Nambu 
- Jona-Lasinio model [3]. It considers the fermion Green function corrections due to a four-fermion 
interaction: 



In spite of the strong dependence on the cut-off, the model preserves all symmetries in the Green
function and in two-particle interactions. Let us present the result of Nambu and Jona-Lasinio
in a way somewhat different from what is given in [3]. We express it as the dependence of the
renormalized mass $m$ on the bare mass $m_0$. They found that if the effective coupling (it depends
on the definition in their case) is less than unity, the curve will be just the usual one:

[Figure: renormalized mass $m$ versus bare mass $m_0$ for coupling less than unity]

If, however, the coupling is larger than unity, the dependence will be like

[Figure: renormalized mass $m$ versus bare mass $m_0$ for coupling larger than unity]

According to the interpretation of Nambu and Jona-Lasinio, the upper part of the curve, which
at $m_0 = 0$ reaches a finite point, corresponds to spontaneous symmetry breaking. But there
are three solutions at $m_0 = 0$ and at sufficiently small values of $m_0$. What Nambu and Jona-Lasinio
claim is that the lower part of the curve is unstable, and there is a real vacuum. I agree with this, if
$m_0 > m_{crit}$. We can ask: what is the source of instability of this curve? The general argument
is the following. In spontaneous symmetry breaking, $m$ is like the magnetic field in a
ferromagnet; we just choose a definite direction. But $m_0$ is like an external field, and the system
is like a compass. If the external field and the induced field point in the same direction, the
situation is stable. If they point in opposite directions, the compass will turn over.

I am, however, not sure that the instability of the almost perturbative solution, which contains
no condensate at all, can be explained in such a way. The explanation may be right for the
part b of the curve in Fig. 2, which corresponds to a big spontaneous magnetic moment and the
opposite direction of $m_0$. It does not work for the part a, close to perturbation theory, which has no
spontaneous magnetic moment. And, looking more carefully at the curve, we see that the part b
corresponds to pseudoscalar states inside the Dirac sea, while on the piece a both the pseudoscalar
and scalar states are inside, both levels having passed. Recognizing this, one can conclude that indeed,
the mentioned state is unstable, but for a trivial reason: the corresponding level is inside the Dirac
sea and it is not filled up. The problem is, what happens if we fill up this level. It remains an open
question what can be considered as a ground state under these conditions. And it is a problem
how to get these results in a more self-consistent way, not depending on the cut-off so strongly.

¹ This result is not always quoted, but it is present in their paper.

The Nambu - Jona-Lasinio model can be reproduced in our picture. For this purpose, just as
a theoretical exercise, let us use an $\alpha$ which does not go to unity, i.e. draw

[Figure: the coupling $\alpha$ for this case.]

instead of the curve in Fig. 1. In this case there will be a second scale $\lambda_2$, and outside this scale
we will have just a point-like interaction. This is reminiscent of the Nambu - Jona-Lasinio picture, which,
apparently, can be reached somehow in our approach. The problem is how to write the corresponding
equation constructively, and whether this can be done at all. Of course, this constructive part
can be only approximate. But if we recognize that it can be written, then we will be able to develop
a theory in which we put the main ingredient of our discussion as an input into our solution and
try to find the real construction. The main difference will appear in the analytic properties of the
Green function. The Green function of a fermion for such a case would be quite different in its
analytic structure compared to the usual one.

I am afraid I will not have the time to come to this point today, but I would like to explain 
just the physics. 

How to write the equation? What happens in the real case and how to deal with it? Let us
start with the Green function. What do we know, what are we supposed to know about this
Green function as a function of $q^2$?

[Figure: regions of $q$; the renorm group equation applies at large $q$.]

Beyond a certain $\lambda$, in the region where the coupling is small, the theory is asymptotically free; here the
Green function has to satisfy the renorm group equation. But if, as a result of the interaction, a mass
is acquired, this mass would be somewhere at smaller $q$; here the equation becomes essentially
very complicated and we are not able to extract a reasonable structure.

The idea to write an equation which is correct in both regions, near the threshold and at large
$q^2$, and to match these two solutions, comes from the following consideration. Suppose that
$\alpha_{crit}/\pi$ is small.

Now, however, we may ask: how could new masses, new solutions etc. appear at all at such a
small $\alpha$? Obviously, this $\alpha$ has to be multiplied by something large. What happens, for example,
at large $q$? We know that there is always a logarithm of $q^2/\lambda^2$, and the real parameter becomes
$\alpha$ multiplied by this logarithm,
which is, in spite of the smallness of $\alpha$, big enough to change the Green function essentially. But
near the threshold there is also a logarithm,

$$\ln\frac{q^2 - m^2}{\lambda^2}$$

with some scale $\lambda = \Lambda$ or $m$,


which is always present. In other words, in this region there could be also a quantity which changes
seriously in spite of the relative smallness of $\alpha$. Hence, we want to write the equation which is
correct near the threshold, taking into account correctly the singularity of a supposed mass, and
after that compare this with the renorm group equation; we shall see whether it is possible to write
an equation which is correct in both regions, and if yes, we will try to solve it. In order to get the
singularity correctly, we take the second derivative of $G^{-1}$ with respect to the momenta.


$$G^{-1} = G_0^{-1} + [\text{first diagram: one photon line}] + [\text{second diagram: two photon lines}] + \dots \qquad (2)$$


The contribution of the first term is trivial: $G_0^{-1}$ is linear in $q$, so its second derivative
gives zero. Taking the second derivative of the first graph, it can be easily seen that




( 3 ) 


This gives for the first diagram $\frac{\alpha}{\pi}\,\gamma_\mu G(q)\gamma_\mu$ just by direct calculation. In other words, we make
it local. From this diagram we take the contribution where $k = q - q'$, the momentum of the
photon, essentially equals zero.

Let us now look at the second diagram. We have here the choice of taking the second derivative
at one of the photon lines, or to differentiate once at one line and once at the second line. Having in
mind that all the integrations would have a structure which needs some logarithmic enhancement,
it would mean that the most important regions of integration in this integral would be those where
$k_1 \ll k_2$ and $k_2 \ll k_1$. We take the derivative at $k_1$ and then set $k_1$ to zero, but for $k_2$ the
integration will give the same as before. If this integration gives us two logarithms, we kill one
and recover it after the integration of our differential equation; but we still have the first one.
But if we differentiate once one line and once the other, we will always sit in the region $k_1 \sim k_2$,
because they have to be of the same order. And, in this case, there is no logarithm at all; after
the integration, we will recover one, but one order will be lost. Clearly, a possible approach is to
try not to choose different diagrams, but to use the small $k$ region of integration. Ordering the
integration inside the diagram in such a way that one momentum is much smaller than the others,
and differentiating this diagram, we will find a relatively simple answer. Indeed, suppose that we
have any diagrams with any loops. If we differentiate some lines twice (it can be any line) and
neglect all first derivatives, we get an amplitude of the following structure:



This is just the Compton scattering of a zero momentum photon, $k = 0$, and for this quantity the
most singular contribution is obviously

$$\Gamma_\mu(0,q)\,G(q)\,\Gamma_\mu(0,q)$$

which corresponds to the diagram



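But the vertex $\Gamma$ is at zero momentum and hence $\Gamma_\mu(0,q) = \partial_\mu G^{-1}(q)$. In this approximation
we can write a very simple equation:

$$\partial^2 G^{-1}(q) = \frac{\alpha}{\pi}\,\partial_\mu G^{-1}(q)\,G(q)\,\partial_\mu G^{-1}(q) \qquad (4)$$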

which differs essentially from any Bethe-Salpeter type equation. Indeed, using a Bethe-Salpeter 
type equation, we do not change the vertex part and end up with rather bad properties. Equation 
(4) is scale invariant, it is $\gamma_5$-invariant, it has many nice symmetry properties and, what is most
important, it has a correct behaviour near the threshold. 

The gauge is fixed, because we used

$$D_{\mu\nu}(k) = \frac{\delta_{\mu\nu}}{k^2} .$$

It is an important question, what we would get in different gauges. In Feynman gauge we are very 
lucky: we find an expression which does not depend explicitly on the expression for the Green 
function. Using a different gauge, we would find the infrared behaviour of this diagram to be more 
complicated, and we would not be able to extract universally the region of small k. We would have 
integrals over q which are also possible to use, but with the necessity to think about the behaviour 
near the threshold. 

We, however, have chosen this gauge; we did not destroy the general features and used the
current conservation which just corresponds to $\Gamma_\mu = \partial_\mu G^{-1}$. Accepting this, we can now ask
what the relation to the renorm group equation is.

Suppose that we would like to write the renorm group equation in the same spirit. Let us take 
again the second derivatives. In this case we would be definitely correct, because we know that it 
is a logarithmic approximation. 

In our logarithmic approximation we will do exactly the same, with the only difference that $\alpha$
would be $\alpha(q^2)$. In the renorm group equation $\alpha$ is a function of $q^2$. But, of course, $\alpha$ in general
depends on two momenta: $k^2$ and $q^2$. And in the renorm group equation at large momenta, in the
ultraviolet region, $\alpha$ depends on the variable which is the largest. Since we are close to $k = 0$, this
means that here we will have $\alpha(q^2)$, and we will recover the renorm group equation at large $q^2$. If
we solve this equation with a slowly varying $\alpha$, we will be correct in both the threshold region and
the ultraviolet region.

We also have to formulate an equation for the bound states under the same assumption. Looking
for bound states, we consider scalar and pseudoscalar vertices $\varphi(q,p)$.
Here $\varphi(q,p)$ depends on $p$, the total momentum of a pair, and $q$, the quark momenta being
given by $q + p/2$ and $q - p/2$. With the same procedure as in obtaining the equation for the Green
function we find for the vertex (see [4] for some details):




( 5 ) 


where $A_\mu = (\partial_\mu G^{-1})\,G$ and $\tilde A_\mu = G\,\partial_\mu G^{-1}$. It means that we have two equations in this approximation.

We used this approximation just to be constructive and to study what will result if we make this
approximation. In principle, solving both equations we will get everything that is necessary: we
know $G$ and we have a linear equation for bound states, we can see what the type of the
energy is, etc. The equation for the bound states has very nice features. It is beautiful from the point
of view of the Goldstone theorem in the following sense.

Suppose I have some symmetry in my equation, e.g. $\gamma_5$-invariance. Since there is no $\gamma_5$ in
equation (5), it is $\gamma_5$-invariant. But of course the boundary condition for $G^{-1}$ at $q \to \infty$ is just
$G_0^{-1} = (\hat q - m_0)$, and thus destroys the symmetry. But suppose that $m_0 = 0$. In this case there
would be symmetry here, which means that the Green function will not be unique, since it can be

$$G^{-1} + \delta G^{-1} ,$$

where $G^{-1}$ is some solution and $\delta G^{-1} \propto \gamma_5$. This means that the variation $\delta G^{-1}$ also is

important. What would be the equation for the variation? If we calculate the variation on both
sides of equation (4) we obtain

$$\partial^2(\delta G^{-1}) = \frac{\alpha}{\pi}\left\{\partial_\mu(\delta G^{-1})\,G\,\partial_\mu G^{-1} + \partial_\mu G^{-1}\,G\,\partial_\mu(\delta G^{-1}) - \partial_\mu G^{-1}\,G\,(\delta G^{-1})\,G\,\partial_\mu G^{-1}\right\} ,$$

so we find that $\varphi = \delta G^{-1}$ fulfils equation (5) at $p = 0$. It means that if some symmetry is broken,
i.e. if there are multiple solutions of the equation for the Green function, we always will have some
solution of the equation for the vertex at $p = 0$, which is the Goldstone.

It is clear that in the present model we can discuss many questions, use a running coupling $\alpha$
as in Fig. 1 and reproduce the NJL features without any essential dependence on a cutoff. Before
discussing this point further, we will first look for the solution of (4) and discuss the result.

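Above, we introduced $A_\mu = (\partial_\mu G^{-1})G$, which is a very useful quantity. Since $G^{-1} = a\hat q + b$ is
essentially a 2 x 2 matrix, $A_\mu$ is a U(2) gauge potential:

$$\partial^2 G^{-1} = \partial_\mu\big((\partial_\mu G^{-1})\,G\,G^{-1}\big) = (\partial_\mu A_\mu)\,G^{-1} + A_\mu(\partial_\mu G^{-1}) = \frac{\alpha}{\pi}\,A_\mu\,\partial_\mu G^{-1} \qquad (6)$$

where in the last step we used Eq. (4). Multiplying from the right by $G$ we thus find

$$\partial_\mu A_\mu + A_\mu A_\mu = \frac{\alpha}{\pi}\,A_\mu A_\mu . \qquad (7)$$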

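This means that

$$\partial_\mu A_\mu = -\beta\,A_\mu A_\mu ,$$

and $A_\mu$ is a pure U(2) gauge potential with a condition, $\beta = 1 - \alpha/\pi$. Of course, this is just a useful
trick. What is important is to write down the real equation for the Green function. The most natural thing
is to express $G^{-1}$ in the form

$$G^{-1} = \rho\,e^{\frac{\varphi}{2} n}$$

where $n$ is a 2 x 2 matrix,

$$n = \frac{\hat q}{q} , \qquad n^2 = 1 .$$
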

It is just easier to use this form for our purpose: we can find an equation for $\rho$ and an equation for
$\varphi$. Both are functions of $q^2$: $\rho(q^2)$, $\varphi(q^2)$. There is, however, no scale in the equation; it contains
only derivatives with respect to $q$. If we introduce

$$\xi = \ln q ,$$

we will find an equation in which $\xi$ can be considered as "time", and which is an oscillator equation.
In fact there are two oscillators, one for $\rho$, the other for $\varphi$, and they will satisfy non-linear equations.
For $\varphi$ we find


$$\ddot\varphi + 2\left(1 + \beta\,\frac{\dot\rho}{\rho}\right)\dot\varphi - 3\sinh\varphi = 0 . \qquad (8)$$


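This is just an oscillator with damping; a similar equation can be written for $\rho$. What is important is
that there has to be "energy" conservation in this equation. Indeed, we said that $\xi$ plays the role
of time; it, however, did not enter the equation explicitly. Thus there has to be a conservation law
which, as it is easy to show, leads to

$$\left(1 + \beta\,\frac{\dot\rho}{\rho}\right)^2 = 1 - \beta^2\left(3\sinh^2\frac{\varphi}{2} - \frac{\dot\varphi^2}{4}\right) . \qquad (9)$$

We thus can eliminate $\rho$ altogether, and find the equation for $\varphi$

$$\ddot\varphi + 2\sqrt{1 - \beta^2\left(3\sinh^2\frac{\varphi}{2} - \frac{\dot\varphi^2}{4}\right)}\;\dot\varphi - 3\sinh\varphi = 0 ,$$

which is an oscillator with damping. Having this in mind is sufficient to understand the structure
of the solution. Indeed, what is this $\varphi$? We have

$$G^{-1} = \rho\cosh\frac{\varphi}{2} + \frac{\hat q}{q}\,\rho\sinh\frac{\varphi}{2} . \qquad (10)$$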

2 q 2 

The perturbative solution is $\varphi$ close to $i\pi$. In this case the first term is zero, the other is proportional
to $\hat q/q$ - this corresponds to the massless solution. Since $m_0$ is small, we have to have solutions like
this at $q \to \infty$.

Now we have to find the solution everywhere. Let us first investigate the equation without
damping; we get

$$\ddot\varphi - 3\sinh\varphi = 0 .$$

If we go to the Euclidean space, $\varphi = i\psi$, the potential becomes a periodic potential:



Fig. 4. 

We have to look for a possible solution for this structure with damping. What does this mean?
For the oscillator with damping any solution at $\xi \to \infty$ has to be in a minimum, because the
energy is decreasing. But if $\xi$ is going to $-\infty$, the energy is growing. What could be in this case a
normal, reasonable solution? It is almost clear that the only possibility is to put at $\xi \to -\infty$ the
"particles" in this oscillator at the maximum, and start to move them slowly; eventually, they will
appear inside the well.
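
The damped-oscillator picture can be tried out numerically. The following is a minimal sketch, not the author's calculation. It assumes the Euclidean form of the equation, $\ddot\psi + 2p\,\dot\psi - 3\sin\psi = 0$ with $p^2 = 1 + \beta^2\,(3\sin^2\frac{\psi}{2} - \dot\psi^2/4)$, which is a reconstruction of the equation for $\varphi = i\psi$; the function names, the step size and the starting point are illustrative choices. For $\beta = 1$ (no interaction) this form has the closed solution $\psi = 2\arctan e^{\xi}$, i.e. a free massive propagator, which is used here as a check on the integrator.

```python
import math

def accel(psi, v, beta):
    """psi'' from the assumed Euclidean equation psi'' = 3 sin(psi) - 2 p psi'."""
    # Damping factor p taken from the assumed "energy" relation.
    p_sq = 1.0 + beta ** 2 * (3.0 * math.sin(psi / 2.0) ** 2 - 0.25 * v * v)
    p = math.sqrt(max(p_sq, 0.0))  # guard against tiny negative round-off
    return 3.0 * math.sin(psi) - 2.0 * p * v

def integrate(beta, xi_start=-8.0, xi_end=12.0, h=2e-3):
    """Fourth-order Runge-Kutta in the 'time' xi = ln q.

    The path starts just off the maximum psi = 0 on the growing mode psi ~ e^xi,
    where psi' = sin(psi) to the accuracy needed (exact for beta = 1).
    """
    psi = 2.0 * math.atan(math.exp(xi_start))
    v = math.sin(psi)
    xi = xi_start
    path = [(xi, psi)]
    for _ in range(int(round((xi_end - xi_start) / h))):
        k1p, k1v = v, accel(psi, v, beta)
        k2p, k2v = v + 0.5 * h * k1v, accel(psi + 0.5 * h * k1p, v + 0.5 * h * k1v, beta)
        k3p, k3v = v + 0.5 * h * k2v, accel(psi + 0.5 * h * k2p, v + 0.5 * h * k2v, beta)
        k4p, k4v = v + h * k3v, accel(psi + h * k3p, v + h * k3v, beta)
        psi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        xi += h
        path.append((xi, psi))
    return path

def crossings_of_minimum(path):
    """Count how often the path passes through the minimum psi = pi."""
    count = 0
    for (_, a), (_, b) in zip(path, path[1:]):
        if (a - math.pi) * (b - math.pi) < 0.0:
            count += 1
    return count

free_path = integrate(1.0)          # beta = 1: no interaction
strong_path = integrate(0.5)        # beta^2 < 2/3: weak damping near the minimum
print(crossings_of_minimum(free_path), crossings_of_minimum(strong_path))
```

With `beta = 1.0` the path climbs monotonically from the maximum to $\psi = \pi$; with a small value such as `beta = 0.5` it overshoots $\psi = \pi$ and oscillates around it.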





There is a most important question, namely: what is the critical coupling in this case? What
do we know about an oscillator with damping? If the damping is large enough, all the trajectories
will go monotonically to the minimum. If the damping is sufficiently small, the solution will start
to oscillate. In order to see when this transition happens, we have to look at the equation just
near the minimum $\psi = \pi$. With $\psi = \pi - \phi$ we have for small $\phi$

$$\ddot\phi + 2\sqrt{1 + 3\beta^2}\,\dot\phi + 3\phi = 0 ,$$

with fundamental solutions

$$\phi_{1,2} = e^{\nu_{1,2}\xi} , \quad \text{where} \quad \nu_{1,2} = -\sqrt{1 + 3\beta^2} \pm \sqrt{3\beta^2 - 2} .$$


So we have monotonic behaviour for $3\beta^2 - 2 > 0$. On the other hand, if $\beta^2 < \frac{2}{3}$, i.e.

$$\frac{\alpha}{\pi} > 1 - \sqrt{\frac{2}{3}} ,$$

we will have oscillations before reaching the minimum. The critical angle which separates the
regions where the solution is monotonic and where it oscillates can be shown (see e.g. [4]) to be
given by


.^.2 t-fl.A. /l±^ I 

2 V3 1-/3M + + ' 
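
As a small numerical illustration of the transition between monotonic and oscillating behaviour (a sketch under stated assumptions, not part of the original derivation), the exponents of the linearized equation $\ddot\phi + 2\sqrt{1+3\beta^2}\,\dot\phi + 3\phi = 0$ can be evaluated directly. The relation $\beta = 1 - \alpha/\pi$ used to translate $\beta^2 = 2/3$ into a coupling is an assumption here, and the function names are hypothetical.

```python
import cmath
import math

def exponents(beta_sq):
    """Roots nu of nu^2 + 2 sqrt(1 + 3 beta^2) nu + 3 = 0 (complex when 3 beta^2 < 2)."""
    damping = math.sqrt(1.0 + 3.0 * beta_sq)
    root = cmath.sqrt(3.0 * beta_sq - 2.0)
    return (-damping + root, -damping - root)

def critical_alpha_over_pi():
    """Coupling at which beta^2 = 2/3, assuming beta = 1 - alpha/pi."""
    return 1.0 - math.sqrt(2.0 / 3.0)

for beta_sq in (1.0, 2.0 / 3.0, 1.0 / 3.0):
    nu_plus, nu_minus = exponents(beta_sq)
    kind = "oscillating" if abs(nu_plus.imag) > 1e-12 else "monotonic"
    print(beta_sq, nu_plus, nu_minus, kind)

print(critical_alpha_over_pi())
```

Real exponents mean a monotonic approach to the minimum; a non-zero imaginary part signals oscillations in $\xi = \ln q$.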





Up to now we have considered a constant coupling $\alpha$. We know that for $q > \lambda$ the Green
function is determined by perturbation theory, which has to match the solutions in the region of
smaller $q$. If $\beta^2 > \frac{2}{3}$ for all $q$, the solution which goes to $\psi = 0$ for $q \to 0$ matches the solution
close to $\psi = \pi$ for $q \to \infty$ monotonically. This determines $m_0$ as a function of $m_c$ in a unique
way. Let $\lambda$ be the value of $q$ where $\beta^2(\lambda^2) = \frac{2}{3}$. If, however, $\beta^2 < \frac{2}{3}$ below $q = \lambda$, the solutions can
oscillate and $m_0(m_c) = 0$ for some $m_c$, as indicated in Fig. 6.



Fig. 6 

This then is a solution corresponding to broken chiral symmetry. 

References 

[1] V. N. Gribov, Orsay lectures on confinement (I), preprint LPTHE Orsay 92-60 (1993); hep-ph/9403218

[2] V. N. Gribov, Orsay lectures on confinement (II), preprint LPTHE Orsay 94-20 (1994); hep-ph/9407269

[3] Y. Nambu, G. Jona-Lasinio, Phys. Rev. 122 (1961), 345

[4] V. N. Gribov, Lund preprint LU 91-7 (1991)

[5] V. N. Gribov, QCD at large and short distances, Bonn preprint TK 97-08 (1997); hep-ph/9807224

[6] V. N. Gribov, The theory of quark confinement, Bonn preprint TK 98-09 (1998); hep-ph/9902279



7. An Essay on Color Confinement 

Kazuhiko Nishijima* and Masud Chaichian†

(a) Nishina Memorial Foundation 

2-28-45 Honkomagome, Bunkyo-ku, Tokyo 113-8941, Japan 

(b) Helsinki Institute of Physics 
P.O. Box 9, FIN-00014 Helsinki, Finland


Abstract 

Color confinement is a consequence of an unbroken non-Abelian gauge symmetry and the 
resulting asymptotic freedom inherent in quantum chromodynamics. A qualitative sketch of its 
proof is presented. 


1 Introduction 

There has been an accumulation of evidence in favor of the quark model of hadrons [1] and we can 
no longer think of any other substitute for it. Yet, no isolated quarks have been observed to date, 
and we are inclined to think that observation of isolated quarks is, in principle, impossible. This is 
the hypothesis of quark confinement, and it has been further extended to that of color confinement 
that implies not only the unobservability of quarks but also of all the isolated colored particles such
as quarks and gluons. Then a natural question is raised of whether or not we can account for this 
hypothesis within the framework of the conventional quantum chromodynamics (QCD) dealing 
with the gauge interactions of quarks and gluons. The answer to this question is affirmative and 
the detailed mathematical proof of color confinement has been published elsewhere [2-6]. In this 
article, therefore, we shall follow the flow of ideas underlying the proof in a qualitative manner. 

The problem of color confinement may be decomposed into two steps. The first step consists of 
finding a consensus of interpretations of color confinement. Unless it is properly settled we do not 
know what we have to prove in the second step. Because of the importance of this subject many 
authors have proposed various interpretations. A typical example is Wilson’s area law for the loop 
correlation function in the lattice gauge theory [7]. When it is obeyed the interaction between 
a quark and an antiquark is given by a confining linear potential. Another example is given by
coherent superposition of magnetic monopoles in the vacuum state [8-13]. This is dual to the
superconducting vacuum based on coherent superposition of charged objects such as the Cooper 
pairs. Corresponding to the superconductor of the second kind a pair of magnetic monopoles can 
be connected by a quantized magnetic flux forming a hadronic string whose energy is proportional 
to the distance between them. Then the situation is similar to the preceding example. 

In these examples one introduces a topological structure through monopoles, strings and
instantons into the configuration space. In the present paper, however, we shall consider a different
topological structure in the state vector space. For this purpose we look for a known example of 
confinement within the framework of known field theories, and we find a prototype example in 
quantum electrodynamics (QED) [2]. When the electromagnetic field is quantized in a covariant 
gauge, say, in the Fermi gauge, three kinds of photons emerge, namely, transverse, longitudinal 
and scalar photons, but only the transverse photons are subject to observation leaving the other 
two unobservable. We recognize that this is indeed a typical example of confinement, and we may
be able to find some clues to color confinement by studying closely the mechanism of confinement
of longitudinal and scalar photons in QED. For this reason we analyze its mechanism in Sec. 2 so
that we can generalize it and apply it to QCD.

*Email: nisijima@argusol.phys.su-tokyo.ac.jp
†Email: chaichia@pcu.helsinki.fi

One of the profound features of gauge theories is the Becchi-Rouet-Stora (BRS) invariance [14] 
and its introduction is vital to the interpretation of confinement. Therefore, we shall describe some 
of the basic properties of this invariance in Sec. 3. 

The strong interactions described by QCD possess a novel feature called asymptotic freedom 
[15,16], and in Sec. 4 we shall discuss how this aspect of strong interactions drew our attention 
and how the non-Abelian gauge theory entered the game. Finally in Sec. 5 we shall combine BRS
invariance with asymptotic freedom to prove color confinement. 


2 Quantum Electrodynamics and Indefinite Metric 

When the electromagnetic field is quantized in a covariant gauge, say, in the Fermi gauge, we 
find transverse, longitudinal and scalar photons, but the latter two are never observed. We may
interpret it as an example of confinement, and we have at least three alternative ways of explaining
it. First, we can refer to the representations of the Poincare group for massless particles [17,18].
Then, massless particles are known to have only two directions of polarization no matter what their
spin is. Thus photons are always transversely polarized and the same would be true with gluons
if they could be observed. The second method is to employ the Coulomb gauge by keeping only 
the transverse photons from the start. The remnants of unobservable photons manifest themselves 
in the form of the Coulomb potential. This method is applicable, however, only to the linear 
Abelian gauge theories such as QED. The third and the most useful method is the introduction of 
a subsidiary condition such as the Lorentz condition. 

Quantization of the electromagnetic field in a covariant gauge forces us to introduce indefinite
metric [19] which is inherited from the Minkowski metric. Thus the whole state vector space 
in QED can no longer possess the positive-definite metric, and for the physical interpretation of 
the theory we have to eliminate indefinite metric by imposing the Lorentz condition on the state 
vectors to select observable or physical states. In order to execute this program let us quantize the 
free electromagnetic field in the Fermi gauge and for a given momentum we have four directions of 
polarization, namely, two transverse, one longitudinal and one scalar. Thus we have four kinds of 
photons specified by the directions of polarization. The canonical quantization then implies that 
the scalar photons are represented by negative norm states. This is a consequence of the manifest 
covariance of the quantization of the vector field in the Minkowski space. 

The emergence of indefinite metric indicates that observable states occupy only a portion of 
the whole state vector space called the physical subspace. In order to define such a subspace 
we introduce a subsidiary condition known as the Lorentz condition. Let us consider the
four-divergence of the vector field, then it represents a free massless field even in the presence of the
interactions. We decompose it into a sum of positive- and negative-frequency parts corresponding
to destruction and creation operators, respectively. We find that the photons involved in this
operator are special combinations of the longitudinal and scalar photons in the amplitude. We
shall call them a-photons, then an a-photon state has zero norm. We can introduce an alternative
combination of longitudinal and scalar photons called b-photons in such a way that a b-photon
state also has zero norm. Thus for a given momentum we have two transverse (t-) photons, an
a-photon and a b-photon. Although both an a-photon state and a b-photon state have zero norm,
their inner product is non-vanishing so that they are metric partners.

A physical state is defined as such a state that is annihilated by applying the positive frequency
part of the four-divergence of the vector field. This is the Lorentz condition. We can easily verify
that the S matrix in QED transforms a physical state into another physical state since it commutes
with the four-divergence. This is one of the general features of the subsidiary condition. Also we
can easily verify that the b-photons are excluded from the physical subspace. Therefore, we have
only t-photons and a-photons in the physical states. Then we can show that the inner product of
a physical state involving at least one a-photon with another physical state vanishes identically. In
other words, a-photons give no contributions to observable quantities, and both a- and b-photons
escape detection. This is the confinement mechanism of the longitudinal and scalar photons. In 
QED only the transverse photons remain observable. In QCD, however, not only longitudinal and 
scalar gluons but also transverse gluons are unobservable. Thus, there are some essential differences 
in the nature of confinement between QED and QCD. In the former case confinement is kinematical 
in the sense that it could be understood without recourse to dynamics of the system, whereas in the 
latter case it is dynamical in nature as the proof depends sensitively on the dynamical properties 
of the system. 
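
The norm structure described in this section can be illustrated with a toy calculation. This is only a sketch under stated assumptions: one photon of fixed momentum is represented by real coefficients in the basis (t1, t2, longitudinal, scalar), the scalar direction is given negative norm, and the a- and b-photons are taken to be the combinations (L + S)/√2 and (L - S)/√2. These particular combinations and all names in the code are illustrative choices, not taken from the text.

```python
import math

# Basis order: (t1, t2, longitudinal, scalar). The scalar photon carries negative norm.
METRIC = (1.0, 1.0, 1.0, -1.0)

def inner(u, v):
    """Indefinite inner product <u|v> for real coefficient vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, u, v))

def combine(*terms):
    """Linear combination of (coefficient, state) pairs."""
    return tuple(sum(c * s[i] for c, s in terms) for i in range(4))

S = 1.0 / math.sqrt(2.0)
T1 = (1.0, 0.0, 0.0, 0.0)
T2 = (0.0, 1.0, 0.0, 0.0)
A = (0.0, 0.0, S, S)     # assumed a-photon combination
B = (0.0, 0.0, S, -S)    # assumed b-photon combination

# Physical states contain only t- and a-photons (b-photons are excluded).
phys_with_a = combine((1.0, T1), (0.7, A))
phys_without_a = combine((1.0, T1))
other_phys = combine((0.5, T1), (2.0, T2), (-1.3, A))

print(inner(A, A), inner(B, B), inner(A, B))
print(inner(phys_with_a, other_phys), inner(phys_without_a, other_phys))
```

Both zero-norm states have a non-vanishing mutual inner product, and adding an a-photon component to a physical state does not change its inner product with any other physical state, which is the sense in which a-photons escape detection.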


3 Quantum Chromodynamics and BRS Invariance 

As we shall see in the next section strong interactions of quarks are mediated by a non-Abelian
gauge field corresponding to the SU(3) color symmetry. Thus we shall discuss one of the most 
characteristic features of gauge theories known as the BRS invariance in this section [14]. 

Classical electrodynamics is gauge-invariant. Field strengths expressed in terms of the vector 
field are invariant under the local or space-time-dependent gauge transformations of the latter. 
Given a source term, therefore, the solution of the equation for the vector field is not uniquely 
given, and this non-uniqueness is an obstacle to quantization. In order to overcome this difficulty 
we add to the gauge-invariant Lagrangian a term violating the local gauge invariance. This extra 
term is called the gauge-fixing term and was first introduced by Fermi. Later it has been generalized 
so as to include an arbitrary parameter called the gauge parameter. In the original form introduced 
by Fermi this parameter is equal to unity. 

After quantization we find that we have to introduce indefinite metric into the state vector 
space and that the divergence of the vector field commutes with the S matrix. Because of the 
inclusion of the gauge-fixing term the field equation deviates from the classical Maxwell equation 
by a term proportional to the four-divergence of the vector field. It so happens that a matrix 
element of this four-divergence between two physical states vanishes identically because of the 
Lorentz condition, and the classical Maxwell equation is recovered in the physical subspace. In 
this way we find, despite the introduction of the gauge-fixing term, that expectation values of 
gauge-invariant quantities and the S matrix elements in the physical subspace are independent of 
the choices of the gauge parameter because of the congeniality between the gauge-fixing term and 
the subsidiary condition. In what follows we shall extend this approach to QCD. 

There are many essential differences between QED and QCD, however. The former is an 
Abelian gauge theory described by a linear field equation, whereas the latter is a non-Abelian 
gauge theory described by a non-linear field equation. In both cases the gauge-invariant part of 
the Lagrangian is given by the square of the field strength. So, let us introduce the gauge-fixing 
term in QCD assuming the same structure as in QED. Then we recognize that it does not work 
because observable quantities depend explicitly on the gauge parameter. Another difficulty arises 
from the fact that the four-divergence of the gauge field is no longer a free field, and this prevents 
us from defining its positive frequency part. In other words, the Lorentz condition cannot be 
employed to define physical states in QCD. Thus we are obliged to find a device to overcome these 
difficulties and to this end we shall introduce the Faddeev-Popov ghost fields. 

In order to eliminate the gauge-dependence of physically relevant quantities Faddeev and Popov 
have proposed a procedure of averaging the path integral over the manifold of gauge
transformations. We skip the mathematical detail here and refer to the original paper [20], but we should
mention that this procedure resulted in a new additional term in the Lagrangian called the
Faddeev-Popov (FP) ghost term. This term involves a pair of Hermitian scalar fields, but they are
anticommuting and consequently violate Pauli's theorem on the connection between spin and statistics. For
this reason they are called ghost fields. Pauli’s theorem is based on three postulates, (1) Lorentz 
invariance, (2) local commutativity or microscopic causality and (3) positive-definite metric for 
state vectors, and the FP ghost fields violate the last one obliging us to introduce indefinite metric 
into the theory. 

Thus we face again the problem of eliminating indefinite metric from the theory with the help
of an appropriate subsidiary condition to select physical states out of the whole state vector space.
When physical states are so defined as those that are annihilated by applying a certain operator, 
that operator should commute with the S matrix as does the four-divergence of the vector field in 
QED. In order to find such an operator a novel symmetry discovered by Becchi, Rouet and Stora is 
extremely useful. Although this symmetry was originally utilized in renormalizing QCD, it plays 
an essential role in the proof of color confinement in QCD. 

In a classical gauge theory a local gauge transformation is specified by a function of the space- 
time coordinates called the gauge function and the classical theory is invariant under such a 
transformation. This local gauge invariance is lost when the gauge-fixing and FP ghost terms 
are introduced. Besides, local gauge transformations are defined only for the color gauge field and 
the quark fields, but they are not even defined for FP ghost fields. The BRS transformations for 
the color gauge field and the quark fields are given by replacing the gauge function by one of the 
FP ghost fields in infinitesimal gauge transformations. Since we have a pair of ghost fields we 
introduce, correspondingly, a pair of BRS transformations. Then a question is raised of how to 
define BRS transformations of the ghost fields since their gauge transformations are not defined. 
Fortunately, this problem has a simple but beautiful solution. Their BRS transformations are 
introduced by demanding the invariance of the total Lagrangian under them. 

The total Lagrangian including the gauge-fixing and FP ghost terms is no longer invariant under 
local gauge transformations, but it is invariant under the global BRS transformations. Noether’s 
theorem then tells us that there must be a pair of conserved quantities corresponding to a pair 
of BRS symmetries. They are Hermitian and called the BRS charges. As mentioned before there 
are two kinds of Hermitian FP ghost fields and correspondingly a BRS charge must involve one of 
the ghost fields. In what follows we keep only one of these two charges for simplicity. The BRS 
charge that we keep is anticommuting just as the FP ghost field, and consequently the square of the 
BRS charge vanishes and it is called nilpotent. The Hermiticity and nilpotency of the BRS charge 
would imply indefinite metric since otherwise it would be a null operator [21,22]. The nilpotency 
is important and allows us to introduce the concept of cohomology in the theory. After a long 
detour we are going to introduce an appropriate subsidiary condition. Physical states are defined 
as those states that are annihilated by applying the BRS charge [23]. 

The FP ghost fields do not appear in the conventional QED but we can also introduce them 
although they are non-interacting fields. Then we can combine the Lorentz condition with the 
additional condition implying the absence of FP ghosts to define the physical states. When these 
conditions are satisfied, we can prove that physical states so defined are annihilated by the BRS 
charge in QED. 

The BRS charge is the generator of the BRS transformation and the BRS transform of an 
operator is given by the commutator or anticommutator of that operator with the BRS charge, 
and this transformation is also nilpotent. An operator which is the BRS transform of another 
operator is called an exact operator. It is then clear that the matrix element of an exact operator 
between a pair of physical states vanishes. 
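
The vanishing of such matrix elements follows in one line from the subsidiary condition. The following is only a sketch in notation of our own choosing: the symbols $Q_B$ for the BRS charge, $\mathcal{O}$ for an arbitrary operator, and $|f\rangle$, $|g\rangle$ for physical states are assumptions, and the conventional factor of $i$ in the definition of the BRS transform is omitted.

```latex
\begin{align}
Q_B^\dagger &= Q_B , \qquad Q_B^2 = 0 , \qquad
Q_B\,|f\rangle = Q_B\,|g\rangle = 0 , \\
\langle f|\,[\,Q_B , \mathcal{O}\,]_{\mp}\,|g\rangle
 &= \langle f|\,Q_B\,\mathcal{O}\,|g\rangle
    \mp \langle f|\,\mathcal{O}\,Q_B\,|g\rangle = 0 .
\end{align}
```

The first term vanishes because Hermiticity gives $\langle f|\,Q_B = (Q_B|f\rangle)^\dagger = 0$, and the second because $Q_B|g\rangle = 0$.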

The equation for the non-Abelian gauge field deviates from the classical Maxwell equation and 
in fact the divergence of the field strength plus the color current does not vanish but is equal to 
a certain exact operator, which will be referred to as an exact current hereafter. Therefore, the 
classical Maxwell equation is recovered when we take the matrix element of the field equation 
between a pair of physical states. Furthermore, the BRS charge commutes with the S matrix. 
Thus the scenario in QED is reproduced almost exactly. 

When single quark states and single gluon states are unphysical these particles are unobservable 
and consequently confined. Thus the problem of color confinement reduces to that of proving that 
they are unphysical states. We shall evaluate the expectation value of the exact current in a single 
quark state or a single gluon state. If they should belong to physical states the expectation values 
in these states would vanish identically, so that non-vanishing of the expectation values would be 
a direct indication that these particles are unphysical and confined. 

The four-divergence of the exact current vanishes, and we can give a set of Ward-Takahashi 
identities for Green’s functions involving the exact current [2-4]. By making use of the above set 
of Ward-Takahashi identities we can prove that the expectation value of the exact current in a 
single colored particle state survives when the exact current as applied to the vacuum state does 
not generate a massless spin zero particle. Therefore, the absence of such a massless particle is 
a sufficient condition for color confinement [2-4]. In order to check its absence we introduce the 
vacuum expectation value of the time-ordered product of the gauge field and the exact current 
and evaluate the residue C of the massless spin zero pole of the Fourier transform of this two- 
point function. The four-divergence of this two-point function is proportional to this constant C 
except for a trivial kinematical factor, and the divergence can be cast in the form of an equal-time 
commutator. 

By checking this equal-time commutator closely we find that C is the sum of a constant a and 
the Goto-Imamura-Schwinger (GIS) term. The constant a is equal to the inverse of the renormal- 
ization constant of the color gauge field. These constants C and a satisfy distinct renormalization 
group (RG) equations and boundary conditions. We shall not enter this subject here since the 
mathematical detail has been given elsewhere [2-5], but we infer the fact that vanishing of a auto- 
matically leads to vanishing of C and color confinement is realized. Indeed, it has been known for 
some time that gluons are confined when a vanishes [24,25], but now with the help of the BRS invariance 
we could conclude that not only gluons but also all the colored particles are simultaneously 
confined. We shall come back to this subject again in Sec. 5. 


4 Asymptotic Freedom 

In this section we shall review briefly how and why our attention was drawn to the non-Abelian 
gauge theory in describing strong interactions. In particle physics strongly interacting particles 
such as nucleons and pions are called hadrons. Hadrons are composite particles of quarks and 
antiquarks, however, and we have to study the origin of the strong interactions of quarks. 

We already know that strong interactions are mediated by the color gauge field and the quanta 
of this field are called the gluons since they glue up quarks together to form hadrons. Dynamics 
of quarks and gluons is called QCD as mentioned before. In the sixties experiments on the deep 
inelastic scattering of electrons on protons had been carried out. The differential cross-section had 
been measured by specifying the energy and direction of electrons without observing the hadrons 
in the final states. Then, apart from kinematical factors this differential cross-section can be 
expressed as a linear combination of two structure functions. They are functions of the square of 
the momentum transfer and the energy loss of the electron in the laboratory system. When these 
two variables increase indefinitely the two structure functions tend to be functions of the ratio of 
these two variables except for trivial kinematical factors. This characteristic behavior of structure 
functions is called the Bjorken scaling [26], and it is considered to be an empirical manifestation of 
the properties of strong interactions. What do we learn from this? In 1969 Feynman proposed the 
parton model and assumed that a nucleon consists of point-like partons moving almost freely inside 
the nucleon [27]. In order to keep the partons inside the nucleon, however, the four-momentum of 
a parton must be equal to a fraction x of the total four-momentum of the nucleon. The partons 
may be identified with the quarks and since x is identified with the ratio of the two kinematical 
variables referred to in the above the distribution of the fraction x has been shown to be related 
to the structure functions. 
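
For orientation, the scaling statement can be written compactly. This is a sketch in standard textbook notation rather than the notation of this article: the symbols $Q^2$, $\nu$, $M$, $W_{1,2}$, $F_{1,2}$, the parton charges $e_i$ and the parton distributions $f_i(x)$ are all assumptions introduced here for illustration.

```latex
\begin{align}
Q^2 &= -q^2 , \qquad \nu = E - E' \quad (\text{laboratory system}) , \qquad
x = \frac{Q^2}{2M\nu} , \\
M\,W_1(\nu, Q^2) &\to F_1(x) , \qquad \nu\,W_2(\nu, Q^2) \to F_2(x)
\qquad (Q^2,\ \nu \to \infty \ \text{with} \ x \ \text{fixed}) , \\
F_2(x) &= \sum_i e_i^2\, x\, f_i(x) .
\end{align}
```

Here $q$ is the four-momentum transfer, $\nu$ the energy loss of the electron, and $M$ the nucleon mass. The last line is the parton-model relation between a structure function and the distribution of the momentum fraction $x$.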

From the success of the parton model in reproducing the Bjorken scaling we may infer that 
quarks inside the hadrons are almost free and that the interactions of quarks turn out to be weaker 
at shorter distances. This is a distinctive feature of strong interactions and we may express it in 
the momentum space as follows: The probability of a process involving large momentum transfer 
in strong interactions is small. 

We look for a model satisfying this condition and find that only non-Abelian gauge interactions 
meet this requirement with the help of RG [15,16]. 

The concept of RG was first introduced by Stueckelberg and Petermann in 1953 [28], and it was 
further advanced by Gell-Mann and Low in QED in 1954 [29]. Let us consider a dielectric medium 
and put a positive test charge inside, then the medium is polarized, namely, negative charges are 
attracted and positive ones are repelled by this test charge. As a consequence it induces a new 
charge distribution in the medium. The total charge inside a sphere of radius r around the test 
charge is a function of r and we call it the running charge. The vacuum is an example of the 
dielectric media because of its ability of being polarized — the vacuum polarization. In this case 
the test charge is called the bare charge and the total charge inside a sphere of a sufficiently large 
radius is called the renormalized charge. The running charge is a function of the radius r, but it 
can also be regarded as a function of momentum transfer through the Fourier transformation. The 
bare charge then corresponds to the limiting value of the running charge for infinite momentum 
transfer. 

Gell-Mann and Low have proved on the basis of the RG method that given a finite renormal- 
ized charge the bare charge is equal to a certain finite constant independent of the value of the 
renormalized one or it is divergent [29]. The Bjorken scaling phrased in terms of RG implies that 
the bare coupling constant must be equal to zero. We shall refer to this property as asymptotic 
freedom (AF), and the non-Abelian gauge theory is the only known example in which AF is realized, 
as clarified by Gross and Wilczek and by Politzer [15,16]. The origin of AF may be traced 
back to the fact that the vector field introduces indefinite metric needed to realize AF and that 
the non-Abelian gauge theory is the only example involving non-linear interactions of the vector 
field. 
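
The contrast between screening in QED and antiscreening in QCD can be made concrete with the standard one-loop running couplings. The sketch below is not taken from this article: the one-loop formula, the function names, and the numerical inputs (a strong coupling of about 0.118 at a reference scale of about 91 GeV, five quark flavors, a single unit-charge fermion for QED) are illustrative assumptions.

```python
import math


def beta0_qcd(n_flavors: int, n_colors: int = 3) -> float:
    """One-loop coefficient b0 in d(alpha)/d ln(Q^2) = -b0 * alpha^2 for QCD."""
    return (11.0 * n_colors - 2.0 * n_flavors) / (12.0 * math.pi)


def beta0_qed(n_fermions: int = 1) -> float:
    """Same convention for QED with unit-charge Dirac fermions.

    The sign is negative: vacuum polarization screens the charge,
    so the coupling grows with momentum transfer.
    """
    return -n_fermions / (3.0 * math.pi)


def running_alpha(alpha_ref: float, q2: float, mu2: float, b0: float) -> float:
    """One-loop running coupling at scale q2, given its value at scale mu2."""
    denom = 1.0 + b0 * alpha_ref * math.log(q2 / mu2)
    if denom <= 0.0:
        # The one-loop expression has left its domain of validity.
        raise ValueError("one-loop coupling diverges at or below this scale")
    return alpha_ref / denom


if __name__ == "__main__":
    mu2 = 91.19 ** 2  # reference scale squared in GeV^2 (illustrative)
    for q in (10.0, 91.19, 1000.0, 1.0e6):
        a_s = running_alpha(0.118, q * q, mu2, beta0_qcd(5))
        a_em = running_alpha(1.0 / 137.0, q * q, mu2, beta0_qed(1))
        print(f"Q = {q:>10.2f} GeV  alpha_s = {a_s:.4f}  alpha_em = {a_em:.6f}")
```

With a positive coefficient the coupling decreases logarithmically as the momentum transfer grows, which is the one-loop expression of AF; with a negative coefficient, as in QED, it increases.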

Thus starting from the empirical Bjorken scaling we have finally reached the non-Abelian gauge 
theory of strong interactions, namely, QCD. 


5 Color Confinement 

Now we are ready to present the proof of color confinement, at least verbally, by combining argu- 
ments given in preceding sections. 

In QED the square of the ratio of the renormalized charge to the bare one is equal to the 
renormalization constant of the electromagnetic field. It is equal to the inverse of the dielectric 
constant of the vacuum relative to the empty geometrical space. Usually the dielectric constant of 
a dielectric medium is defined relative to the vacuum, but here we define it relative to the empty 
geometrical space or the void. 

This dielectric constant of the vacuum is larger than unity as a consequence of the positive- 
definite metric of the physical subspace, or more intuitively, it is a consequence of the screening 
effect due to the vacuum polarization. Then, let us consider a fictitious case in which the dielectric 
constant of the vacuum is smaller than unity. In this case we have antiscreening instead of screening 
when a test charge is placed in this fictitious vacuum, and such a vacuum is realized when a pair of 
virtual charged particles of indefinite metric should contribute to the vacuum polarization. In this 
case the running charge would be an increasing function of the radius r at least for small values of r. 
Next we shall consider an extreme case of the vanishing dielectric constant, then a small test charge 
would attract an unlimited amount of like charges around it thereby bringing the system into a 
catastrophic state of infinite charge. Nature would take safety measures to prevent such a state 
from emerging, and a possible resolution is to bring another test particle of the opposite charge. 
The total charge of the whole system is equal to zero and charge confinement would be realized. In 
QED, however, the dielectric constant of the vacuum or the inverse of the renormalization constant 
is larger than unity, and the above scenario reduces to a mere fiction. 

The situation in QCD is completely different since it allows introduction of indefinite metric 
in the vacuum polarization and AF is one of its manifestations. In QCD what corresponds to the 
dielectric constant of the vacuum in QED is the inverse of the renormalization constant of the 
color gauge field denoted by a in Sec. 3. If a should vanish we would encounter a scenario similar 
to the one mentioned above and a test color charge would induce an intolerable catastrophic state. 
In Sec. 3 we have shown that such a state is excluded by means of the subsidiary condition that 
selects physical states. Therefore, what can be realized are states of zero color charge and this is 
precisely color confinement. Unlike electric charge, color charge is not a simple additive quantum 
number but a member of a Lie algebra su(3), so that physically realizable states should belong to 
the one-dimensional representation of this algebra. Thus the entire problem of color confinement 
reduces to the proof that the constant a vanishes. 

Before presenting its proof we have to introduce the concept of the equivalence class of gauges 
[2,4,5]. When the difference between two Lagrangian densities is an exact operator we say that 
these two Lagrangian densities belong to the same equivalence class of gauges. For instance, two 
Lagrangian densities corresponding to two distinct values of the gauge parameter belong to the 
same equivalence class. In QCD hadrons are represented by BRS invariant composite operators 
[30-32], and the S matrix elements for hadron reactions are obtained by applying the reduction 
formula of Lehmann, Symanzik and Zimmermann [33] to Green’s functions defined as the vacuum 
expectation values of the time-ordered products of the BRS invariant composite operators. Then 
we can readily prove that the S matrix elements for hadron reactions are the same within the same 
equivalence class of gauges [2,4,5]. Color confinement signifies that the unitarity condition for the 
S matrix in the hadronic sector is saturated by hadronic intermediate states. That means that 
quarks and gluons have no place to show up in the unitarity condition just as longitudinal and 
scalar photons never appeared in the S matrix elements in QED. Therefore, we may take it for 
granted that the concept of color confinement is gauge-independent within the same equivalence 
class. 

Then we come back to the evaluation of the constant a. First, it should be stressed that a 
can be evaluated exactly as a function of the gauge coupling constant and the gauge parameter 
thanks to AF [2,5]. These two parameters define a two-dimensional parameter space, which is then 
decomposed into three domains according to the value of a, namely, zero, infinity and finite. It 
should be stressed here that the existence of these three domains can be proved without recourse 
to perturbation theory. Of these three domains color confinement is manifestly realized in the first 
one, and also in the other two confinement should prevail because of the gauge-independence of the 
concept of color confinement. Evaluation of a by means of RG based on AF is a very interesting 
mathematical problem, but we shall refer to the original paper for the technical detail [5]. 

Finally, it should be stressed that confinement as has been discussed in this paper is realized only 
when we have an unbroken non-Abelian gauge symmetry [2]. When a certain gauge symmetry is 
spontaneously broken the exact current generates a massless spin zero particle as Nambu-Goldstone 
boson and our proof of confinement breaks down. For instance, the electroweak interactions are 
formulated on the gauge group SU(2) × U(1), but spontaneous symmetry breaking reduces the 
gauge symmetry to the Abelian U(1) corresponding to the electromagnetic gauge symmetry. Thus 
the electroweak interactions do not possess any unbroken non-Abelian gauge symmetry and are 
not capable of confining any particle. 

To conclude, we have presented the flow of ideas towards intuitive understanding of the mech- 
anism of color confinement without recourse to mathematical detail, but interested readers are 
encouraged to refer to the original articles. 

The authors are grateful to Professor A. N. Mitra for kindly inviting us to contribute this article 
to the INSA book. 


References 

[1] M. Gell-Mann, Phys. Letters 8, 214 (1964). 

[2] The most extensive review article on the proof of color confinement in the context of the present 
approach is K. Nishijima, Czech. J. Phys. 46, 1 (1996). 

[3] K. Nishijima, Int. J. Mod. Phys. A9, 3799 (1994). 

[4] K. Nishijima, Int. J. Mod. Phys. A10, 3155 (1995). 

[5] K. Nishijima and N. Takase, Int. J. Mod. Phys. A11, 2281 (1996). 

[6] K. Nishijima, Int. J. Mod. Phys. B12, 1355 (1998). 

[7] K. G. Wilson, Phys. Rev. D14, 2455 (1974). 

[8] H. B. Nielsen and P. Olesen, Nucl. Phys. B6, 45 (1973). 

[9] Y. Nambu, Phys. Rev. D10, 4262 (1974). 

[10] S. Mandelstam, Phys. Reports C23, 245 (1976). 

[11] G. ’t Hooft, Nucl. Phys. B79, 276 (1974). 

[12] N. Seiberg and E. Witten, Nucl. Phys. B426, 19 (1994). 

[13] N. Seiberg and E. Witten, Nucl. Phys. B431, 484 (1994). 

[14] C. Becchi, A. Rouet and R. Stora, Ann. Phys. 98, 287 (1976). 

[15] D. J. Gross and F. Wilczek, Phys. Rev. Lett. 30, 1343 (1973). 

[16] H. D. Politzer, Phys. Rev. Lett. 30, 1346 (1973). 

[17] E. P. Wigner, Nuovo Cimento 3, 517 (1956). 

[18] E. P. Wigner, Rev. Mod. Phys. 29, 255 (1957). 

[19] N. Nakanishi, Prog. Theor. Phys. Suppl. No. 51, 1 (1972). 

[20] L. D. Faddeev and V. N. Popov, Phys. Lett. 25B, 29 (1967). 

[21] K. Nishijima, Nucl. Phys. B238, 601 (1984). 

[22] K. Nishijima, Prog. Theor. Phys. 80, 897 (1988). 

[23] T. Kugo and I. Ojima, Prog. Theor. Phys. Suppl. No. 66, 1 (1979). 

[24] K. Nishijima, Prog. Theor. Phys. 75, 1221 (1986). 

[25] R. Oehme, Phys. Lett. 195B, 60 (1987). 

[26] J. D. Bjorken, Phys. Rev. 179, 1547 (1969). 

[27] R. P. Feynman, Phys. Rev. Lett. 23, 1415 (1969). 

[28] E. C. G. Stueckelberg and A. Petermann, Helv. Phys. Acta 26, 499 (1953). 

[29] M. Gell-Mann and F. E. Low, Phys. Rev. 95, 1300 (1954). 

[30] R. Haag, Phys. Rev. 112, 669 (1958). 

[31] K. Nishijima, Phys. Rev. 111, 995 (1958). 

[32] W. Zimmermann, Nuovo Cimento 10, 597 (1958). 

[33] H. Lehmann, K. Symanzik and W. Zimmermann, Nuovo Cimento 1, 205 (1955). 




Part B : Topological Aspects Of QFT 

8. Topological Quantum Field Theories - A Meeting Ground For Physicists And 
Mathematicians by R. Kaul 

9. Quantum Field Theory And The Jones Polynomial by Edward Witten (Commun. Math. Phys. 
121, 351-399; 1989) (reproduced under permission from Springer-Verlag) 

10. Chiral Anomalies In Field Theories by H. Banerjee 

11. Coherent States In Field Theory by Wei-Min Zhang 

12. Pancharatnam, Bargmann And Berry Phases - A Retrospective by N. Mukunda 

13. The Skyrme Model For Baryons by J. Schechter and H. Weigel 




8. Topological QFT - 
A Meeting Ground for Physicists and 
Mathematicians 

Romesh K. Kaul * 

The Institute of Mathematical Sciences, 

Taramani, Chennai 600 113, India 


Abstract 

Topological quantum field theories can be used as a powerful tool to probe geometry and 
topology in low dimensions. Chern-Simons theories, which are examples of such field theories, 
provide a field theoretic framework for the study of knots and links in three dimensions. These 
are rare examples of quantum field theories which can be exactly (non-perturbatively) and 
explicitly solved. Abelian Chern-Simons theory provides a field theoretic interpretation of the 
linking and self-linking numbers of a link. In non-Abelian theories, vacuum expectation values 
of Wilson link operators yield a class of polynomial link invariants; the simplest of them is the 
famous Jones polynomial. Other invariants obtained are more powerful than that of Jones. 
Powerful methods for completely analytical and non-perturbative computation of these knot 
and link invariants have been developed. In the process answers to some of the open problems 
in knot theory are obtained. From these invariants for unoriented and framed links in $S^3$, an 
invariant for any three-manifold can be easily constructed by exploiting the Lickorish-Wallace 
surgery presentation of three-manifolds. This invariant up to a normalization is the partition 
function of the Chern-Simons field theory. Even the perturbative analysis of the Chern-Simons 
theories is rich in mathematical structure; it provides a field theoretic interpretation 
of Vassiliev knot invariants. Beyond mathematics, Chern-Simons theories also find important 
applications in three and four dimensional quantum gravity. 


Email: kaul@imsc.ernet.in 





1 Introduction 

Many a time advances in mathematics and physics have occurred hand in hand. Newton’s theory 
of mechanics and developments of the techniques of calculus are a classical example of this phe- 
nomenon. Another example is the developments in differential geometry inspired by the Maxwell 
theory of electromagnetism and Einstein theory of general relativity. A recent glorious example is 
the developments of topological quantum field theories and their relevance to the study of geometry 
and topology of low dimensional manifolds. 

The application of topological quantum field theories reflects the enormous interest generated 
both by mathematicians and field theoreticians in building a link between quantum physics through 
its path integral formulation on one hand and geometry and topology of low dimensional manifolds 
on the other. These are indeed deep links which are only now getting explored. It does appear that 
the properties of low dimensional manifolds can be very successfully unraveled by relating them 
to infinite dimensional manifolds of fields. This provides a powerful tool to study these manifolds 
notwithstanding the ‘lack of mathematical rigour’ in defining the functional integrals of quantum 
field theory. Indeed, an axiomatic formulation of topological quantum field theories has also been 
attempted. 

Topological quantum field theories are independent of the metric of the curved manifold on 
which these are defined; the expectation value of the energy-momentum tensor is zero, $\langle T_{\mu\nu} \rangle = 0$. 
These possess no local propagating degrees of freedom; the only degrees of freedom are topological. 
Operators of interest in such a theory are also metric independent. 

To illustrate how ideas of quantum field theory can be used to study topology of low dimensional 
manifolds, we shall focus our attention here on recent important developments in Chern-Simons 
gauge field theory as a topological quantum field theory on a three-manifold. This theory provides 
a field theoretic framework for the study of knots and links in a given three-manifold [1]-[5]. It was 
A.S. Schwarz who first conjectured [3] that the now famous Jones polynomial [6] may be related 
to Chern-Simons theory. E. Witten in his pioneering paper about ten years ago demonstrated 
this connection [2]. In addition, he set up a general field theoretic framework to study knots and 
links. Since then enormous effort has gone into developing an exact and explicit non-perturbative 
solution of this field theory. Many of the standard techniques of field theory find applications in 
these developments. The interplay between quantum field theory and knot theory has paid rich 
dividends in both directions. Many of the open problems in knot theory have found answers in the 
process. 

Wilson loop operators are the topological operators of the Chern-Simons gauge field theory. 
Their vacuum expectation values are the topological invariants for knots and links which do not 
depend on the exact shape, location or form of the knots and links but reflect only their topological 
properties. The power of this framework is so deep that it allows us to study these invariants not 
only on a simple manifold such as the three-sphere but also on any arbitrary three-manifold. 

The knot and link invariants obtained from these field theories are also intimately related to 
the integrable vertex models in two dimensions [7, 5]. These invariants have also been approached 
in different mathematical frameworks. A quantum group approach to these polynomial invariants 
has been developed [8]. The last decade or so has seen enormous activity in these directions in algebraic 
topology. 

A mathematically important development is that these link invariants provide a method of 
obtaining a specific topological invariant for three-manifolds [2, 9] in terms of invariants for framed 
unoriented links in $S^3$ [10, 5, 11]. In the following, we shall review these developments. 

Not only in mathematics, Chern-Simons theory has also played a major role in quantum gravity. 
Three-dimensional gravity with a negative cosmological constant, itself a topological field theory, 
can be described by two copies of SU(2) Chern-Simons theory. Even in four dimensional gravity, 
Chern-Simons theories find application. For example, the boundary degrees of freedom of a black 
hole in four dimensions, are described by an SU(2) Chern-Simons field theory. This has allowed 
an exact calculation of quantum entropy of a non-rotating black hole. The formula so obtained for 
a Schwarzschild black hole, while agreeing with the Bekenstein-Hawking formula for large areas, 
goes beyond the semi-classical result. 





Before explaining how a field theoretic framework for knots and links can be developed, let us 
start with a brief discussion of knots and links. 


2 Knots and links: an elementary introduction 


What is a knot? A smooth non-intersecting closed curve in a three-manifold is a knot. Oriented 
closed curves are oriented knots. A string with its ends joined in the shape of a circle without 
any entanglements is a model for the simplest non-intersecting closed curve, called the unknot. With 
a given knot, we associate a knot diagram obtained by projecting the knot on to a plane with a 
minimum number of double points. In such a diagram over-crossings and under-crossings are to be 
clearly marked. The number of double points in a knot diagram is called its crossing number. A 
few simple knots with low crossing numbers are: 



[Figure: Knots] 


Clearly, for a given minimum number of crossings, there can be more than one type of topo- 
logically inequivalent knots. The number of knots increases rapidly with the crossing number. 
For crossing number 9, there are 49 knots (not distinguishing mirror reflections), for 10 there are 
165 and for crossing number 11 we have 552 knots. For 13 crossings, there are more than 10,000 
different knots. 

What is a link? A collection of a number of oriented non-intersecting loops (knots) is an 
oriented link. A knot then is a single-component link. Links, like knots, can be represented by their 
two dimensional projection, the link diagrams with a minimum number of double points, but with 
the over-crossings and under-crossings clearly marked. Examples of a few two-component links 
are: 



[Figure: Links] 


To a topologist, length, thickness or precise shape of a knot are not of any interest. Two 
knots or links are to be identified if one can be made to go continuously into the other by shrinking 
or stretching or wiggling without snapping the string. There is a minimal set of elementary rules 
which encode these qualitative notions more precisely. These are the three Reidemeister moves 
which do not change the topological type of a link: 



[Figure: Reidemeister moves of Type I, Type II and Type III] 



Invariance under all these three moves is called invariance under ambient isotopy. If a quantity is 
invariant under type II and III moves only, but not under type I moves, it is said to be a regular 
isotopic invariant. 

The Reidemeister move III is of particular interest. It represents a defining relation for the 
generators of braids. In addition, it is a graphical representation of the Yang-Baxter relation of 
statistical mechanical models. These facts are not accidental but reflect a deep connection that 
knots and links have with braids and exactly solvable two-dimensional vertex models [7]. In fact 
this connection has been successfully exploited to obtain infinitely many new exactly solvable 
statistical mechanical models[5]. 

Though the Reidemeister rules are so simple, it is not an easy exercise in general to tell whether 
two given knots or links are topologically distinct or not. For example, it took nearly eighty years, 
from the knot tables of C.N. Little at the end of the nineteenth century to the work of K.A. Perko 
in 1972, to recognize that the knots in the figure below are isotopically equivalent [12]: 



Finding mathematical methods for distinguishing knots and links is indeed an important problem 
in knot theory. To this end, some definite invariants, called link invariants, are associated with 
the links. These are mathematical expressions which depend only on the isotopic type of the link 
and not on any of its particular representations. Some such invariants are in the form of polynomials. 
The first polynomial invariant was discovered in the late twenties by J.W. Alexander [13]. It took 
almost sixty more years before the next one was discovered by V.F.R. Jones [6]. The new invariant 
proved to be topologically more powerful than that of Alexander. For example, unlike the Alexander 
polynomial, the Jones polynomial does distinguish many mirror reflected knots. Soon after, a two 
variable generalization of the Jones invariant was found [14]. Though two knots with distinct Jones 
polynomials are necessarily isotopically distinct, the converse is not always true: there are examples 
of distinct knots with the same Jones polynomial. Still, Jones's work represents a leap forward in the 
development of knot theory. What is impressive about the topological field theoretic description 
of knots is that it provides a whole variety of link invariants in a straightforward manner. Of 
these, the Jones one-variable polynomial and its two-variable generalization are the simplest examples. 
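
The remark about mirror reflections can be checked on the simplest examples. The sketch below is illustrative only: the polynomial values for the trefoil and the figure-eight knot are quoted from standard knot tables rather than from this article, the dictionary representation and function names are our own, and which handedness of the trefoil carries the positive powers depends on convention. It assumes the standard fact that taking the mirror image replaces the variable t by 1/t.

```python
# A Laurent polynomial in t is stored as {exponent: coefficient}.
JONES_TREFOIL = {1: 1, 3: 1, 4: -1}               # t + t^3 - t^4
ALEXANDER_TREFOIL = {1: 1, 0: -1, -1: 1}          # t - 1 + 1/t
JONES_FIGURE_EIGHT = {2: 1, 1: -1, 0: 1, -1: -1, -2: 1}


def mirror(poly):
    """Polynomial of the mirror image: substitute t -> 1/t (negate exponents)."""
    return {-exponent: coeff for exponent, coeff in poly.items()}


def distinguishes_mirror(poly):
    """True if the invariant takes different values on a knot and its mirror image."""
    return mirror(poly) != poly


if __name__ == "__main__":
    print("Jones, trefoil:      ", distinguishes_mirror(JONES_TREFOIL))
    print("Alexander, trefoil:  ", distinguishes_mirror(ALEXANDER_TREFOIL))
    print("Jones, figure-eight: ", distinguishes_mirror(JONES_FIGURE_EIGHT))
```

The Jones polynomial separates the trefoil from its mirror image while the Alexander polynomial does not. For the figure-eight knot the Jones polynomial is symmetric, as it must be, since that knot is equivalent to its own mirror image.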

Before starting a discussion of knots and links in terms of quantum field theory, let us make a 
few historical remarks about knots and links in physics. 





A few historical remarks: Knots and links first captured the imagination of physicists 
when Lord Kelvin (William Thomson) introduced them as early as 1857 as fluid-mechanical models 
of atoms [15]. Reluctant to accept the prevailing notion of an infinitely rigid point-like atom, he 
thought of atoms as vortex-lines in a perfect homogeneous fluid, the ether. Different sorts of atoms 
were then to differ in accordance with the number of intersections of these vortex rings. “Stability” 
of the atoms in this theory thus is a reflection of the fact that knots do preserve their essential 
knottedness during their movement. Indeed Lord Kelvin would have wanted to develop a new 
theory of gases, theory of elastic solids and liquids based on the dynamics of these vortex atoms 
- a programme he did not complete nor was considered by later day physicists worth while in this 
context. However, a new area of mathematics, knot theory, was born. 

Two contemporary Scottish physicists, J.C. Maxwell and P.G. Tait did find Lord Kelvin’s 
hypothesis attractive enough. Tait had hoped to explain the position of lines in the spectrum of a 
chemical element from the knot type representing it. Thus, it was natural for him to attempt the 
formidable task of classifying knots in 3-space. For this he needed some measure of complexity of 
a knot. Thus the concept of the degree of knottedness was introduced. This is what we nowadays 
call crossing number of a knot, a notion already defined above. Tait with this notion of crossing 
number, produced the first knot tables, listing knots in order of their increasing knottedness. If 
atoms had been really knots, we would have been studying these tables instead of the periodic table 
of chemical elements in our schools. 

Since the pioneering work of these physicists, knot theory was solely investigated by mathe- 
maticians till about ten years ago when physicists came back to it through quantum field theories. 
This brings us to modern field theoretic interpretation of knots in three dimensions. 


3 Abelian Chern-Simons field theory and knots and links 

In a field theory, the properties of a system of infinitely many oscillators are represented collectively 
by a field, $\phi(x)$, defined over all the space through the space label $x$. An action functional 
is prescribed for these fields. For example, for a one-component scalar field $\phi(x)$, say in three 
dimensional flat Euclidean space $\mathbb{R}^3$, the action functional may be taken to be: 

$$S[\phi] = \frac{1}{2} \int d^3x \, \delta^{\mu\nu} \, \partial_\mu \phi(x) \, \partial_\nu \phi(x) ,$$

where $\mu, \nu = 1, 2, 3$ are space indices and for $\mathbb{R}^3$, the metric is flat, $\delta_{\mu\nu} = \mathrm{diag}(1, 1, 1)$. For 
a theory defined over a general curved three-manifold endowed with a metric $g_{\mu\nu}(x)$ (and its inverse 
$g^{\mu\nu}(x)$), this action generalizes to: 

$$S[\phi] = \frac{1}{2} \int d^3x \, \sqrt{g(x)} \, g^{\mu\nu}(x) \, \partial_\mu \phi(x) \, \partial_\nu \phi(x) ,$$

where $g(x) = \det g_{\mu\nu}(x)$. 

Similarly for a vector field $A_\mu(x)$, the gauge field of the Maxwell theory in three dimensions, 
we write the action functional as: 

$$S[A_\mu] = \int d^3x \, \sqrt{g(x)} \, g^{\mu\alpha}(x) \, g^{\nu\beta}(x) \, [\partial_\mu A_\nu(x) - \partial_\nu A_\mu(x)] \, [\partial_\alpha A_\beta(x) - \partial_\beta A_\alpha(x)] .$$

Both these actions above are invariant under general coordinate transformations. 

Quantum field theories normally studied, like the examples above, depend on the metric $g_{\mu\nu}$
of the three-manifold in which the theory is defined. The metric describes the geometric
properties, such as distances, curvature etc. But, here we are interested in attempting a field theoretic
description of knots and links in such a way that only their topological properties are represented. 
Their size, exact shape, location etc are not of our concern. The topological properties, unlike 
these, do not depend on the metric. Thus we are seeking a field theory which is independent of the 
metric. Such theories are called topological field theories. A simple example of metric independent
field theory is the Chern-Simons gauge theory. Its action in the Abelian version is given (with
convenient normalization) by: 

$$kS[A_\mu] = \frac{k}{8\pi}\int_{S^3} d^3x\; \epsilon^{\mu\nu\alpha}\, A_\mu(x)\,\partial_\nu A_\alpha(x) \qquad (1)$$

where $\epsilon^{\mu\nu\alpha}$ is a completely anti-symmetric contravariant three-tensor density whose
only independent nonzero component is $\epsilon^{123} = 1$. For definiteness, we shall discuss this
theory in a three-manifold $S^3$. Clearly this action is independent of the metric. Also it is
invariant under general coordinate transformations. Like the Maxwell theory, this theory exhibits
a gauge invariance.

The quantum version of this theory is described by the functional integral representing the
partition function:

$$Z = \int [dA]\; e^{ikS[A]} \qquad (2)$$

and for metric independent gauge invariant functionals $W[A_\mu]$ of the gauge field $A_\mu(x)$, we have
the functional averages (vacuum expectation values of the associated operators):

$$\langle W \rangle = Z^{-1}\int [dA]\; W[A]\; e^{ikS[A]} \qquad (3)$$

Though the action and gauge invariant functionals $W$ do not depend on the metric, there are
potential sources which can introduce metric dependence in these functional averages. The
functional integration may be thought of as being done by discretizing the space into a mesh.
Infinitely many ordinary integrals over $A_\mu(x)$ at each point $x$ of the mesh are to be done and
finally the limit of mesh size going to zero is taken in some well defined manner. This is the usual
way we understand these infinite dimensional integrals. Further, there is a gauge invariance in the
theory, which, as in other gauge theories, needs to be fixed by a choice of gauge. Both the choice of
mesh and the gauge fixing condition are generically metric dependent. Thus the gauge fixed measure
of integration $[dA_\mu(x)]$ in a field theory defined on a curved space in general depends on the
metric. However, despite this, it can be shown that the various metric dependences conspire in this
topological theory to cancel out, leaving the functional averages metric independent [16].

Now let us give an explicit form of a topological operator $W$ in this Abelian Chern-Simons
theory. Consider a link $L$ made up of knots $K_1, K_2, \ldots, K_s$. The Wilson knot operator for each
of these knots $K_l$ is given by $\exp\big[i n_l \oint_{K_l} dx^\mu A_\mu(x)\big]$ where $n_l$ is an integer
measuring the charge on the loop. Clearly these are independent of the metric. Then the Wilson link
operator is the product of all such knot operators:

$$W[L] = \prod_{l=1}^{s} \exp\Big[i n_l \oint_{K_l} dx^\mu A_\mu(x)\Big] \qquad (4)$$

If we expand the exponential here, the expectation value $\langle W[L]\rangle$ is given by the expectation
values of the various terms in this expansion. Since this is a non-interacting theory, all these
expectation values are given in terms of the "two-loop" expectation values only:

$$\Big\langle \oint_{K_l} dx^\mu A_\mu(x) \oint_{K_m} dy^\nu A_\nu(y) \Big\rangle \quad \text{and} \quad \Big\langle \oint_{K} dx^\mu A_\mu(x) \oint_{K} dy^\nu A_\nu(y) \Big\rangle \qquad (5)$$

Here in the first expression the two loops are distinct in contrast to the second expression where
both the loop integrals are along the same knot. Clearly, these expressions can be easily evaluated
in terms of the two-point correlator $\langle A_\mu(x) A_\nu(y)\rangle$. To do this, we can locally identify the
region containing our link with $E^3$ so that we can use the flat metric in this region. Then $x^\mu$
and $y^\mu$ are the Euclidean flat coordinates along the two knots $K_l$ and $K_m$ respectively. This allows
us to do away with the complications connected with the curved nature of the three-manifold $S^3$;
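we can do all our calculations in flat Euclidean space without loss of generality. Elementary field
theory allows us to read off the flat space two-point correlator from the action (subject to a gauge
condition, which we choose to be the covariant Lorentz gauge $\partial^\mu A_\mu = 0$):

$$\langle A_\mu(x)\, A_\nu(y)\rangle = \frac{i}{k}\; \epsilon_{\mu\nu\alpha}\, \frac{(x-y)^\alpha}{|x-y|^3}$$

so that

$$\Big\langle \oint_{K_l} dx^\mu A_\mu(x) \oint_{K_m} dy^\nu A_\nu(y) \Big\rangle = \frac{4\pi i}{k}\; \mathcal{L}(K_l, K_m) ,$$

$$\mathcal{L}(K_l, K_m) = \frac{1}{4\pi} \oint_{K_l} dx^\mu \oint_{K_m} dy^\nu\; \epsilon_{\mu\nu\alpha}\, \frac{(x-y)^\alpha}{|x-y|^3} \qquad (6)$$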

This double loop integral over two distinct knots ($K_l \neq K_m$) is a well known topological invariant,
called the Gauss linking number of the two closed curves. It measures the number of times one knot $K_l$
goes through the other knot $K_m$. Clearly, the linking number of two knots is an integer. For example,
for the right-handed Hopf link $H_+$,



Right-handed Hopf link H+ 


its value is +1. Its value for the mirror reflection of this link (left-handed Hopf) is -1. Linking 
number does not depend on the exact location of the two knots, nor on their size or shape. 
It depends only on their topological relationship with each other. This invariant has a physical 
interpretation due to Maxwell - in electrodynamics, it represents the work done to move a magnetic 
monopole around one knot in three-space while an electric current runs through the other knot. 
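
The Gauss double integral can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the two loop integrals by sums over short chords, and the particular circles used for the Hopf link, its mirror image and an unlinked pair are illustrative choices made here. The overall sign depends on the chosen orientations, so only the magnitude and the sign flip under reflection are checked.

```python
import math

def segments(curve, n):
    """Split a closed curve (parameter in [0, 2*pi]) into n chords.

    Returns a list of (midpoint, chord_vector) pairs.
    """
    pts = [curve(2.0 * math.pi * i / n) for i in range(n + 1)]
    out = []
    for p, q in zip(pts, pts[1:]):
        mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
        chord = tuple(b - a for a, b in zip(p, q))
        out.append((mid, chord))
    return out

def gauss_linking_number(curve_a, curve_b, n=200):
    """Riemann-sum estimate of
    (1/4pi) * oint oint eps_{mu nu alpha} dx^mu dy^nu (x - y)^alpha / |x - y|^3."""
    total = 0.0
    segs_b = segments(curve_b, n)
    for x, dx in segments(curve_a, n):
        for y, dy in segs_b:
            r = (x[0] - y[0], x[1] - y[1], x[2] - y[2])
            cross = (dx[1] * dy[2] - dx[2] * dy[1],
                     dx[2] * dy[0] - dx[0] * dy[2],
                     dx[0] * dy[1] - dx[1] * dy[0])
            dist3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
            total += (cross[0] * r[0] + cross[1] * r[1] + cross[2] * r[2]) / dist3
    return total / (4.0 * math.pi)

# Illustrative curves (assumptions of this sketch, not taken from the text).
def circle_a(t):            # unit circle in the xy-plane
    return (math.cos(t), math.sin(t), 0.0)

def circle_b(t):            # unit circle in the xz-plane passing through the centre of circle_a
    return (1.0 + math.cos(t), 0.0, math.sin(t))

def mirror_b(t):            # reflection of circle_b in the plane z = 0
    return (1.0 + math.cos(t), 0.0, -math.sin(t))

def far_b(t):               # same shape moved away, so the two circles are unlinked
    return (5.0 + math.cos(t), 0.0, math.sin(t))

hopf = gauss_linking_number(circle_a, circle_b)
mirror = gauss_linking_number(circle_a, mirror_b)
unlinked = gauss_linking_number(circle_a, far_b)
```

Moving or deforming either circle without passing it through the other leaves the rounded value unchanged, which is the metric independence discussed in the text.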

The Abelian Chern-Simons theory also provides a representation for yet another simple
topological quantity associated with an individual knot, called its self-linking number and
sometimes also framing number or simply framing. This is related to the second expectation value
given in (5) where the two loop integrals are over the same knot. This expectation value is to be
evaluated through a limiting procedure: To a knot $K$ parametrized by $x^\mu(s)$ ($0 \le s \le L$) along
the length of the knot by the parameter $s$, associate another closed curve $K_f$, called its frame,
given by coordinates $y^\mu(s) = x^\mu(s) + \epsilon\, n^\mu(s)$ where $\epsilon$ is a small parameter and
$n^\mu(s)$ is a unit vector field normal (principal normal) to the curve at $s$. That is, $K_f$ is the
curve $K$ displaced along the normal by a small amount. Then the linking number of the curve $K$ and
its frame $K_f$ is called the self-linking number $S\mathcal{L}(K)$ of the knot:

$$\Big\langle \oint_{K} dx^\mu A_\mu(x) \oint_{K} dy^\nu A_\nu(y) \Big\rangle = \lim_{\epsilon \to 0} \Big\langle \oint_{K} dx^\mu A_\mu(x) \oint_{K_f} dy^\nu A_\nu(y) \Big\rangle = \frac{4\pi i}{k}\; \mathcal{L}(K, K_f) = \frac{4\pi i}{k}\; S\mathcal{L}(K) \qquad (7)$$

This self-linking number is independent of the parameter e and can easily be shown to obey the 
following important theorem, first proven by G. Calugareanu almost forty years ago [17]: 

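Calugareanu theorem: The self-linking number of a knot is the sum of its twist and writhe
numbers:

$$S\mathcal{L}(K) = T(K) + w(K) ,$$

$$T(K) = \frac{1}{2\pi} \oint_K ds\; \epsilon_{\mu\nu\alpha}\, \frac{dx^\mu}{ds}\, n^\nu\, \frac{dn^\alpha}{ds} , \qquad w(K) = \frac{1}{4\pi} \oint_K ds \oint_K dt\; \epsilon_{\mu\nu\alpha}\, e^\mu\, \frac{\partial e^\nu}{\partial s}\, \frac{\partial e^\alpha}{\partial t} \qquad (8)$$

where the vector field

$$e^\mu(s, t) = \frac{y^\mu(t) - x^\mu(s)}{|y(t) - x(s)|}$$

is a map $K \otimes K \to S^2$ and $n^\mu(s)$ is the normal vector field along the length of the curve
$K$ ($x^\mu(s)$, $0 \le s \le L$). The quantities $T(K)$ and $w(K)$ represent well defined geometric properties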
of the knot. T{K) represents the twist in the knot K with reference to its frame Kf and w{K) 
is the amount of writhe or coiling of the knot. Clearly, the twist number and writhe number are 
not necessarily integers nor are they ambient isotopic invariants. But their sum, the self-linking 
number, is indeed an integer and also an ambient isotopic invariant. This theorem can be easily 
appreciated if we recall that stretching a coiled up telephone cord reduces its coils but increases 
its twist and loosening of a twisted cord coils it up. The amount of coils lost (or gained) is exactly 
the same as the amount by which the twisting is gained (or lost) so that their sum is always 
unchanged. This theorem of Calugareanu when applied to circular ribbon (which can be thought 
of as a framed closed curve) has been put to good use in the study of the properties of circular 
polymers and circular DNA [18]. 

Notice that the self-linking number does carry dependence on the frame. The mathematical 
concept of framing of a knot is intimately connected to the concept of regularization in field theory. 
In order to avoid the coincidence singularity in the two-point correlator $\lim_{x \to y}\langle A_\mu(x) A_\nu(y)\rangle$, we
need to regularize it, say by point-splitting. Evaluating the 'two-loop' correlator of Eqn.(5), where the
two loops are the same, we face this same divergence, which, through framing, has been resolved by
'loop-splitting'. Ordinarily, quantities in field theory do depend on the regularization. Likewise
the self-linking number here depends on the framing. But all those framing curves enveloping
around the knot, which can be continuously deformed into each other without snapping the knot, 
form a topological class for which the self-linking number does not change. In field theory language, 
framing provides a topological regularization. 
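
The frame dependence of the self-linking number can be made concrete with a short numerical experiment. This is only a sketch under assumptions made here: the knot is a flat unit circle, the frame is pushed off by a small distance `eps` along a normal that rotates `twists` times around the circle, and the linking number is estimated by a chord sum for the Gauss integral. Since a flat circle has zero writhe, the result is expected to equal the number of twists up to an orientation-dependent sign.

```python
import math

def segments(curve, n):
    """Chords of a closed curve as (midpoint, chord_vector) pairs."""
    pts = [curve(2.0 * math.pi * i / n) for i in range(n + 1)]
    return [(tuple((a + b) / 2.0 for a, b in zip(p, q)),
             tuple(b - a for a, b in zip(p, q)))
            for p, q in zip(pts, pts[1:])]

def linking_number(curve_a, curve_b, n=300):
    """Chord-sum estimate of the Gauss linking integral of two closed curves."""
    total = 0.0
    segs_b = segments(curve_b, n)
    for x, dx in segments(curve_a, n):
        for y, dy in segs_b:
            r = (x[0] - y[0], x[1] - y[1], x[2] - y[2])
            cross = (dx[1] * dy[2] - dx[2] * dy[1],
                     dx[2] * dy[0] - dx[0] * dy[2],
                     dx[0] * dy[1] - dx[1] * dy[0])
            dist3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
            total += (cross[0] * r[0] + cross[1] * r[1] + cross[2] * r[2]) / dist3
    return total / (4.0 * math.pi)

def knot(s):
    """A flat unit circle, used here as the simplest closed curve."""
    return (math.cos(s), math.sin(s), 0.0)

def make_frame(twists, eps=0.3):
    """Frame curve x(s) + eps * n(s), where the unit normal n(s) rotates
    `twists` times about the tangent as s goes once around the circle."""
    def frame(s):
        radial = 1.0 + eps * math.cos(twists * s)   # offset along the radial normal
        height = eps * math.sin(twists * s)         # offset along the binormal (z axis)
        return (radial * math.cos(s), radial * math.sin(s), height)
    return frame

def self_linking(twists):
    return linking_number(knot, make_frame(twists))

values = [self_linking(t) for t in (0, 1, 2)]
```

Frames with the same number of twists can be deformed into each other without crossing the knot and give the same integer, which is the sense in which framing acts as a topological regularization.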

Now collecting all these pieces of information, the expectation value of the Wilson link operator
for a link $L = (K_1, K_2, \ldots, K_s)$ in the Abelian Chern-Simons theory on $S^3$ can be written down in
terms of the linking and self-linking (framing) numbers as:

$$\langle W[L]\rangle = \exp\Big[-\frac{2\pi i}{k}\Big(\sum_{l} n_l^2\; S\mathcal{L}(K_l) + \sum_{l \neq m} n_l n_m\; \mathcal{L}(K_l, K_m)\Big)\Big] \qquad (9)$$


Thus, we have indicated here how this simple field theory does indeed, through expectation 
values of Wilson link operators, provide a field theoretic interpretation of some of the topological 
invariants, linking number and self-linking number of knots and links. Non-Abelian Chern-Simons 
theories are much richer in their structure; these capture even more complex topological properties 
of knots and links. 


4 Non-Abelian Chern-Simons field theory as a description 
of knots and links 


A non-Abelian Chern-Simons theory, instead of being a gauge theory of one vector field, involves,
say for gauge group SU(2), three such fields, $A^a_\mu$ ($a = 1, 2, 3$). These three are collectively written
as a matrix valued vector field $A_\mu = A^a_\mu T^a$ where the anti-hermitian matrices $T^a$ are the generators
of the group SU(2). The action functional defined in a three-manifold, say $S^3$, is given by:

$$kS = \frac{k}{4\pi} \int_{S^3} d^3x\; \epsilon^{\mu\nu\alpha}\; \mathrm{tr}\Big[A_\mu(x)\,\partial_\nu A_\alpha(x) + \frac{2}{3}\, A_\mu(x) A_\nu(x) A_\alpha(x)\Big] \qquad (10)$$

Like the Abelian Chern-Simons theory, this action has no metric dependence. Besides gauge
invariance, it is also invariant under general coordinate transformations.





The topological operators are the Wilson loop (knot) operators defined as

$$W_j[K] = \mathrm{tr}_j\, P \exp \oint_K dx^\mu A^a_\mu T^a_j \qquad (11)$$

for an oriented knot $K$ carrying spin $j$ representation reflected by the associated representation
matrices $T^a_j$ ($a = 1, 2, 3$). The symbol $P$ stands for path ordering of the exponential. This is
done by breaking the length of the knot $K$ into infinitesimal intervals of size $dx^\mu_m$ around the points
labeled by the coordinates $x^\mu_m$ along the knot. Then the path ordered exponential is:

$$P \exp \oint_K dx^\mu A^a_\mu T^a_j = \prod_m \big[1 + dx^\mu_m A^a_\mu(x_m) T^a_j\big]$$
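
The ordered product that defines the path ordered exponential can be tried out on a toy configuration. The sketch below is illustrative only and makes several assumptions not taken from the text: the loop is a square in the xy-plane, the gauge field is constant with $A_x = T^1$ and $A_y = T^2$, the generators are taken as $T^a = (i/2)\sigma^a$ in the spin 1/2 representation, and later factors are multiplied on the left. It compares the product of the factors $(1 + dx^\mu A_\mu)$ with the product of exact edge exponentials.

```python
import math

# 2x2 complex matrices as nested tuples; T^a = (i/2) sigma^a are anti-hermitian.
ONE = ((1.0, 0.0), (0.0, 1.0))
T1 = ((0.0, 0.5j), (0.5j, 0.0))
T2 = ((0.0, 0.5), (-0.5, 0.0))

# Edges of the square loop in traversal order: +x, +y, -x, -y.
EDGES = ((T1, 1), (T2, 1), (T1, -1), (T2, -1))

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def one_plus(step, gen):
    """The factor 1 + dx * A for one small step of signed length `step`."""
    return tuple(tuple(ONE[i][j] + step * gen[i][j] for j in range(2)) for i in range(2))

def exp_generator(theta, gen):
    """Closed form exp(theta * T) = cos(theta/2) + 2 sin(theta/2) T for T = (i/2) sigma."""
    c, s = math.cos(theta / 2.0), 2.0 * math.sin(theta / 2.0)
    return tuple(tuple(c * ONE[i][j] + s * gen[i][j] for j in range(2)) for i in range(2))

def trace(a):
    return a[0][0] + a[1][1]

def wilson_loop_discrete(side, steps):
    """Trace of the ordered product of (1 + dx^mu A_mu) around the square."""
    h = side / steps
    holonomy = ONE
    for gen, sign in EDGES:
        factor = one_plus(sign * h, gen)
        for _ in range(steps):
            holonomy = matmul(factor, holonomy)   # later steps act on the left
    return trace(holonomy)

def wilson_loop_exact(side):
    """Same loop, with each straight edge exponentiated in closed form."""
    holonomy = ONE
    for gen, sign in EDGES:
        holonomy = matmul(exp_generator(sign * side, gen), holonomy)
    return trace(holonomy)

approx = wilson_loop_discrete(1.0, 4000)
exact = wilson_loop_exact(1.0)
```

Because $T^1$ and $T^2$ do not commute, the trace differs from the value 2 that the identity matrix would give, and the ordering of the factors matters; the discretized product approaches the exact value as the step size shrinks.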

For a link $L$ made up of oriented component knots $K_1, K_2, \ldots, K_s$ carrying spin
$j_1, j_2, \ldots, j_s$ representations respectively, we have the Wilson link operator defined as

$$W_{j_1 j_2 \ldots j_s}[L] = \prod_{l=1}^{s} W_{j_l}[K_l] \qquad (12)$$

We are interested in the functional averages of these operators:

$$V_{j_1 j_2 \ldots j_s}[L] = Z^{-1} \int [dA]\; W_{j_1 j_2 \ldots j_s}[L]\; e^{ikS} , \quad \text{where} \quad Z = \int [dA]\; e^{ikS} \qquad (13)$$

Here the integrands in the functional integrals are metric independent. So is the measure [16].
Therefore, these expectation values depend only on the isotopy type of the oriented link $L$ and the
set of representations $j_1, j_2, \ldots, j_s$ associated with the component knots.

These expectation values can be obtained non-perturbatively. For example, for knots and links
carrying only the spin 1/2 representations, Witten has shown that the link invariants (expectation
values of the associated Wilson link operators) satisfy a simple relation. This relation is given for
three link diagrams which are identical everywhere except for one crossing where they differ in
that it is an over-crossing ($L_+$), or no-crossing ($L_0$) or an under-crossing ($L_-$) as shown in the
figure below:

Over-crossing ($L_+$)    No-crossing ($L_0$)    Under-crossing ($L_-$)

Then the invariants for such links are related as:

$$q\, V_{1/2}[L_+] - q^{-1}\, V_{1/2}[L_-] = (q^{1/2} - q^{-1/2})\, V_{1/2}[L_0] \qquad (14)$$

where $q$ is a root of unity related to the Chern-Simons coupling $k$ through the relation
$q = \exp[2\pi i/(k + 2)]$. This is precisely the well known generating skein relation for the Jones
polynomials. Indeed $V_{1/2}[L]$, which is the expectation value of the Wilson link operator where every
component knot carries the doublet spin 1/2 representation of SU(2), is the one-variable Jones
polynomial.

The above skein relation is powerful enough that it recursively yields the Jones polynomial for any
arbitrary link. For example consider the following three link diagrams:

We use an important factorization property of these invariants: the link invariant of two distant
(disjoint) links (that is, with no mutual entanglement) is simply the product of invariants for the
individual links. That is, for the link $L_0$ above, $V_{1/2}[U \cup U] = (V_{1/2}[U])^2$, where the symbol $U$
represents the unknot. Then use of the skein relation yields:

$$q\, V_{1/2}[U] - q^{-1}\, V_{1/2}[U] = (q^{1/2} - q^{-1/2})\, (V_{1/2}[U])^2$$

so that the spin 1/2 invariant for an unknot is given by: $V_{1/2}[U] = q^{1/2} + q^{-1/2}$.

Next apply the skein relation to three links, where $L_+$ is the right-handed Hopf link, $L_-$ is
simply the union of two (unlinked) unknots and $L_0$ is an unknot:

Right-handed Hopf link $H_+$    Two unknots $U \cup U$    Unknot $U$

This yields the invariant for the right-handed Hopf link as: $V_{1/2}[H_+] = 1 + q^{-1} + q^{-2} + q^{-3}$.
Now use the recursion relation for the three links

where $L_+$ is a right-handed trefoil ($T_+$), $L_-$ is an unknot and $L_0$ is a right-handed Hopf link. This
gives us the invariant for the trefoil knot as $V_{1/2}[T_+] = q^{-1/2} + q^{-3/2} + q^{-5/2} - q^{-9/2}$. This way
the invariant for any arbitrary link can be recursively obtained.
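
The recursion just described can be carried out mechanically. The following is a minimal sketch, assuming the skein relation in the form $q V[L_+] - q^{-1} V[L_-] = (q^{1/2} - q^{-1/2}) V[L_0]$ together with the factorization property; the dictionary representation of Laurent polynomials in $q^{1/2}$ (keys are twice the exponent of $q$) and all function names are choices made here for illustration.

```python
def add(p, r):
    """Sum of two Laurent polynomials stored as {2*exponent_of_q: coefficient}."""
    out = dict(p)
    for e, c in r.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mul(p, r):
    """Product of two Laurent polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in r.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def neg(p):
    return {e: -c for e, c in p.items()}

Q = {2: 1}                 # q
Q_INV = {-2: 1}            # q^{-1}
Z = {1: 1, -1: -1}         # q^{1/2} - q^{-1/2}

def solve_for_l_plus(v_minus, v_zero):
    """V[L+] = q^{-1} * (q^{-1} V[L-] + (q^{1/2} - q^{-1/2}) V[L0])."""
    return mul(Q_INV, add(mul(Q_INV, v_minus), mul(Z, v_zero)))

# Unknot: q^{1/2} + q^{-1/2}; it must satisfy q V - q^{-1} V = z V^2.
unknot = {1: 1, -1: 1}
unknot_lhs = add(mul(Q, unknot), neg(mul(Q_INV, unknot)))
unknot_rhs = mul(Z, mul(unknot, unknot))

# Right-handed Hopf link: L- is two unlinked unknots, L0 is an unknot.
hopf = solve_for_l_plus(mul(unknot, unknot), unknot)

# Right-handed trefoil: L- is an unknot, L0 is the right-handed Hopf link.
trefoil = solve_for_l_plus(unknot, hopf)
```

Reading the keys as twice the power of $q$, `hopf` is $1 + q^{-1} + q^{-2} + q^{-3}$ and `trefoil` is $q^{-1/2} + q^{-3/2} + q^{-5/2} - q^{-9/2}$.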

The Jones polynomial is in fact the simplest example of a whole host of new link invariants
that emerge naturally from this field theory. More general invariants are the expectation values
of Wilson link operators with arbitrary spin representations placed on the knots. The formalism
also allows for placing different representations on each of the component knots. This leads to
so-called coloured polynomial invariants. Besides, instead of the gauge group SU(2), Chern-Simons
theory based on any other semi-simple group can be used. These then yield an even richer spectrum
of new invariants.

While the Jones polynomial can be obtained by recursive use of the skein relation, other more
general invariants (for spin representations $j = 1, 3/2, \ldots$) cannot be obtained in this manner. Of
course there are generalizations of the skein relations for arbitrary spin invariants. But these do
not possess recursively complete solutions (except for the spin 1/2 case above). Therefore methods had
to be developed to obtain expectation values of Wilson operators with arbitrary representations
living on the component knots of a link. One such method in its complete form has been
presented in ref [4]. This allows us to present a complete and explicit solution of the Chern-Simons
theory. This is a non-perturbative method which, generalizing the formalism set up by Witten,
makes use of two ingredients, one from quantum field theory and the other from the mathematics of braids:

(i) Field theoretic input: Chern-Simons theory on a three-manifold with boundary is essentially 
characterized by a corresponding two dimensional Wess-Zumino conformal field theory on that 
boundary [2]: 



SU(2) CS theory with coupling $k$ on $M$    $\longleftrightarrow$    $SU(2)_k$ WZ theory on $\Sigma$

And the Chern-Simons functional average for Wilson lines ending at $n$ points in the boundary is
described by the associated Wess-Zumino theory on the boundary with $n$ punctures carrying the
representations of the free Wilson lines:

SU(2) CS theory with coupling $k$ on the manifold $M$ with Wilson lines ending at $n$ points in the
boundary $\Sigma$    $\longleftrightarrow$    $SU(2)_k$ WZ theory on $\Sigma$ with $n$ punctures carrying primary fields
in representations $j_1, j_2, \ldots, j_n$

The Chern-Simons functional integral can be represented [2] by a vector in the Hilbert space $\mathcal{H}$
associated with the space of $n$-point correlators of the Wess-Zumino conformal field theory on the
boundary $\Sigma$. In fact, these correlators provide a basis for this boundary Hilbert space. There is
more than one possible basis. These different bases are related by duality of the correlators of the
conformal field theory [4].

(ii) Mathematical input: The second ingredient used is the close connection knots and links
have with braids. An n-braid is a collection of non-intersecting strands connecting n points on a
horizontal plane to n points on another horizontal plane directly below the first set of n points. 
The strands are not allowed to go back upwards at any point in their travel. The braid may be 
projected onto a plane with the two horizontal planes collapsing to two parallel rigid rods. The 
over-crossings and under-crossings of the strands are to be clearly marked. When all the strands 
are identical, we have ordinary braids. The theory of such braids, first developed by Artin, is 
well studied. These braids form a group. However, we may wish to orient the individual strands 
and further distinguish them by putting different colours on them. These different colours are 
represented by different SU{2) spins. These braids, unlike braids made from unoriented identical 
strands, have a more general structure than a group. These instead form a groupoid. The necessary 
aspects of the theory of such braids have been presented in ref. [4] 

One way of relating the braids to knots and links is through closure of braids. We obtain the
closure of a braid by connecting the ends of the first, second, third, ... strands from above to the
ends of the respective first, second, third, ... strands from below as shown in (A):



(A) Closure of a braid (B) Platting of a braid 


There is a theorem by Alexander [19] which states that any knot or link can be obtained as closure 
of a braid. This construction of a knot or link is not unique. 
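
How a braid determines the number of components of its closure can be sketched in a few lines. The encoding below is an assumption made here for illustration: a braid on $n$ strands is a list of signed integers, $+i$ for the generator crossing strands $i$ and $i+1$ and $-i$ for its inverse. Only the permutation induced by the braid matters for counting components, since closing the braid identifies each bottom position with the top position above it.

```python
def closure_components(n_strands, word):
    """Number of components of the link obtained by closing a braid.

    `word` lists generators as signed integers: +i for sigma_i, -i for its
    inverse (1 <= i < n_strands). The count equals the number of cycles of
    the permutation the braid induces on the strand positions.
    """
    ends = list(range(n_strands))          # ends[p] = strand arriving at position p
    for g in word:
        i = abs(g)
        ends[i - 1], ends[i] = ends[i], ends[i - 1]
    seen, cycles = set(), 0
    for start in range(n_strands):
        if start in seen:
            continue
        cycles += 1
        p = start
        while p not in seen:
            seen.add(p)
            p = ends[p]
    return cycles

def exponent_sum(word):
    """Signed number of crossings of the braid diagram."""
    return sum(1 if g > 0 else -1 for g in word)

hopf_word = [1, 1]          # two strands, two crossings: closes to a Hopf link
trefoil_word = [1, 1, 1]    # two strands, three crossings: closes to a trefoil
```

Which handedness a positive generator corresponds to is a convention. The non-uniqueness mentioned in the text also shows up here: the two-strand word `[1]` and the three-strand word `[1, 2]` both close to a single unknotted component.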

There is another construction associated with braids which relates them to knots and links.
This is called platting. Consider a $2m$-braid, with pairwise adjacent strands carrying the same
colour and opposite orientations. Then connect the $(2i - 1)$th strand with the $(2i)$th from above as
well as from below. This yields the plat of the given braid as shown in (B) above. There is a
theorem due to Birman [20] which relates plats to links. This states that a coloured-oriented link
can be represented (though not uniquely) by the plat of an oriented-coloured $2m$-braid.

Use of these two inputs, namely relation of Chern-Simons theory to the boundary Wess-Zumino 
conformal field theory and presentation of knots and links as closures or plats of braids leads to 
an explicit, complete and non-perturbative solution of the Chern-Simons theory. Conformal field 
theory on associated boundary gives matrix representations for braids and platting or closing of 
a braid corresponds to taking a specific matrix element of these braid representations. This then 
yields the expectation value of the Wilson link operator associated with that link. For example 
this invariant for an unknot U carrying spin j representation turns out to be: 


$$V_j[U] = [2j + 1] , \qquad \text{where} \quad [x] = \frac{q^{x/2} - q^{-x/2}}{q^{1/2} - q^{-1/2}}$$

The square bracket indicates a $q$-number. The Jones polynomial above corresponds to spin $j = 1/2$.
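And for a right-handed trefoil the invariant turns out to be:

$$V_j[T_+] = \sum_{m = 0, 1, 2, \ldots, \min(2j,\, k - 2j)} (-)^{2j - m}\; [2m + 1]\; q^{-6 C_j + \frac{3}{2} C_m} , \qquad C_j = j(j + 1)$$

For $j = 1/2$, this agrees with the polynomial obtained above by using the skein relation.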
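
The $q$-number is easy to evaluate numerically at the root of unity $q = \exp[2\pi i/(k+2)]$. The sketch below assumes the standard definition $[x] = (q^{x/2} - q^{-x/2})/(q^{1/2} - q^{-1/2})$; the function names are chosen here for illustration.

```python
import cmath
import math

def q_number(x, k):
    """[x] = (q^{x/2} - q^{-x/2}) / (q^{1/2} - q^{-1/2}) with q = exp(2*pi*i/(k+2))."""
    theta = math.pi / (k + 2)            # q^{1/2} = exp(i * theta)
    num = cmath.exp(1j * theta * x) - cmath.exp(-1j * theta * x)
    den = cmath.exp(1j * theta) - cmath.exp(-1j * theta)
    return num / den

def unknot_invariant(j, k):
    """Spin-j invariant of the unknot in standard framing, [2j + 1]."""
    return q_number(2 * j + 1, k)

k = 5
q_half = cmath.exp(1j * math.pi / (k + 2))
spin_half_unknot = unknot_invariant(0.5, k)
```

At a root of unity the $q$-number reduces to the ratio $\sin(\pi x/(k+2))/\sin(\pi/(k+2))$, so $[k + 2]$ vanishes; this is the origin of the cut-off $\min(2j, k - 2j)$ in sums over representations.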

The link invariants calculated from the field theory depend on the regularization used to define 
the coincident loop correlators, that is, the framing of the knots. The invariants above have been 
obtained in a specific framing called standard framing. In particular, the skein relation for spin 
1/2 invariants given above is in this framing. In this framing, the self-linking (framing) number 
of every knot is zero. The invariants so obtained are unchanged under all the three Reidemeister 
moves. That is, this yields ambient isotopic invariants. There is another framing choice which 
has been of special interest. In this case, the frame is thought to be just vertically above the two 
dimensional projection of the knot. In this framing, known as vertical framing, Reidemeister moves 
II and III do leave the link invariants unchanged, but Reidemeister move I changes them. 

The general framework developed provides a powerful method of calculating knot and link 
invariants. This has in the process also provided answers to some of the open problems of knot 
theory. For example, one such problem is to find polynomial invariants which would discriminate 
between two chiralities of a given knot. The invariants for the mirror reflected knots are given by
simple complex conjugation. Up to ten crossing number, there are six chiral knots, $9_{42}$, $10_{48}$, $10_{71}$,
$10_{91}$, $10_{104}$ and $10_{125}$ (as listed in the knot tables in Rolfsen's book [21]) which are not distinguished
from their mirror images by spin 1/2 (Jones) polynomials. Spin one (Kauffman/Akutsu-Wadati)
polynomials do detect the chirality of four of them, namely $10_{48}$, $10_{91}$, $10_{104}$ and $10_{125}$. But for
$9_{42}$ and $10_{71}$ both Jones and Kauffman polynomials are not changed under the chirality transformation
($q \to q^{-1}$). However, the new spin 3/2 invariants are powerful enough to distinguish these knots
from their mirror images [22].

Another problem of knot theory that has been provided with an answer is to do with so called 
mutant knots. A mutant of a knot or link is obtained in the following way: isolate a portion of the 
knot in such a way that it has two strands going into and two strands leaving from it. Scoop it 
out and rotate it through $\pi$ about any of three orthogonal axes (rotations about only two of these
are really independent). Glue it back after, if necessary, changing the orientations on the strands
to match the free ends of strands of the rest of the knot to which the free ends of the rotated
portion are glued. This yields a mutant of the original knot. It has been possible to prove that
polynomial invariants obtained from a Chern-Simons theory based on any arbitrary non-Abelian
gauge group do not distinguish isotopically inequivalent mutant knots [23]. As an example consider
the following sixteen crossing mutant knots: 






A 16 crossing mutant pair 


The two knots are related by a mutation of the portion indicated by the dashed enclosure. Like all
other mutants, the invariants obtained from any non-Abelian Chern-Simons theory for them are
identical. What is of particular interest about this pair is that one of them is chiral while the other
is not. This then yields an example of a chiral knot whose chirality cannot be detected by any of these
invariants.

The general framework developed to study knots and links is also applicable to another set of 
gauge invariant operators called graphs. For SU{2) Chern-Simons theory, these are the graphs 
containing vertices with three legs. The edges of the graph between vertices carry Wilson line 
operators. More general gauge invariant operators which include links attached to the edges of 
graphs can also be evaluated in this framework. 


5 Three-manifold invariants 

The invariants of knots and links in $S^3$ obtained from the Chern-Simons theory can be used to
construct a special three-manifold invariant [2, 9, 10, 5]. This provides an important tool to study
topological properties of three-manifolds. The starting step in this construction is a theorem due to
Lickorish and Wallace [24, 21]:

Fundamental theorem of Lickorish and Wallace: Every closed, orientable, connected
three-manifold $M^3$ can be obtained by surgery on an unoriented framed knot or link $[L, f]$ in $S^3$.

As described earlier, the framing $f$ of a link $L$ is defined by associating with every component
knot $K_s$ of the link an accompanying closed curve $K_{sf}$ parallel to the knot and winding $n(s)$ times
in the right-handed direction. That is, the linking number $lk(K_s, K_{sf})$ of the component knot
and its frame (self-linking number of the knot $K_s$) is $n(s)$. For the construction of three-manifold
invariants, we use vertical framing where the frame is thought to be just vertically above the
two dimensional projection of the knot as shown below. This is sometimes indicated by putting
$n(s)$ writhes in the strand making the knot or even by just simply writing the integer $n(s)$ next to
the knot as shown below:



VERTICAL FRAMING 


Next the surgery on a framed link $[L, f]$ made of component knots $K_1, K_2, \ldots, K_r$ with framing
$f = (n(1), n(2), \ldots, n(r))$ in $S^3$ is performed in the following manner. Remove a small open solid
torus neighbourhood $N_s$ of each component knot $K_s$, disjoint from all other such open tubular
neighbourhoods associated with other component knots. In the manifold left behind,
$S^3 - (N_1 \cup N_2 \cup \ldots \cup N_r)$, there are $r$ toral boundaries. On each such boundary, consider a
simple closed curve (the frame) going $n(s)$ times along the meridian and once along the longitude of
the associated knot $K_s$. Now do a modular transformation on such a toral boundary such that the
framing curve bounds a disc. Glue back the solid tori into the gaps. This yields a new manifold $M^3$.
The theorem of Lickorish and Wallace assures us that every closed, orientable, connected
three-manifold can be constructed in this way.

This construction of three-manifolds by surgery is not unique: surgery on more than one framed
link can yield homeomorphic manifolds. But the rules of equivalence of framed links in $S^3$ which
yield the same three-manifold on surgery are known. These rules are known as Kirby moves [25].

Kirby calculus on framed links in $S^3$: The following two elementary moves (and their inverses)
generate Kirby calculus:

Move I. For a number of unlinked strands belonging to the component knots $K_s$ with framing
$n(s)$ going through an unknotted circle $C$ with framing +1, the unknotted circle can be removed
after making a complete clockwise twist from below in the disc enclosed by the circle $C$:

In the process, in addition to introducing new crossings, the framing of the various resultant
component knots $K'_s$, to which the affected strands belong, changes from $n(s)$ to
$n'(s) = n(s) - (lk(K_s, C))^2$.

Move II. Drop a disjoint unknotted circle with framing -1 without any change in the rest of
the link:

Thus the Lickorish-Wallace theorem and the equivalence of surgery under Kirby moves reduce the
theory of closed, orientable, connected three-manifolds to the theory of framed unoriented links
via a one-to-one correspondence:

{ Framed links in $S^3$ modulo equivalence under Kirby moves }  $\longleftrightarrow$  { Closed, orientable, connected three-manifolds modulo homeomorphisms }

This consequently allows us to characterize three-manifolds by the invariants of the associated
unoriented framed knots and links obtained from the Chern-Simons theory in $S^3$. This can be
done by constructing an appropriate combination of the invariants of the framed links which is
unchanged under Kirby moves:

{ Invariants of a framed unoriented link which do not change under Kirby moves }  =  { Invariants of the associated three-manifold }

One such invariant has been constructed in ref [5]. It is given in terms of invariants for
unoriented links obtained from SU(2) Chern-Simons theory. The link invariants discussed in Sec.4
above are obtained in standard framing. These are sensitive to the relative orientations of the
component knots. Here we shall use invariants for unoriented links in vertical framing. But, unlike
the invariants in standard framing which exhibit ambient isotopic invariance, those obtained in
vertical framing have only regular isotopic invariance. That is, in standard framing, a writhe can
be stretched (a Reidemeister move I) without affecting the link invariant; in vertical framing this
is not so. The link invariant gets changed by a phase when a writhe is smoothed out as:

$$V[\text{right-handed writhe}] = q^{C_j}\; V[\text{smooth strand}] , \qquad V[\text{left-handed writhe}] = q^{-C_j}\; V[\text{smooth strand}] , \qquad C_j = j(j + 1) ,$$

where we have represented the link invariant by the affected portion of the link. Thus, in vertical
framing, the invariant for an unknot with self-linking (framing) number +1 or -1 is related to the
invariant for an unknot with zero self-linking number as:

$$V_j[U;\, +1] = q^{C_j}\, [2j + 1] \qquad \text{and} \qquad V_j[U;\, -1] = q^{-C_j}\, [2j + 1] .$$

In this framing, each right-(left-) handed crossing in a knot introduces a self-linking number +1
(-1). For a right-handed trefoil (self-linking number = 3), the invariant in this framing turns out
to be:

$$V_j[T_+] = \sum_{m = 0, 1, \ldots, \min(2j,\, k - 2j)} (-)^{2j - m}\; [2m + 1]\; q^{-3 C_j + \frac{3}{2} C_m}$$

The three-manifold invariant is constructed from these link invariants in vertical framing. It has
been shown that [5]: For a framed link $[L, f]$ with component knots $K_1, K_2, \ldots, K_r$ and their
framings respectively as $n(1), n(2), \ldots, n(r)$, the quantity

$$F[L, f] = \alpha^{-\sigma[L, f]} \sum_{\{j_i\}} \mu_{j_1} \mu_{j_2} \cdots \mu_{j_r}\; V[L;\; n(1), n(2), \ldots n(r);\; j_1, j_2, \ldots j_r] \qquad (15)$$

constructed from invariants $V$ of the unoriented framed link in vertical framing, is an invariant of
the associated three-manifold obtained by surgery on that link. Here the coefficients $\mu_j$ are given by

$$\mu_j = S_{0j} , \qquad \text{where} \quad S_{j\ell} = \sqrt{\frac{2}{k + 2}}\; \sin \frac{\pi (2j + 1)(2\ell + 1)}{k + 2}$$

and $\alpha = \exp[3\pi i k / (4(k + 2))]$, and $\sigma[L, f]$ is the signature of the linking matrix $W[L, f]$:
$\sigma[L, f]$ = (no. of +ve eigenvalues of $W$) - (no. of -ve eigenvalues of $W$). The off diagonal elements
of the linking matrix $(W[L, f])_{ij}$ are given by the linking number $lk(K_i, K_j)$ for the distinct knots
($i \neq j$) and the diagonal elements ($i = j$) are the self-linking number (frame number) of the knot $K_i$:
$(W[L, f])_{ii} = S\mathcal{L}(K_i) = n(i)$.

It can be directly verified that this three-manifold invariant (15) is unchanged under Kirby
moves I and II.


Explicit examples: Now computation of this invariant for various three-manifolds is rather
straightforward. We present its value for a few three-manifolds. The surgery descriptions of
manifolds $S^3$, $S^2 \times S^1$ and $RP^3$ are given by an unknot with framing +1, 0 and +2 respectively.
As indicated above the invariant for an unknot with zero framing carrying spin $j$ representation is
$[2j + 1] = S_{0j}/S_{00}$, where the square bracket represents the $q$-number. Thus the invariant for $S^3$
is:

$$F[S^3] = F[U;\, +1] = \alpha^{-1} \sum_{j = 0, 1/2, 1, \ldots, k/2} \mu_j\; q^{C_j}\; \frac{S_{j0}}{S_{00}}$$

where $\mu_j = S_{0j}$ and the factor $q^{C_j}$ is the effect from the framing +1 (one right-handed writhe).
We make use of an identity: $\sum_{\ell} S_{j\ell}\; q^{C_\ell}\; S_{\ell m} = \alpha\; q^{-C_j - C_m}\; S_{jm}$, which is closely related to the
modular transformations of a torus. Thus this invariant for $S^3$ is simply:

$$F[S^3] = 1$$





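For the three-manifold $S^2 \times S^1$ (with surgery representation as an unknot with zero framing), the
invariant is:

$$F[S^2 \times S^1] = F[U;\, 0] = \sum_j \mu_j\, \frac{S_{j0}}{S_{00}} = \sum_j \frac{S_{0j}\, S_{j0}}{S_{00}} = \frac{1}{S_{00}}$$

where the orthogonality property of the S matrix, $\sum_{\ell} S_{j\ell}\, S_{\ell m} = \delta_{jm}$, has been used.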
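
These unknot examples can be evaluated numerically for small values of $k$. The sketch below is illustrative and rests on assumptions stated here: the modular matrix $S_{j\ell} = \sqrt{2/(k+2)}\, \sin[\pi(2j+1)(2\ell+1)/(k+2)]$, the framing phase $q^{n C_j}$ with $C_j = j(j+1)$ for an unknot with framing $n$, and $\alpha = \exp[3\pi i k/(4(k+2))]$. The function names are hypothetical.

```python
import cmath
import math

def spins(k):
    """Allowed representations j = 0, 1/2, 1, ..., k/2."""
    return [n / 2.0 for n in range(k + 1)]

def s_matrix(k, j, l):
    return math.sqrt(2.0 / (k + 2)) * math.sin(math.pi * (2 * j + 1) * (2 * l + 1) / (k + 2))

def q_power(k, x):
    """q^x with q = exp(2*pi*i/(k+2))."""
    return cmath.exp(1j * 2.0 * math.pi * x / (k + 2))

def casimir(j):
    return j * (j + 1)

def alpha(k):
    return cmath.exp(1j * 3.0 * math.pi * k / (4.0 * (k + 2)))

def f_unknot(k, framing):
    """F for surgery on an unknot with the given integer framing.

    The linking matrix is the 1x1 matrix (framing), so its signature is
    simply the sign of the framing.
    """
    sign = (framing > 0) - (framing < 0)
    total = sum(s_matrix(k, 0, j) * q_power(k, framing * casimir(j)) * s_matrix(k, j, 0)
                for j in spins(k)) / s_matrix(k, 0, 0)
    return alpha(k) ** (-sign) * total

def orthogonality_defect(k):
    """Largest deviation of sum_l S_jl S_lm from the Kronecker delta."""
    js = spins(k)
    return max(abs(sum(s_matrix(k, j, l) * s_matrix(k, l, m) for l in js) - (1.0 if j == m else 0.0))
               for j in js for m in js)

k = 3
f_s3 = f_unknot(k, 1)          # three-sphere
f_s2_s1 = f_unknot(k, 0)       # S^2 x S^1
f_rp3 = f_unknot(k, 2)         # real projective space
```

With these conventions `f_s3` comes out equal to 1 and `f_s2_s1` equal to $1/S_{00}$ within rounding, and an unknot with framing -1, which is also a surgery description of the three-sphere, gives 1 as well.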

Next for the three-dimensional real projective space $RP^3$ (this is an $S^3$ with antipodal points
identified), the invariant is:

$$F[RP^3] = F[U;\, +2] = \alpha^{-1} \sum_{j = 0, 1/2, 1, \ldots, k/2} S_{0j}\; q^{2 C_j}\; \frac{S_{j0}}{S_{00}}$$

A slightly more complex example we take up is the Poincare manifold $P^3$ (also known as
dodecahedral space or Dehn's homology sphere). It is a homology three-sphere given by the set
of points $(u, v, w)$ in complex 3-space such that $u^2 + v^3 + w^5 = 0$ and $|u|^2 + |v|^2 + |w|^2 = 1$. Its
surgery presentation is given [21] by a right-handed trefoil knot with framing +1.

Notice, each right-handed crossing of the trefoil introduces +1 linking number between the knot
and its vertical framing, and each of the two left-handed writhes contributes -1 so that the total
frame number of this knot is +1. Now using the knot invariant for the trefoil in vertical framing given
above, the invariant for this three-manifold can easily be written down:

$$F[P^3] = \alpha^{-1} \sum_{j = 0, 1/2, \ldots, k/2}\;\; \sum_{m = 0, 1, \ldots, \min(2j,\, k - 2j)} \mu_j\; (-)^{2j - m}\; [2m + 1]\; q^{-3 C_j + \frac{3}{2} C_m}\; q^{-2 C_j}$$

The two left-handed writhes introduce a factor of $q^{-2 C_j}$.

The invariant F for a manifold constructed above is the same, up to a normalization, as the
partition function of an SU(2) Chern-Simons theory on that manifold [5, 11]:

$$Z[M^3] = F[M^3]\, S_{00}. \qquad (16)$$

Generally, it is rather difficult to obtain the Chern-Simons partition function for a given
three-manifold directly. But the formulae above make its computation through F rather easy.
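
The two simplest examples above can be checked numerically. The following is a minimal sketch, assuming the standard SU(2) level-k modular data, which this section uses but does not restate: $S_{jl} = \sqrt{2/(k+2)}\,\sin[\pi(2j+1)(2l+1)/(k+2)]$, $q = \exp[2\pi i/(k+2)]$, $C_j = j(j+1)$ and $\alpha = \exp(i\pi c/4)$ with $c = 3k/(k+2)$. All function names are illustrative.

```python
import numpy as np


def su2_modular_data(k):
    """Assumed standard SU(2) level-k data; the index n = 2j + 1 runs over 1..k+1."""
    n = np.arange(1, k + 2)
    s_matrix = np.sqrt(2.0 / (k + 2)) * np.sin(np.pi * np.outer(n, n) / (k + 2))
    spin = (n - 1) / 2.0
    casimir = spin * (spin + 1.0)                            # C_j = j(j+1)
    # q^{C_j} with q = exp(2*pi*i/(k+2)); "2j" below is Python's complex literal 2i.
    q_to_casimir = np.exp(2j * np.pi * casimir / (k + 2))
    alpha = np.exp(1j * np.pi * 3.0 * k / (4.0 * (k + 2)))   # exp(i*pi*c/4), c = 3k/(k+2)
    return s_matrix, q_to_casimir, alpha


def invariant_s3(k):
    """F[S^3]: unknot with framing +1, i.e. one factor q^{C_j} and one alpha^{-1}."""
    s, qc, alpha = su2_modular_data(k)
    return np.sum(s[0, :] * qc * s[:, 0]) / (alpha * s[0, 0])


def invariant_s2_x_s1(k):
    """F[S^2 x S^1]: unknot with zero framing, so no framing phase and no alpha."""
    s, _, _ = su2_modular_data(k)
    return np.sum(s[0, :] * s[:, 0]) / s[0, 0]


def partition_function(f_value, k):
    """Z[M] = F[M] * S_00, the normalization quoted in Eq. (16)."""
    s, _, _ = su2_modular_data(k)
    return f_value * s[0, 0]


if __name__ == "__main__":
    for level in (1, 2, 5):
        f_s3 = invariant_s3(level)
        z_s2s1 = partition_function(invariant_s2_x_s1(level), level)
        print(level, f_s3, z_s2s1)
```

Under these assumptions both printed quantities should equal 1 up to rounding, which is the content of $F[S^3] = 1$ and $F[S^2 \times S^1] = 1/S_{00}$.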

The three-manifold invariant presented here is given in terms of link invariants from SU(2)
Chern-Simons theory. It is clear that a similar construction can be done with link invariants from
Chern-Simons gauge theories based on other semi-simple groups. This would yield a new method
of obtaining the partition function of such Chern-Simons theories.

The next question we may ask is: Is this three-manifold invariant complete? Two manifolds M
and M' for which the invariants F[M] and F[M'] are different cannot be homeomorphic to each
other. But the converse is not always true; for two arbitrary manifolds, the invariants need not
always be different. Recall that the invariants obtained from Chern-Simons theory for mutant knots are
not distinct. Hence, manifolds obtained by surgery on topologically inequivalent mutant knots cannot
be distinguished by this three-manifold invariant.


6 Perturbative non-Abelian Chern-Simons theory

Though Chern-Simons theories have been solved exactly and non-perturbatively as discussed above,
the perturbative analysis of these theories is also rich in mathematical structure. If we expand
the expectation value of the Wilson loop operator associated with a knot as a perturbative power
series in the coupling constant, the coefficients of such an expansion have a deep mathematical
meaning. These on their own are topological invariants characteristic of the knot.

The last decade has also witnessed enormous research activity in direct perturbative calculations in
Chern-Simons gauge field theory [26]. By simple power counting this theory is superrenormalizable.
There are divergences, which need to be regularized. The effective coupling constant k does in
general depend on the regularization. In a class of regularizations, a shift in the coupling constant
takes place: k → k + 2 for SU(2) theory. This shift is consistent with the effective coupling in the
non-perturbative studies of the theory.

It is very easy to see that the first order contribution to the vacuum expectation value of the
Wilson loop operator for a knot is the self-linking number of the knot up to some group theoretic
factors. This is so because at this order, the theory reduces to essentially Abelian Chern-Simons
theories. Topological regularization of the coincident loop integrals through framing, as discussed in
Sec. 3 earlier, leads to this result. Higher order contributions to the expectation value of a Wilson
loop operator in an SU(2) Chern-Simons theory yield the famous Vassiliev invariants. These
were first introduced by V.A. Vassiliev in 1990 from a totally different mathematical framework
involving a study of the space of all smooth maps of $S^1$ into $S^3$. These maps have different types of
singularities. According to the type of singularities, this space of the maps divides into classes, each
of which corresponds to a knot type. These classes are characterized by the families of invariants
characterizing the knot [27].
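
The linking number that appears at lowest order can be illustrated numerically. The sketch below evaluates the classical Gauss double integral $\frac{1}{4\pi}\oint\oint \frac{(r_a - r_b)\cdot(dr_a \times dr_b)}{|r_a - r_b|^3}$ for two separate closed curves; the self-linking number of a framed knot is then the linking number of the knot with its frame. The Gauss formula is standard but is not quoted in this section, and the example curves and function names are illustrative assumptions. The overall sign depends on the orientations chosen.

```python
import numpy as np


def gauss_linking_number(curve_a, curve_b, samples=200):
    """Midpoint-rule estimate of the Gauss linking integral of two closed curves.

    Each curve maps an array of parameters t in [0, 2*pi) to (points, tangents),
    both arrays of shape (len(t), 3).
    """
    t = 2.0 * np.pi * (np.arange(samples) + 0.5) / samples
    dt = 2.0 * np.pi / samples
    ra, ta = curve_a(t)
    rb, tb = curve_b(t)
    separation = ra[:, None, :] - rb[None, :, :]         # r_a - r_b for every pair of points
    cross = np.cross(ta[:, None, :], tb[None, :, :])     # r_a'(s) x r_b'(t)
    numerator = np.einsum("abi,abi->ab", separation, cross)
    distance_cubed = np.linalg.norm(separation, axis=2) ** 3
    return np.sum(numerator / distance_cubed) * dt * dt / (4.0 * np.pi)


def circle_in_xy_plane(t):
    """Unit circle centred at the origin in the xy-plane."""
    zeros = np.zeros_like(t)
    points = np.stack([np.cos(t), np.sin(t), zeros], axis=1)
    tangents = np.stack([-np.sin(t), np.cos(t), zeros], axis=1)
    return points, tangents


def shifted_circle_in_xz_plane(t, shift=1.0):
    """Unit circle in the xz-plane centred at (shift, 0, 0)."""
    zeros = np.zeros_like(t)
    points = np.stack([shift + np.cos(t), zeros, np.sin(t)], axis=1)
    tangents = np.stack([-np.sin(t), zeros, np.cos(t)], axis=1)
    return points, tangents


if __name__ == "__main__":
    print(gauss_linking_number(circle_in_xy_plane, shifted_circle_in_xz_plane))
    print(gauss_linking_number(circle_in_xy_plane,
                               lambda t: shifted_circle_in_xz_plane(t, 5.0)))
```

With `shift=1.0` the two circles form a Hopf link, so the estimate should have magnitude close to 1; with `shift=5.0` the circles are unlinked and the estimate should be close to 0.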

Perturbative studies of Chern-Simons theory have provided new insights into the theory of 
Vassiliev invariants. In a gauge theory, perturbative calculations are to be performed in a definite 
gauge. Calculations in the Landau gauge [28] lead to covariant integral representations of Vassiliev 
invariants, also known as configuration space integrals first developed by Bott and Taubes in 1994 
[29]. Another integral representation of the Vassiliev invariants was introduced by M. Kontsevich 
in 1993 [30]. This corresponds to perturbative calculation of the Chern-Simons theory in light-cone 
gauge [31]. It is rather difficult to see directly that these two integrals represent the same invariant.
However, from a field theoretic point of view, this is simply a consequence of gauge invariance. 
Calculations in the temporal gauge have yielded yet another formulation of these invariants, leading 
to combinatorial formulae for them [32]. 

7 Gravity and Chern-Simons theory 

While Chern-Simons theories have provided a powerful framework for the theory of knots, these field
theories are also of direct relevance in physics. For example, there is an intimate relationship
between these field theories and three dimensional gravity, which is also a topological field theory.
In fact two copies of SU(2) Chern-Simons theory represent gravity in Euclidean three-space with
a negative cosmological constant [33]. To see this, just consider the partition function of two SU(2)
Chern-Simons theories recast in terms of an SL(2, C) Chern-Simons theory as:

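$$Z = \int [dA, d\bar{A}] \exp\left\{ \frac{ik}{4\pi} \int d^3x\, \epsilon^{\mu\nu\alpha} \left[ \mathrm{tr}\left( A_\mu \partial_\nu A_\alpha + \tfrac{2}{3} A_\mu A_\nu A_\alpha \right) - \mathrm{tr}\left( \bar{A}_\mu \partial_\nu \bar{A}_\alpha + \tfrac{2}{3} \bar{A}_\mu \bar{A}_\nu \bar{A}_\alpha \right) \right] \right\}$$

where $A$ is the SL(2, C) gauge field and $\bar{A}$ its conjugate. This partition function is the modulus squared of
the SU(2) partition function: $Z_{SL(2,C)} = |Z_{SU(2)}|^2$. Make a change of variables $A = \omega + ie/\ell$
and $\bar{A} = \omega - ie/\ell$, where $\omega$ and $e$ are the gravitational spin connection and triad respectively.
Writing $kS[A] = \frac{k}{4\pi} \int d^3x\, \epsilon^{\mu\nu\alpha}\, \mathrm{tr}[A_\mu \partial_\nu A_\alpha + \tfrac{2}{3} A_\mu A_\nu A_\alpha]$, this then relates the action of these two
Chern-Simons theories to the Einstein-Hilbert action for three dimensional gravity: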

$$kS[A] - kS[\bar{A}] = \ldots \int d^3x\, \sqrt{g}\, (R - 2\Lambda) \qquad (17)$$

where the cosmological constant $\Lambda = -1/\ell^2$ is negative and the Chern-Simons coupling is related to
the gravitational coupling as $k = \ell/(4G)$.
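
The change of variables can be carried out explicitly in differential-form notation. The following is a sketch under stated assumptions: $\mathrm{CS}(A) = \mathrm{tr}(A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A)$, $\omega$ and $e$ are treated as real Lie-algebra-valued one-forms so that $\bar{A}$ is obtained from $A$ by $i \to -i$, and only the cyclicity of the trace is used. The overall normalization and sign relative to the Einstein-Hilbert action depend on the trace convention, which is not fixed here.

```latex
\begin{aligned}
A &= \omega + \tfrac{i}{\ell}\, e, \qquad \bar{A} = \omega - \tfrac{i}{\ell}\, e, \qquad
R[\omega] = d\omega + \omega \wedge \omega, \\[4pt]
\mathrm{tr}(A \wedge dA) &= \mathrm{tr}(\omega \wedge d\omega)
  + \tfrac{i}{\ell}\, \mathrm{tr}(\omega \wedge de + e \wedge d\omega)
  - \tfrac{1}{\ell^{2}}\, \mathrm{tr}(e \wedge de), \\[4pt]
\mathrm{tr}(A \wedge A \wedge A) &= \mathrm{tr}(\omega^{3})
  + \tfrac{3i}{\ell}\, \mathrm{tr}(e \wedge \omega \wedge \omega)
  - \tfrac{3}{\ell^{2}}\, \mathrm{tr}(\omega \wedge e \wedge e)
  - \tfrac{i}{\ell^{3}}\, \mathrm{tr}(e \wedge e \wedge e), \\[4pt]
\mathrm{CS}(A) - \mathrm{CS}(\bar{A}) &=
  \frac{4i}{\ell}\, \mathrm{tr}\big(e \wedge R[\omega]\big)
  - \frac{4i}{3\ell^{3}}\, \mathrm{tr}(e \wedge e \wedge e)
  - \frac{2i}{\ell}\, d\, \mathrm{tr}(\omega \wedge e).
\end{aligned}
```

The first term has the form of the first-order (triad) gravitational action, the second is a volume term whose relative weight is set by $1/\ell^2$, and the last is a total derivative that drops out on a manifold without boundary.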



This is closely related to another development in gravity. Three-dimensional gravity has a
lattice formulation, first introduced by G. Ponzano and T. Regge in 1968 [34]. Here the
three-manifold is decomposed into simplices. Each three-simplex is a tetrahedron. To each edge
of the tetrahedron, a half-integral spin j, called its colour, is assigned so that its length is given
by $\sqrt{j(j+1)}$. The spins on the three edges of each triangular face satisfy the triangular angular
momentum inequality relations. The gravitational partition function is constructed in terms of a
Racah-Wigner six-j symbol for each tetrahedron in the simplicial decomposition of the manifold.
For large spins, the six-j symbols reproduce the ordinary gravitational action. The Ponzano-Regge
partition function suffers from a problem: it diverges as all possible spin values are allowed to live
on the edges. This, therefore, requires a regularization. A slightly more complex generalization
of this lattice gravity model, which also provides this regularization, is related to a model first
introduced by V.G. Turaev and O.Y. Viro [35]. It replaces the ordinary 6-j symbols by their
q-deformed analogues (with q a root of unity). For large spin values, the q-six-j symbol can
be shown to give the Regge action for a tetrahedron and represents the Euclidean gravity action with a
negative cosmological constant. The Turaev-Viro model would then be a quantum description of
this three dimensional gravity.
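
The colouring rules of the Ponzano-Regge model are simple to state in code. The sketch below represents a spin j by the integer 2j to avoid floating-point comparisons. The parity condition (j1 + j2 + j3 an integer) is the usual angular-momentum coupling rule and is an assumption here, since the text names only the triangle inequalities; the function names are illustrative.

```python
import math


def edge_length(two_j):
    """Ponzano-Regge edge length sqrt(j(j+1)) for the spin j = two_j / 2."""
    j = two_j / 2.0
    return math.sqrt(j * (j + 1.0))


def is_admissible_face(two_j1, two_j2, two_j3):
    """Check the triangular inequalities for the spins on the edges of one face.

    Arguments are the integers 2*j1, 2*j2, 2*j3. The extra parity test
    (j1 + j2 + j3 integer) is an assumed standard coupling rule.
    """
    a, b, c = two_j1, two_j2, two_j3
    satisfies_triangle = abs(a - b) <= c <= a + b
    has_integer_sum = (a + b + c) % 2 == 0
    return satisfies_triangle and has_integer_sum


if __name__ == "__main__":
    print(edge_length(1))                 # spin 1/2
    print(is_admissible_face(1, 1, 2))    # spins (1/2, 1/2, 1)
    print(is_admissible_face(1, 1, 4))    # spins (1/2, 1/2, 2)
```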

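For a triangulation of the three-manifold in terms of tetrahedra labeled by t and colouring $j_e$
of its edges labeled by e, the Turaev-Viro partition function for a manifold without boundary is given
by the formula:

$$Z_{TV} = \sum_{\text{colourings } j_e \le k/2} \ \prod_{\text{vertices}} \frac{1}{\Lambda} \ \times \prod_{\text{edges } e} (-1)^{2j_e} [2j_e + 1] \ \times \prod_{\text{tetrahedra } t} \left\{ \begin{matrix} j_1(t) & j_2(t) & j_3(t) \\ j_4(t) & j_5(t) & j_6(t) \end{matrix} \right\} \qquad (18)$$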


The square brackets indicate q-numbers, and the curly brackets represent the q-6j symbol.
The deformation parameter q is related to the Chern-Simons coupling by $q = \exp[2\pi i/(k+2)]$ and
$\Lambda = -2(k+2)/(q^{1/2} - q^{-1/2})^2 = (S_{00})^{-2}$. This partition function is naturally regularized and
finite due to the restriction on the spins living on the edges ($j_e \le k/2$) introduced by the fact that
the deformation parameter is a root of unity. Further, this partition function can be shown to be
exactly the square of an SU(2) Chern-Simons partition function, $Z_{TV} = |Z_{SU(2)}|^2$. This provides yet
another representation for the Chern-Simons partition function.
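
The relation between the vertex weight Λ and $S_{00}$ is easy to confirm numerically. The sketch below assumes the standard value $S_{00} = \sqrt{2/(k+2)}\,\sin[\pi/(k+2)]$ and the usual q-number $[n] = (q^{n/2} - q^{-n/2})/(q^{1/2} - q^{-1/2})$, neither of which is restated in this section; the function names are illustrative.

```python
import cmath
import math


def q_number(n, k):
    """[n] for q = exp(2*pi*i/(k+2)); the ratio of differences reduces to a ratio of sines."""
    return math.sin(math.pi * n / (k + 2)) / math.sin(math.pi / (k + 2))


def vertex_weight_lambda(k):
    """Lambda = -2(k+2) / (q^{1/2} - q^{-1/2})^2, evaluated with complex arithmetic."""
    root_q = cmath.exp(1j * math.pi / (k + 2))        # q^{1/2}
    value = -2.0 * (k + 2) / (root_q - 1.0 / root_q) ** 2
    return value.real                                 # the imaginary part is rounding noise


def s00(k):
    """Assumed standard SU(2) level-k value of S_00."""
    return math.sqrt(2.0 / (k + 2)) * math.sin(math.pi / (k + 2))


if __name__ == "__main__":
    for level in (1, 2, 3, 10):
        print(level, vertex_weight_lambda(level), s00(level) ** -2)
        # [k+2] vanishes, which is why spins above k/2 drop out of the sum.
        print(level, q_number(level + 2, level))
```

The vanishing of $[k+2]$ is one way to see the cutoff $j_e \le k/2$ that makes the Turaev-Viro sum finite.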

Notice that the integration measure in the partition function of two Chern-Simons theories
above is $[dA, d\bar{A}]$, whereas for the gravity partition function, it is $[de, d\omega]$. Since $A = \omega + ie/\ell$ and
$\bar{A} = \omega - ie/\ell$, the relation between the two involves $1/\ell$ factors as the Jacobian. In fact, in a more
exact treatment, it becomes clear that the Jacobian for this change of variables introduces exactly
a factor of Λ for every vertex of the triangulation, so that the gravity partition function is just the
Turaev-Viro partition function without the 1/Λ factors:

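$$Z_{grav} = \sum_{\text{colourings } j_e \le k/2} \ \prod_{\text{edges } e} [2j_e + 1] \ \times \prod_{\text{tetrahedra } t} \exp\Big( -i\pi \sum_i j_i(t) \Big) \left\{ \begin{matrix} j_1(t) & j_2(t) & j_3(t) \\ j_4(t) & j_5(t) & j_6(t) \end{matrix} \right\} \qquad (19)$$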


For a manifold with boundary, this expression has additional factors of $\exp(i\pi j_b)\sqrt{[2j_b + 1]}$ for
every boundary edge with a spin $j_b$. This partition function then is a functional of the boundary
triangulation and spins of edges on the boundary.

There are many interesting questions which can be addressed in this framework for three-dimensional
gravity. Some of these are: how does a black hole look in this formulation? What is
its entropy? Analysis shows that a black hole (the Bañados-Teitelboim-Zanelli black hole) is given by
a solid torus. Its horizon is given by the longitudinal circle at the core of this solid torus. The
possible states associated with this black hole are the states associated with different triangulations
of the black hole manifold, with the restriction that the longitudes have the same circumference. It can
be shown that the correct semi-classical behaviour of the entropy is reproduced by states corresponding to
all possible triangulations of such a Euclidean black hole [36]. The dominant contribution comes
from the states at the horizon.

Chern-Simons theories have also played an important role in the non-perturbative formulation
of canonical quantum gravity in four dimensions [37]. In this approach, the physical states are
given by spin-networks with associated graphs in three-space, where edges are labeled by SU(2)
spins (colours) and vertices are given by intertwining operators. Quantum mechanical operators
corresponding to lengths, areas and volumes all have discrete spectra. It can be argued that the
boundary degrees of freedom of a black hole, say a Schwarzschild black hole, in four dimensional
gravity can be described by a Chern-Simons theory [38, 39]. The action embodying the appropriate
boundary conditions on the black hole horizon consists of, in addition to the Einstein-Hilbert
action (in suitable variables), an SU(2) Chern-Simons gauge theory living on a coordinate chart of
a constant finite cross-sectional area on the horizon. The Chern-Simons coupling k is proportional
to this constant cross-sectional area. As the fundamental quantum excitations are polymer-like,
the horizon area is generated by the punctures where these spin-polymers pierce it. A bulk polymer
state that gives the horizon its area in this manner has to be compatible with the surface states on
the horizon itself. These boundary states are described by a quantum SU(2) Chern-Simons theory
on the horizon. That is, the space of these boundary degrees of freedom is given by the space
of states of Chern-Simons theory on a three-manifold with an $S^2$ boundary with finitely many
punctures on which spins live. The entropy of the black hole emerges from these boundary states.
For large areas, where essentially the U(1) subgroup of SU(2) contributes, the entropy is calculated
by counting these states. Their number grows exponentially with horizon area, yielding the
semi-classical Bekenstein-Hawking expression for black hole entropy [39]. For finite areas, the full SU(2)
counting has to be done. This has been done by exploiting the relation between the boundary
states of the Chern-Simons theory and the space of conformal blocks of the associated
Wess-Zumino conformal field theory on the boundary 2-sphere, a relationship which played a crucial role
in obtaining the link invariants in Sec. 4. This yields an exact formula for the entropy of a non-rotating
black hole which for large areas reproduces the semi-classical formula, but for finite areas goes
beyond the Bekenstein-Hawking result [40].
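
The counting of boundary states on a punctured 2-sphere can be sketched with the Verlinde formula for the number of conformal blocks, $N = \sum_r S_{0r}^{\,2-p} \prod_{l=1}^{p} S_{j_l r}$, for p punctures carrying spins $j_1, \ldots, j_p$. This formula and the SU(2) level-k S matrix used below are standard results assumed here; the text only states that the relation to conformal blocks is exploited, so this is an illustration and not necessarily the exact form used in [40]. Function names are illustrative.

```python
import numpy as np


def su2_s_matrix(k):
    """Assumed standard SU(2) level-k S matrix; row and column index is 2j = 0..k."""
    n = np.arange(1, k + 2)
    return np.sqrt(2.0 / (k + 2)) * np.sin(np.pi * np.outer(n, n) / (k + 2))


def count_boundary_states(spins, k):
    """Number of conformal blocks on a sphere with punctures carrying the given spins.

    Implements N = sum_r S_{0r}^(2-p) * prod_l S_{j_l r} with p = len(spins);
    each spin is a half-integer with 0 <= j <= k/2.
    """
    s = su2_s_matrix(k)
    rows = [int(round(2 * j)) for j in spins]
    total = 0.0
    for r in range(k + 1):
        term = s[0, r] ** (2 - len(spins))    # S_{0r} > 0, so negative powers are safe
        for row in rows:
            term *= s[row, r]
        total += term
    return int(round(total))


if __name__ == "__main__":
    for punctures in (2, 4, 6, 8):
        print(punctures, count_boundary_states([0.5] * punctures, k=20))
```

The entropy in this picture is the logarithm of such a count, summed over the puncture configurations compatible with a given horizon area.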


8 Summary and Concluding remarks 

We have attempted here to indicate how quantum field theories, which have been successfully used 
to describe physics of fundamental interactions of Nature, can also be used to study geometry and 
topology of low dimensional manifolds. These developments not only provide new insights into old 
problems of topology of these manifolds but also have been responsible for profoundly interesting 
new mathematical results. These developments have made use of many of the recent developments
in quantum field theories. The interaction between quantum physics and mathematics has enriched 
both. 

Chern-Simons gauge field theory, a topological quantum field theory, provides a powerful framework
for the modern theory of knots and links in any three-manifold. This is one of the rare quantum
field theories which can be explicitly and non-perturbatively solved. While Abelian Chern-Simons
theory provides a simple description of linking and self-linking numbers of a link, non-Abelian
theories are even richer. For every representation of any non-Abelian gauge group, there is a new link
invariant. The Jones polynomial, associated with the spin 1/2 representation in an SU(2) Chern-Simons
theory, is the simplest example of such link invariants. Even more general invariants (coloured
invariants) are obtained if we place different representations on the component knots. The framework
is rich enough to discuss knots and links not only in simple manifolds like $R^3$ or $S^3$,
but in any arbitrary three-manifold. The Chern-Simons partition function is a particularly interesting
three-manifold invariant for which a simple and efficient computational method is now available.
Perturbative studies of Chern-Simons theory have given a new framework for describing Vassiliev
invariants.

In the process of developing this framework for knot theory, new representations of braids have also
been obtained. The close connection that braids have with the Yang-Baxter equation has provided
methods of obtaining a variety of new exactly solvable two-dimensional statistical mechanical
models in physics [5]. These models are the higher vertex generalizations of the six-vertex model of
Lieb and Wu and the 19-vertex model of Zamolodchikov and Fateev.

Chern-Simons field theories are also of direct interest in other areas of physics. One area 
where these have found profound application is quantum gravity. Three-dimensional gravity with 
a negative cosmological constant, itself a topological field theory, is essentially described by two 
SU(2) Chern-Simons theories. Micro-states of a black hole in the four dimensional spin-polymer
gravity can also be modeled by a Chern-Simons theory. This allows an exact computation of black 
hole entropy going beyond the semi-classical result. These calculations so far have been done for 
non-rotating black holes only. These need to be extended for charged and rotating black holes, 
which requires a certain amount of technical work. Further, while an exact formula for quantum
entropy of a non-rotating black hole has been derived, a similar exact formula for the expectation 
value of the area operator in the Chern-Simons approach is not known. Also, a satisfactory 
understanding of Hawking radiation in this picture is yet to be developed. 

String theory is another interesting framework in which black hole entropy has been analyzed in 
recent times. Though it provides a fundamental quantum description, unfortunately, calculations 
in this theory can be done for extremal or near extremal black holes only. These, despite their
mathematical interest, are not astrophysically realistic. In particular, black holes of interest such
as a Schwarzschild black hole are not generally amenable to analysis in this approach. Also super- 
symmetry plays an important role in the string picture. In contrast, modeling of micro-states of a 
black hole by an effective Chern-Simons theory is not limited by the constraint of extremality or 
near extremality. This framework handles the curved geometry of the black hole directly without 
invoking supersymmetry. 

There are other topological quantum field theories also. One particularly interesting class
is the so-called cohomological field theories. These are the field theoretical interpretations of
four-manifold invariants obtained by S. Donaldson in 1983. His work is an example of developments
in mathematics which have made critical use of some of the notions of physics [41]. His theory
provides an understanding of the geometry in four dimensions through self-dual and anti-self-dual
Yang-Mills gauge fields known to physicists as ‘instantons and anti-instantons’. Five years later,
E. Witten provided a quantum field theoretical framework for Donaldson’s work in terms of a
four dimensional topological Yang-Mills gauge field theory [42]. This field theory has a certain kind
of twisted supersymmetry. Donaldson invariants are given as the correlation functions in this
field theory. In recent years, this area has received a further boost through the work of
Seiberg and Witten [43]. These developments use the powerful electric-magnetic duality to relate
the cohomological field theory based on gauge group SU(2) to that based on U(1). This brings
completely new insights into this area and makes calculation of Donaldson four-manifold invariants
rather easy.


References 

[1] M. Atiyah: The Geometry and Physics of Knots, Cambridge Univ. Press (1989).

[2] E. Witten: Commun. Math. Phys. 121 (1989) 351-399. 

[3] A.S. Schwarz: New topological invariants in the theory of quantized fields, Baku International 
Conference (1987). 

[4] R.K. Kaul: Complete solution of SU(2) Chern-Simons theory, hep-th/9212129;
and Commun. Math. Phys. 162 (1994) 289 (hep-th/930532).

[5] R.K. Kaul, Chern-Simons theory, knot invariants, vertex models and three-manifold invariants, 
hep-th/9804122, in Frontiers of Field Theory, Quantum Gravity and Strings (Volume 227 in 
Horizons in World Physics), eds. R.K. Kaul et al, NOVA Science Publishers, New York (1999). 




[6] V.F.R. Jones: Bull. AMS (1985) 103-112; Ann. Math. 126 (1987) 335-388. 

[7] M. Wadati, T. Deguchi and Y. Akutsu: Phys. Rep. 180 (1989) 247 and references therein; 

M. Jimbo: Commun. Math. Phys. 102 (1986) 537;

V. G. Turaev: Inv. Math. 92 (1988) 527. 

[8] A.N. Kirillov and N.Yu. Reshetikhin: Representation algebra $U_q(SL(2))$, q-orthogonal polynomials
and invariants of links, in New Developments in the Theory of Knots, ed. T. Kohno, World
Scientific, Singapore (1989).

L. Alvarez Gaume, G. Gomez and G. Sierra: Phys. Letts. B220 (1989) 142-152. 

C. Kassel, M. Rosso and V. Turaev: Quantum groups and knot invariants, Panoramas et Synthèses 5,
Société Mathématique de France (1997).

[9] N.Y. Reshetikhin and V. Turaev: Invent. Math. 103 (1991) 547.

R. Kirby and P. Melvin: Invent. Math. 105 (1991) 473. 

[10] W.B.R. Lickorish: Math. Ann. 290 (1991) 657; and Pac. J. Math. 149 (1991) 337. 

[11] P. Ramadevi and Swatee Naik, Computation of Lickorish’s three-manifold invariants using 
Chern-Simons theory, hep-th/9901061. 

[12] C.N. Little: Trans. R. Soc. Edinburgh 39 (1900) 771;

K.A. Perko: Proc. Am. Math. Soc. 45 (1974) 262. 

[13] J.W. Alexander: Trans. Am. Math. Soc. 30 (1928) 275. 

[14] P. Freyd, D. Yetter, J. Hoste, W.B.R. Lickorish, K. Millet and A. Ocneanu: Bull. AMS. 12 
(1985) 239; 

J.H. Przytycki and K.P. Traczyk: Kobe J. Math. 4 (1987) 115. 

[15] William Thomas Kelvin, First Baron, Mathematical and Physical Papers, vol. IV, Hydrody- 
namics and General Dynamics, Cambridge University Press, Cambridge (1910). 

[16] R.K. Kaul and R. Rajaraman: Phys. Letts. B249 (1990) 433-437. 

[17] G. Calugareanu: Rev. Math. Pures Appl. 4 (1959) 5 and Czech. Math. Jour. 11 (1961) 588;

W. F. Pohl: J. Math. Mech. 17 (1968) 975. 

[18] F.H. C. Crick: Proc. Natl. Acad. Sciences, USA, 73 (1971) 2639; 

F. Kamenetskii and A.V. Vologodskii: Sov. Phys. Usp. 24 (8) (1981) 679. 

[19] J.W. Alexander: Proc. Natl. Acad. 9 (1923) 93. 

[20] J.S. Birman: Braids, Links and Mapping Class Groups, Annals of Mathematics Studies,
Princeton Univ. Press (1975).

[21] D. Rolfsen: Knots and links. Publish or Perish, Berkeley (1976). 

[22] P. Ramadevi, T.R. Govindarajan and R.K. Kaul: Mod. Phys. Letts. A9 (1994) 3205. 

[23] P. Ramadevi, T.R. Govindarajan and R.K. Kaul: Mod. Phys. Letts. A10 (1995) 1635.

[24] A.D. Wallace: Can. J. Math. 12 (1960) 503; B.R. Lickorish: Ann. of Math. 76 (1962) 531. 

[25] R. Kirby: Invent. Math. 45 (1978) 35; 

R. Fenn and C. Rourke: 18 (1979) 1.

[26] For a recent review see J.M.F. Labastida: Chern-Simons gauge theory: ten years after,
hep-th/9905057.

[27] V.A. Vassiliev: Cohomology of knot spaces, in Theory of singularities and its applications,
Advances in Soviet Mathematics, vol. 1, American Math. Soc., Providence, RI (1990) 23-69.




[28] E. Guadagnini, M. Martellini and M. Mintchev: Phys. Letts. B227 (1989) 111 and Nucl. Phys.
B330 (1990) 575;

M. Alvarez and J.M.F. Labastida: Nucl. Phys. B395 (1993) 198 and Nucl. Phys. B433 (1995)
555;

D. Altschuler and L. Freidel: Commun. Math. Phys. 187 (1997) 261 and 170 (1995) 41.

[29] R. Bott and C. Taubes: Jour. Math. Phys. 35 (1994) 5247. 

[30] M. Kontsevich: Advances in Soviet Math. 16, Part 2 (1993) 137.

[31] A.S. Cattaneo, P. Cotta-Ramusino, J. Fröhlich and M. Martellini: J. Math. Phys. 36 (1995)
6137; 

J.M.F. Labastida and E. Perez: J. Math. Phys. 39 (1998) 5183. 

[32] J.M.F. Labastida and E. Perez: Combinatorial formulae for Vassiliev invariants from Chern- 
Simons gauge theory, hep-th/9807155 and Vassiliev invariants in the context of Chern-Simons 
gauge theory, hep-th/9812105. 

[33] E. Witten: Nucl. Phys. B311 (1989) 46. 

[34] G. Ponzano and T. Regge, in Spectroscopic and group theoretical methods in physics, ed. F.
Bloch, North-Holland, Amsterdam (1968).

[35] V.G. Turaev and O.Y. Viro: Topology 31 (1992) 865; 

H. Ooguri: Nucl. Phys. B382 (1992) 865. 

[36] V. Suneeta, R.K. Kaul and T.R. Govindarajan: gr-qc/9811071, Mod. Phys. Letts. A14 (1999) 
349. 

[37] A. Ashtekar: Phys. Rev. Lett. 57 (1986) 2244;

C. Rovelli and L. Smolin: Nucl. Phys. B331 (1990) 80; 

A. Ashtekar and J. Lewandowski: in Knots and quantum gravity, ed. J. Baez, Oxford Univ. 
Press, 1994; 

J. Baez: Lett. Math. Phys. 31 (1994) 213; 

C. Rovelli and L. Smolin: Phys. Rev. D52 (1995) 5743. 

[38] L. Smolin: J. Math. Phys. 36 (1995) 6417; 

A. Balachandran, L. Chandar and A. Momen: Nucl. Phys. B461 (1996) 581; 

C. Rovelli: Phys. Rev. Lett. 77 (1996) 3288. 

[39] A. Ashtekar, J. Baez, A. Corichi, K. Krasnov: Phys. Rev. Lett. 80 (1998) 904. 

[40] R.K. Kaul and P. Majumdar: Phys. Letts. B439 (1998) 267; 

P. Majumdar: Indian Jour. of Phys. 73B (1999) 147.

[41] S.K. Donaldson: J. Diff. Geom. 18 (1983) 269 and Polynomial invariants for smooth four- 
manifolds, Topology 29 (1990) 257. 

[42] E. Witten: Commun. Math. Phys. 117 (1988) 353. 

[43] N. Seiberg and E. Witten: Nucl. Phys. B426 (1994) 19. 



This article has been reproduced from
Commun. Math. Phys. 121, 351-399 (1989)
with kind permission from Springer-Verlag


Quantum Field Theory and the Jones Polynomial * 

Edward Witten ** 

School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, 

NJ 08540, USA 


Abstract. It is shown that 2 + 1 dimensional quantum Yang-Mills theory, with
an action consisting purely of the Chern-Simons term, is exactly soluble and
gives a natural framework for understanding the Jones polynomial of knot
theory in three dimensional terms. In this version, the Jones polynomial can be
generalized from $S^3$ to arbitrary three manifolds, giving invariants of three
manifolds that are computable from a surgery presentation. These results shed
a surprising new light on conformal field theory in 1 + 1 dimensions.

In a lecture at the Hermann Weyl Symposium last year [1], Michael Atiyah 
proposed two problems for quantum field theorists. The first problem was to give 
a physical interpretation to Donaldson theory. The second problem was to find an 
intrinsically three dimensional definition of the Jones polynomial of knot theory. 
These two problems might roughly be described as follows. 

Donaldson theory is a key to understanding geometry in four dimensions. 
Four is the physical dimension at least macroscopically, so one may take a slight 
liberty and say that Donaldson theory is a key to understanding the geometry of 
space-time. Geometers have long known that (via de Rham theory) the self-dual 
and anti-self-dual Maxwell equations are related to natural topological invariants 
of a four manifold, namely the second homology group and its intersection form. 
For a simply connected four manifold, these are essentially the only classical 
invariants, but they leave many basic questions out of reach. Donaldson’s great 
insight [2] was to realize that moduli spaces of solutions of the self-dual Yang- 
Mills equations can be powerful tools for addressing these questions. 

Donaldson theory has always been an intrinsically four dimensional theory,
and it has always been clear that it was connected with mathematical physics at
least at the level of classical nonlinear equations. The puzzle about Donaldson
theory was whether this theory was tied to more central ideas in physics, whether it
could be interpreted in terms of quantum field theory. The most important
evidence for the existence of such a connection had to do with Floer’s work on
three manifolds [3] and the nature of the relation between Donaldson theory and
Floer theory. Also, the “Donaldson polynomials” had an interesting formal
analogy with quantum field theory correlation functions. It has turned out that
Donaldson theory can indeed be given a physical interpretation [4].

* An expanded version of a lecture at the IAMP Congress, Swansea, July, 1988

** Research supported in part by NSF Grant No. 86-20266, and NSF Waterman Grant 88-17521

Fig. 1. A knot in three dimensional space

As for the Jones polynomial and its generalizations [5-11], these deal with the
mysteries of knots in three dimensional space (Fig. 1). The puzzle on the 
mathematical side was that these objects are invariants of a three dimensional 
situation, but one did not have an intrinsically three dimensional definition. There 
were many elegant definitions of the knot polynomials, but they all involved 
looking in some way at a two dimensional projection or slicing of the knot, giving 
a two dimensional algorithm for computation, and proving that the result is 
independent of the chosen projection. This is analogous to studying a physical 
theory that is in fact relativistic but in which one does not know of a manifestly 
relativistic formulation - like quantum electrodynamics in the 1930’s. 

On the physical side, the puzzle about the knot polynomials was the following. 
Unlike the Donaldson theory, where a connection with quantum field theory was 
not obvious, the knot polynomials have been intimately connected almost from 
the beginning with two dimensional many body physics. In fact, constructions of 
the knot polynomials have related them to two dimensional (or 1 + 1 dimensional) 
many-body physics in a bewildering variety of ways, mainly involving soluble 
lattice models [7], solutions of the Yang-Baxter equation [8], and monodromies of 
conformal field theory [11]. In the latter interpretation, the knot polynomials are 
related to aspects of conformal field theory that have been particularly fruitful 
recently [12-16]. On the statistical mechanical side, studies of the knot poly- 
nomials have related them to Temperley-Lieb algebras and their generalizations, 
and to other aspects of soluble statistical mechanics models in 1 + 1 dimensions.
For physicists the challenge of the knot polynomials has been to bring order to this 
diversity, find the unifying themes, and learn what it is that is three dimensional 
about two dimensional conformal field theory. 

Now, the Donaldson and Jones (and Floer and Gromov [17]) theories deal 
with topological invariants, and understanding these theories as quantum field 
theories involves constructing theories in which all of the observables are 
topological invariants. Some physicists might consider this to be a little bit 
strange, so let us pause to explain the physical meaning of “topological 
invariance”. The physical meaning is really “general covariance”. Something that 
can be computed from a manifold M as a topological space (perhaps with a
smooth structure) without a choice of metric is called a “topological invariant” (or
a “smooth invariant”) by mathematicians. To a physicist, a quantum field theory
defined on a manifold M without any a priori choice of a metric on M is said to be 
generally covariant. Obviously, any quantity computed in a generally covariant 
quantum field theory will be a topological invariant. Conversely, a quantum field 
theory in which all observables are topological invariants can naturally be seen as 
a generally covariant quantum field theory. Indeed, the Donaldson, Floer, Jones, 
and Gromov theories can be seen as generally covariant quantum field theories in 
four, three, and two space-time dimensions. The surprise, for physicists, perhaps 
comes in how general covariance is achieved. General relativity gives us a 
prototype for how to construct a quantum field theory with no a priori choice of 
metric - we introduce a metric, and then integrate over all metrics. This example is 
so influential in our thinking that we tend to think of a generally covariant theory 
as being, by definition, a theory in which the metric is a dynamical variable. The 
lesson from the Donaldson, Floer, Jones, and Gromov theories is precisely that 
there are highly non-trivial quantum field theories in which general covariance is 
realized in other ways. In particular, in this paper we will describe an exactly 
soluble generally covariant quantum field theory in which general covariance is 
achieved not by integrating over metrics but because we begin with a gauge 
invariant Lagrangian that does not contain a metric. 


1. The Chern-Simons Action 

We have been urged [1] to try to interpret the Jones polynomial in terms of three
dimensional Yang-Mills theory. So we begin on an oriented three manifold M
with a compact simple gauge group G. We pick a G bundle E, which may as well be
trivial, and on E we place a connection $A_i^a$, which can be viewed as a Lie algebra
valued one form (a runs over a basis of the Lie algebra, and i is tangent to M). An
infinitesimal gauge transformation is

$$A_i \to A_i - D_i \epsilon, \qquad (1.1)$$

where $\epsilon$, a generator of the gauge group, is a Lie algebra valued zero form and the
covariant derivative is $D_i \epsilon = \partial_i \epsilon + [A_i, \epsilon]$. The curvature is the Lie algebra valued
two form $F_{ij} = [D_i, D_j] = \partial_i A_j - \partial_j A_i + [A_i, A_j]$. Now we need to choose a
Lagrangian. We will not pick the standard Yang-Mills action ¹

$$\mathcal{L}_0 = \int_M \sqrt{g}\, g^{ik} g^{jl}\, \mathrm{Tr}(F_{ij} F_{kl}) \qquad (1.2)$$

as this depends on the choice of a metric $g_{ij}$. We want to formulate a generally
covariant theory (in which all observables will be topological invariants), and to
this aim we want to pick a Lagrangian which does not require any choice of metric.


¹ In what follows, the symbol “Tr” denotes an invariant bilinear form on the Lie algebra of G, a
multiple of the Cartan-Killing form; we will specify the normalization presently


Precisely in three dimensions there is a reasonable choice, namely the integral of 
the Chern-Simons three form: 

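$$ \mathcal{L} = \frac{k}{4\pi} \int_M \mathrm{Tr}\left(A \wedge dA + \frac{2}{3} A \wedge A \wedge A\right)
 = \frac{k}{8\pi} \int_M \epsilon^{ijk}\, \mathrm{Tr}\left(A_i(\partial_j A_k - \partial_k A_j) + \frac{2}{3} A_i [A_j, A_k]\right) . \tag{1.3} $$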

The Chern-Simons term in three dimensional gauge theory has a relatively long 
history. The abelian gauge theory with only a Chern-Simons term was studied by 
Schwarz [18] and in unpublished work by I. Singer. Three dimensional gauge 
theories with the Chern-Simons term added to the usual action (1.2) were 
introduced in [19-21]. The nonabelian theory with only Chern-Simons action was 
studied classically by Zuckerman [22]. The abelian Chern-Simons theory has 
recently been studied in relation to fractional statistics by Hagen [24] and by 
Arovas et al. [25] and in relation to linking numbers by Polyakov [23] and Fröhlich
[15]. The novelty in our present discussion is that we will consider the quantum
field theory defined by the nonabelian Chern-Simons action and argue that it is 
exactly soluble and has important implications for three dimensional geometry 
and two dimensional conformal field theory. 

The first fundamental property of the Chern-Simons theory is the quantization
law first discussed in [21]. It arises because the group $\mathcal{G}$ of continuous maps
$M \to G$ is not connected. In the homotopy classification of such maps one meets at
least the fact that $\pi_3(G) \cong \mathbb{Z}$ for every compact simple group G. Though (1.3) is
invariant under the component of the gauge group that contains the identity, it is
not invariant under gauge transformations of non-zero “winding number”, gauge
transformations associated with non-zero elements of $\pi_3(G)$. Under a gauge
transformation of winding number m, the transformation law of (1.3) is

$$ \mathcal{L} \to \mathcal{L} + \mathrm{const} \cdot m . \tag{1.4} $$

As in Dirac’s famous work on magnetic monopoles, consistency of quantum field
theory does not quite require the single-valuedness of $\mathcal{L}$, but only of $\exp(i\mathcal{L})$. For
this purpose, it is necessary and sufficient that the “constant” in (1.4) should be an
integral multiple of $2\pi$. This gives a quantization condition on the parameter
called k in (1.3). If G is SU(N) and “Tr” means a trace in the N dimensional
representation, then the requirement is that k should be an integer. In general, for
any G, we can uniquely fix the so far unspecified normalization of “Tr” so that the
quantization condition is $k \in \mathbb{Z}$.
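
Spelled out in one line, the argument runs as follows. This is only a sketch: it assumes the normalization in which the constant in (1.4) equals $2\pi k$, an explicit value supplied here for illustration (the text itself only requires the constant to be an integral multiple of $2\pi$).

```latex
\exp(i\mathcal{L}) \;\longrightarrow\; \exp\bigl(i\mathcal{L} + 2\pi i\,k\,m\bigr)
   = \exp(i\mathcal{L})\; e^{2\pi i k m},
\qquad
e^{2\pi i k m} = 1 \ \text{ for all } m \in \mathbb{Z}
\;\Longleftrightarrow\; k \in \mathbb{Z}.
```

Taking m = 1 already forces k to be an integer, and integrality of k then suffices for every winding number.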

We will see later that k is very closely related to the central charge in the theory 
of highest weight representations of affine Lie algebras. It is no accident that the 
reasoning which shows that k must be quantized in (1.3) has a 1 + 1 dimensional
analogue [26] which leads to quantization of the central charge in the represent- 
ation theory of affine algebras. 

In quantum field theory, in addition to a Lagrangian, one also wishes to pick a
suitable class of gauge invariant observables. In the present context, the usual 
gauge invariant local operators would not be appropriate, as they spoil general 
covariance. However, the “Wilson lines” so familiar in QCD give a natural class of 






Fig. 2. Several linked but non-intersecting oriented knots in a three manifold M. Such a 
collection of knots is called a “link” 

gauge invariant observables that do not require a choice of metric. Let C be an 
oriented closed curve in M. Intrinsically C is simply a circle, but the topological 
classification of embeddings of a circle in M is very complicated, as we observe in 
Fig. 1. Let R be an irreducible representation of G. One then defines the “Wilson
line” $W_R(C)$ to be the following functional of the connection $A_i$. One computes
the holonomy of $A_i$ around C, getting an element of G that is well-defined up to
conjugacy, and then one takes the trace of this element in the representation R.
Thus, the definition is

$$ W_R(C) = \mathrm{Tr}_R\, P \exp \int_C A_i\, dx^i . \tag{1.5} $$

The crucial property of this definition is that there is no need to introduce a metric, 
so general covariance is maintained. 

We now can formulate the general problem of interest. In an oriented three 
manifold M, we take r oriented and non-intersecting knots $C_i$, $i = 1 \ldots r$, whose
union is what knot theorists would call a “link” L. We assign a representation $R_i$
to each $C_i$, and we propose to calculate the Feynman path integral

$$ \int \mathcal{D}\mathcal{A}\, \exp(i\mathcal{L}) \prod_{i=1}^{r} W_{R_i}(C_i) . \tag{1.6} $$

The symbol $\mathcal{D}\mathcal{A}$ represents Feynman’s integral over all gauge orbits, that is, an
integral over all equivalence classes of connections modulo gauge transformations.
Of course, (1.6) has exactly the formal structure of some familiar
observables in QCD, the difference being that we are in three dimensions instead
of four and we have chosen a somewhat exotic gauge theory action. We will call
(1.6) the “partition function” of M with the given link, or the (unnormalized)
“expectation value” of the given link; we will denote it as $Z(M; C_i, R_i)$ or simply
as $Z(M; L)$ for short.

For the case of links in $S^3$, we will claim that the invariants (1.6) are exactly
those that appear in the Jones theory and its generalizations. Simply replacing $S^3$
with a general oriented three manifold M gives a very intriguing (and as we will
see, effectively computable) generalization of the known knot polynomials.
Taking r = 0 (no knots), (1.6) gives invariants of the oriented three manifold M
which also turn out to be effectively computable. Before getting into any details,
let us note a few preliminary indications of a possible connection between (1.6) and
the Jones theory:

(1) In (1.6) we see the right variables, namely a compact Lie group G, a choice
of representation $R_i$ for each component $C_i$ of the link L, and an additional
variable k. [In knot theory one usually makes an analytic continuation and
replaces k by a complex variable q, but it has been known since Jones’ original
work that there are special properties at special values of q. We claim that these
properties reflect the fact that the three dimensional gauge theory with action (1.3)
is well-defined only if k is an integer.] The two variable generalization of the Jones
polynomial corresponds to the case that G is SU(N), and the $R_i$ are all the defining
N dimensional representation of SU(N). The two variables are N and k,
analytically continued to complex values. The Kauffman polynomial similarly
arises for G = SO(N) and R the N dimensional representation.

(2) As a further check on the plausibility of a relation between (1.6) and the 
knot polynomials, let us note first of all that (1.6) depends on a choice of the 
orientation of M, as this enters in fixing the sign of the Chern-Simons form. 
Likewise, (1.6) depends on the orientations of the $C_i$, since these enter in defining
the Wilson lines (in computing the holonomy around $C_i$, one must decide in which
direction to integrate around $C_i$). If, however, one reverses the orientation of one
of the $C_i$ and simultaneously exchanges the representation $R_i$ with its complex
conjugate $\bar{R}_i$, then the definition of the Wilson lines is unchanged, so (1.6) is
invariant under this process. And if (without changing the $R_i$) one reverses the
orientations of all components $C_i$ of the link L, then (1.6) is unchanged because of
a symmetry that physicists would call “charge conjugation”. This is an involution
of the Lie algebra of G that exchanges all representations with their complex
conjugates; applying this involution to all integration variables in (1.6) leaves (1.6)
invariant while exchanging all $R_i$ with their conjugates or equivalently reversing
the orientation of all the $C_i$. These are important formal properties of the knot
polynomials. 


2. The Weak Coupling Limit 

To begin with, since a non-abelian gauge theory with only a Chern-Simons action 
may seem unfamiliar, one might ask whether this Lagrangian really does lead to a 
sensible quantum theory, and really can be regulated to give topologically 
invariant results. In this section, we will briefly investigate this point by studying 
the theory in a weak coupling limit in which computations are comparatively 
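straightforward. This is the limit of large k.^2 For large k, the path integral

$$ Z = \int \mathcal{D}\mathcal{A}\, \exp\left(\frac{ik}{4\pi} \int_M \mathrm{Tr}\left(A \wedge dA + \frac{2}{3} A \wedge A \wedge A\right)\right) \tag{2.1} $$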


(for the moment we omit knots) contains an integrand which is wildly oscillatory. 
The large k limit of such an integral is given by a sum of contributions from the 
points of stationary phase. The stationary points of the Chern-Simons action are 
precisely the “flat connections”, that is, the gauge fields for which the curvature
vanishes

$$ F_{ij} = 0 . \tag{2.2} $$


^2 The reader may wish to bear in mind that the discussion in this section and the next contains a
number of technicalities which are part of the logical story but perhaps not essential on a first
reading.





Gauge equivalence classes of such flat connections correspond to 
homomorphisms 

$$ \phi : \pi_1(M) \to G , \tag{2.3} $$

or more exactly to equivalence classes of such homomorphisms, up to conjug- 
ation. If for simplicity we suppose that the topology of M is such that there are 
only finitely many classes of homomorphisms (2.3), then the large k behavior of 
(2.1) will be a sum 

$$ Z = \sum_{\alpha} \mu(A^{(\alpha)}) , \tag{2.4} $$

where the $A^{(\alpha)}$ are a complete set of gauge equivalence classes of flat connections,
and $\mu(A^{(\alpha)})$ is to be obtained by stationary phase evaluation of (2.1), expanding
around $A^{(\alpha)}$. This reduction to a stationary phase evaluation means that the
nonabelian theory, for large k, is closely related to the abelian theory. This in turn
has been shown [18] to lead to Ray-Singer analytic torsion [27], which is closely
related to the purely topological Reidemeister torsion. The $\mu(A^{(\alpha)})$ may be
evaluated as follows. We make in (2.1) the change of variables $A_i = A_i^{(\alpha)} + B_i$,
where $B_i$ is the new integration variable. An important invariant of the flat
connection $A^{(\alpha)}$ is its Chern-Simons invariant

$$ I(A^{(\alpha)}) = \frac{1}{4\pi} \int_M \mathrm{Tr}\left(A^{(\alpha)} \wedge dA^{(\alpha)} + \frac{2}{3} A^{(\alpha)} \wedge A^{(\alpha)} \wedge A^{(\alpha)}\right) . \tag{2.5} $$

When the Chern-Simons action is expanded in powers of $B_i$, the first terms are

$$ \mathcal{L} = k \cdot I(A^{(\alpha)}) + \frac{k}{4\pi} \int_M \mathrm{Tr}(B \wedge DB) . \tag{2.6} $$


Here it is understood that in (2.6), the expression DB denotes the covariant
exterior derivative of B with respect to the background gauge field $A^{(\alpha)}$; it does not
depend on a metric on M. A salient point is that in (2.6) there is no term linear in B,
since $A^{(\alpha)}$ is a critical point of the action.
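
The expansion behind (2.6) can be written out explicitly. The following is a sketch under stated assumptions: M is closed, so total derivatives integrate to zero; the trace is graded-cyclic on forms; and the shorthand $F = dA + A \wedge A$, $DB = dB + A \wedge B + B \wedge A$ is introduced here for the curvature and covariant exterior derivative of the background A.

```latex
\frac{k}{4\pi}\int_M \mathrm{Tr}\Bigl((A+B)\wedge d(A+B)
      + \tfrac{2}{3}\,(A+B)\wedge(A+B)\wedge(A+B)\Bigr)
  = k\, I(A)
  + \frac{k}{4\pi}\int_M 2\,\mathrm{Tr}\bigl(B\wedge F\bigr)
  + \frac{k}{4\pi}\int_M \mathrm{Tr}\bigl(B\wedge DB\bigr)
  + \frac{k}{4\pi}\int_M \tfrac{2}{3}\,\mathrm{Tr}\bigl(B\wedge B\wedge B\bigr).
```

For a flat background the curvature F vanishes, so the linear term drops out and the leading dependence on B is the quadratic term of (2.6). After rescaling B by $k^{-1/2}$ the cubic term is of order $k^{-1/2}$ and does not enter the leading large k evaluation.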

To carry out the Gaussian integral in (2.6), gauge fixing is needed. There is no 
way to carry out this gauge fixing without picking a metric on M (or in some other 
way breaking the symmetry of the problem). After picking such a metric, a 
convenient gauge choice is $D_i B^i = 0$ (with $D_i$ the covariant derivative constructed
from the metric and the background gauge field $A^{(\alpha)}$). The standard Faddeev-Popov
construction then gives rise to a gauge fixing Lagrangian

$$ \mathcal{L}_{\mathrm{gauge}} = \int_M \left(\mathrm{Tr}\, \phi\, D_i B^i + \mathrm{Tr}\, \bar{c}\, D_i D^i c\right) . \tag{2.7} $$

Here $\phi$ is a Lagrange multiplier that enforces the gauge condition $D_i B^i = 0$, and
$c$, $\bar{c}$ are anticommuting “ghosts” that are introduced to get the right measure on
the space of gauge fields modulo gauge transformations. The quadratic terms in $\phi$
and B that can be found in (2.6), (2.7) have a natural geometric interpretation,
described (in the abelian case) in [18]. Let D be the exterior derivative on M,
twisted by the flat connection $A^{(\alpha)}$, and let * be the Hodge operator that maps k
forms to 3 - k forms. On a three manifold one has a natural self-adjoint operator
$L = *D + D*$ which maps differential forms of even order to forms of even order





and forms of odd order to forms of odd order. Let $L_-$ denote its restriction to
forms of odd order. With B and $\phi$ regarded as a one form and a three form,
respectively, the boson kinetic operator in (2.6), (2.7) is precisely this operator $L_-$.
The kinetic operator of the ghosts is also a natural geometrical operator, the
Laplacian, which we will call $\Delta$. We can now give a formula for the stationary
point contributions that appear in (2.4). This is

$$ \mu(A^{(\alpha)}) = \exp\left(ik I(A^{(\alpha)})\right) \cdot \frac{\det(\Delta)}{\sqrt{\det(L_-)}} . \tag{2.8} $$

The phase factor in (2.8) is the value of the integrand in (2.1) at the point of
stationary phase, and the determinants (whose absolute values can be defined by
zeta functions) result from the Gaussian integral over B, $\phi$, $c$, and $\bar{c}$.

Now we come to the crucial point. To regularize the path integral, we have had 
to pick a Riemannian metric on M. Therefore, it is not obvious a priori that the 
$\mu(A^{(\alpha)})$ computed this way will really be topological invariants. Perhaps the
Chern-Simons theory suffers from anomalies, and cannot be regularized in a
generally covariant fashion. Happily, we can now appeal to [18], where it was
shown (in the context of the abelian theory, but this aspect of [18] generalizes) that
the absolute value of the ratio of determinants appearing in (2.8) is precisely the 
Ray-Singer analytic torsion of the flat connection and so in particular is a 
topological invariant. (The phase of this ratio of determinants is more delicate, 
and will be discussed later.) This is the first indication that topological invariants 
really can be obtained from the Chern-Simons theory. 


The Phase of the Determinant 

Though the absolute value of the ratio of determinants in (2.8) is the analytic 
torsion discussed long ago by Schwarz, the phase requires additional study. The 
ghost determinant det $\Delta$ is real and positive, so the real issue is to study the phase
of det $L_-$. Because the operator $L_-$ can be interpreted as a twisted Dirac operator,
the phase of its determinant can be related to the study of the phase of odd
dimensional fermion determinants, as studied by various authors [28]. However, I
will here give a brief derivation of the relevant facts from the bosonic point of view,
which is perhaps more natural in the present context. After an irrelevant rescaling
of B and $\phi$, the integral of interest is

$$ \int DB\, D\phi\, \exp\left(i \int_M \mathrm{Tr}(B \wedge DB + \phi\, D * B)\right) . \tag{2.9} $$


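Upon changing variables to an orthonormal basis of eigenfunctions $\chi_i$ of the
operator $L_-$, with eigenvalues $\lambda_i$, (2.9) becomes

$$ \prod_i \int_{-\infty}^{\infty} \frac{dx_i}{\sqrt{\pi}} \exp(i \lambda_i x_i^2) . \tag{2.10} $$

Therefore the crucial integral to understand is

$$ \int_{-\infty}^{\infty} \frac{dx}{\sqrt{\pi}} \exp(i \lambda x^2) \tag{2.11} $$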





for real $\lambda$. We consider this integral to be defined by taking the limit as $\epsilon \to 0$ of the
absolutely convergent integral

$$ \int_{-\infty}^{\infty} \frac{dx}{\sqrt{\pi}} \exp(i \lambda x^2 - \epsilon x^2) . \tag{2.12} $$


With this or any other physically reasonable definition, the integral (2.11) is

$$ \frac{1}{\sqrt{|\lambda|}} \cdot \exp\left(\frac{i\pi}{4}\, \mathrm{sign}\, \lambda\right) . \tag{2.13} $$


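The phase of the path integral is thus proportional to $\sum_i \mathrm{sign}\, \lambda_i$, or better, to its
regularized version which is the “eta invariant” of Atiyah et al. [29]:

$$ \eta(A^{(\alpha)}) = \frac{1}{2} \lim_{s \to 0} \sum_i \mathrm{sign}\, \lambda_i\, |\lambda_i|^{-s} . \tag{2.14} $$

Thus, the phase of the path integral may be expressed in the formula

$$ \frac{1}{\sqrt{\det L_-}} = \frac{1}{\sqrt{|\det L_-|}} \exp\left(\frac{i\pi}{2}\, \eta(A^{(\alpha)})\right) . \tag{2.15} $$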
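
The one-mode integral and the way the phases add up can be checked numerically. The sketch below is illustrative only: the function names and the four-element toy spectrum are assumptions made here, and a finite spectrum needs no regularization, so its eta invariant is simply half the sum of signs. It uses the closed form $1/\sqrt{\epsilon - i\lambda}$ of the damped integral (2.12), cross-checks it with a trapezoid rule, and confirms that the product of one-mode phases equals $\exp(i\pi\eta/2)$.

```python
import cmath
import math


def damped_integral_closed_form(lam, eps):
    """Closed form of (2.12): integral of exp(i*lam*x^2 - eps*x^2) dx / sqrt(pi), eps > 0."""
    return 1.0 / cmath.sqrt(eps - 1j * lam)


def damped_integral_numeric(lam, eps, half_width=12.0, n=48001):
    """Trapezoid-rule estimate of the same integral, used only as a cross-check."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * cmath.exp((1j * lam - eps) * x * x)
    return total * h / math.sqrt(math.pi)


def limit_value(lam):
    """Right-hand side of (2.13): |lam|^(-1/2) * exp(i*pi/4 * sign(lam))."""
    sign = 1.0 if lam > 0 else -1.0
    return cmath.exp(1j * math.pi * sign / 4.0) / math.sqrt(abs(lam))


def eta_finite(eigenvalues):
    """Half the sum of signs: the eta invariant of a finite spectrum without zero modes."""
    return 0.5 * sum(1.0 if lam > 0 else -1.0 for lam in eigenvalues)


def phase_of_product(eigenvalues):
    """Phase of the product of the one-mode integrals (2.11) over the spectrum."""
    product = 1.0 + 0.0j
    for lam in eigenvalues:
        product *= limit_value(lam)
    return product / abs(product)


if __name__ == "__main__":
    spectrum = [2.0, -1.0, 3.0, 0.5]  # toy eigenvalues, purely illustrative
    print("eta =", eta_finite(spectrum))
    print("phase of product =", phase_of_product(spectrum))
    print("exp(i*pi*eta/2)  =", cmath.exp(1j * math.pi * eta_finite(spectrum) / 2.0))
```

For the infinite spectrum of $L_-$ the sum of signs does not converge, which is why the regularized definition (2.14) is needed in place of `eta_finite`.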


This can be made more explicit by using the Atiyah-Patodi-Singer theorem, which
for our purposes can be regarded as a formula that expresses the dependence of $\eta$
on the flat connection about which we are expanding. In fact, in the case of the
operator $L_-$, the formula is

$$ \frac{1}{2}\left(\eta(A^{(\alpha)}) - \eta(0)\right) = \frac{c_2(G)}{2\pi}\, I(A^{(\alpha)}) . \tag{2.16} $$


Here $I(A^{(\alpha)})$ is the Chern-Simons invariant of the flat connection $A^{(\alpha)}$, as defined in
(2.5), $\eta(0)$ is the eta invariant of the trivial gauge field A = 0, and $c_2(G)$ is the value
of the quadratic Casimir operator of the group G in the adjoint representation,
normalized so that $c_2(SU(N)) = 2N$. The effect of this factor is to replace k in
(2.8) by $k + c_2(G)/2$; in fact, the partition function (2.4) may now be written

$$ Z = e^{i\pi\eta(0)/2} \cdot \sum_{\alpha} e^{i(k + c_2(G)/2) I(A^{(\alpha)})} \cdot T_\alpha \tag{2.17} $$

with $T_\alpha$ [the absolute value of the ratio of determinants in (2.8)] being the torsion
invariant of the flat connection $A^{(\alpha)}$.

Unfortunately, although $I(A^{(\alpha)})$ and $T_\alpha$ are topological invariants, $\eta(0)$ is not;
it depends on the choice of a metric on M in gauge fixing. Thus, to make sense of
the phase of (2.17) requires further discussion, in the next subsection. 

Before launching into that technical discussion, let us note that the computation
just sketched actually has a very interesting spin-off. The fact that k in (2.8)
has been replaced by $k + c_2(G)/2$ in (2.17) appears to be the beginning of an
explanation of the fact that in many formulas of 1 + 1 dimensional current
algebra, quantum corrections have the effect of replacing k by $k + c_2(G)/2$. In
turn, this is probably related to the fact that in various integrable models in 1 + 1
dimensions, such as the sine-Gordon model, the WKB approximation is exact if
one makes suitable and seemingly ad hoc changes in the values of the parameters,
analogous to replacing k by $k + c_2(G)/2$.


Trivialization of the Tangent Bundle 

Now, let us discuss how the mysterious phase factor in (2.17) should be
interpreted.

First of all, $\eta(0)$ is the $\eta$ invariant of the $L_-$ operator coupled to (i) some metric
g on M, and (ii) the trivial gauge field A = 0. Let d = dim G be the dimension of the
gauge group G. Since the gauge field is trivial, the $L_-$ operator consists of d copies
of the purely gravitational $L_-$ operator coupled to the metric only. Thus, as a
preliminary, we write

$$ \eta(0) = d \cdot \eta_{\mathrm{grav}} , \tag{2.18} $$

where $\eta_{\mathrm{grav}}$ is the eta invariant of the purely gravitational operator. Our
problematical phase factor is

$$ \Lambda = \exp\left(\frac{i\pi d}{2}\, \eta_{\mathrm{grav}}\right) . \tag{2.19} $$

Now, with a particular regularization of the Chern-Simons quantum field
theory, we have obtained the formula (2.17) which contains the ambiguous phase
factor $\Lambda$. The goal is to find a different regularization which will preserve general
covariance. Two regularizations should differ by a local counterterm, and in this
case, since the problem phase (2.19) depends on the background metric only, we
want a counterterm that depends on the background metric only. It is easy to see
that the counterterm with the right properties is a multiple of the gravitational
Chern-Simons term, which is defined (by analogy with the Yang-Mills Chern-
Simons term) as

$$ I(g) = \frac{1}{4\pi} \int_M \mathrm{Tr}\left(\omega \wedge d\omega + \frac{2}{3}\, \omega \wedge \omega \wedge \omega\right) . \tag{2.20} $$

Here $\omega$ is the Levi-Civita connection on the spin bundle of M.^3 $I(g)$ suffers from
an ambiguity similar to that of the Yang-Mills Chern-Simons action. To
define $I(g)$ as a number, one requires a trivialization of the tangent bundle of M.
Although the tangent bundle of a three manifold can be trivialized, there is no
canonical way to do this. Any two trivializations differ by an invariantly defined
integer, which is the number of relative “twists”. The gravitational Chern-Simons
functional has the property that if the trivialization of the tangent bundle of M is
twisted by s units, $I(g)$ transforms by

$$ I(g) \to I(g) + 2\pi s . \tag{2.21} $$

Now, the Atiyah-Patodi-Singer theorem says that the combination

$$ \frac{1}{2}\, \eta_{\mathrm{grav}} + \frac{1}{12}\, \frac{I(g)}{2\pi} \tag{2.22} $$


^3 (2.20) is not the integral of an intrinsic local functional, so it would not usually arise as a
counterterm. Whether or not “counterterm” is the right word, we will have to view (2.20) as a
correction that must be added to the action if one wishes to work in the gauge $D_i A^i = 0$.





is a topological invariant, depending that is on the oriented three manifold M with 
a choice of trivialization of the tangent bundle, but not on the metric of M. It is 
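clear, therefore, what we must do. We replace $\eta(0)/2$ in (2.17) by d times the
combination that appears in (2.22) [the factor of d is the one that entered in (2.19)],
so (2.17) is replaced by

$$ Z = \exp\left(i\pi d\left(\frac{\eta_{\mathrm{grav}}}{2} + \frac{1}{12}\, \frac{I(g)}{2\pi}\right)\right) \cdot \sum_{\alpha} e^{i(k + c_2(G)/2) I(A^{(\alpha)})}\, T_\alpha . \tag{2.23} $$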

So, finally, we can see that the Chern-Simons partition function, at least for large 
k, can be defined as a topological invariant of the oriented, framed three manifold 
M (a framed three manifold being one that is presented with a homotopy class of 
trivializations of the tangent bundle). 

The fact that it is necessary to specify a framing of the three manifold may look 
like a nuisance, but there is no real loss of information. From (2.21) we see that if 
the framing is shifted by s units, the partition function is transformed by 

$$ Z \to Z \cdot \exp\left(2\pi i s \cdot \frac{d}{24}\right) . \tag{2.24} $$

A topological invariant of framed, oriented three manifolds, together with a law 
for the behavior under change of framing, is more or less as good as a topological 
invariant of oriented three manifolds without a choice of framing.

Of course, all of the discussion in this section, and in particular (2.24), has been 
limited to the behavior at large k. In Sect. (4.5), we will see that the generalization 
of (2.24) to finite k is 

$$ Z \to Z \cdot \exp\left(2\pi i s \cdot \frac{c}{24}\right) , \tag{2.25} $$

with c being the central charge of two dimensional current algebra with symmetry 
group G at level k. It is well known that the large k limit of c is exactly d. 
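
The statement that c tends to d at large k can be made concrete with a few lines of arithmetic. The sketch below assumes the standard Sugawara formula $c = k\,d/(k + c_2(G)/2)$, which is not stated in this section and is brought in here purely for illustration, with $c_2(G)$ normalized as in the text so that $c_2(SU(N)) = 2N$. The function names are hypothetical.

```python
import cmath
import math


def central_charge(k, dim_g, c2):
    """Assumed Sugawara formula c = k*d / (k + c2/2), with c2 normalized as in the text."""
    return k * dim_g / (k + c2 / 2.0)


def framing_phase(s, c):
    """Phase exp(2*pi*i * s * c/24) picked up under a shift of framing by s units, as in (2.25)."""
    return cmath.exp(2j * math.pi * s * c / 24.0)


if __name__ == "__main__":
    # SU(2): d = 3 and c_2 = 4 in the normalization c_2(SU(N)) = 2N.
    for k in (1, 10, 100, 10000):
        c = central_charge(k, 3, 4)
        print(k, c, framing_phase(1, c))
```

As k grows, the printed central charge approaches d = 3, so the finite k phase in (2.25) goes over to the large k phase in (2.24).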

Moduli Spaces of Flat Connections 

There is still an important gap in the above discussion of the large k behavior. The 
formula (2.8) is really only valid if the determinants that appear are all non-zero. 
In fact, the flat connection $A^{(\alpha)}$ determines a flat bundle $E^{(\alpha)}$. The determinants in
(2.8) are non-zero if and only if $A^{(\alpha)}$ is such that the de Rham cohomology of M,
with values in $E^{(\alpha)}$, is zero. If $H^1(M, E^{(\alpha)}) \neq 0$, then the flat connection $A^{(\alpha)}$ is not
isolated but lies on a moduli space $\mathcal{M}$ of gauge inequivalent flat connections; and
the proper evaluation of the path integral (2.1) leads not to the discrete sum (2.4)
but to an integral on $\mathcal{M}$. If $H^0(M, E^{(\alpha)})$ is not zero, then the fields $\phi$, $c$ and $\bar{c}$ in the
above treatment have zero modes, and the gauge fixing requires more care. It is
plausible that by more careful study of the path integral, the large k contribution
of arbitrary flat connections can be extracted without assumptions about
$H^*(M, E^{(\alpha)})$. But we will not attempt this.

^4 The crucial factor of 1/12 in (2.22) reflects the discrepancy between the Chern character
$e^x = 1 + x + x^2/2 + \ldots$ that appears in gauge theory index theorems and the $\hat{A}$ genus
$(x/2)/\sinh(x/2) = 1 - x^2/24 + \ldots$ that appears in gravitational index theorems.





Some Examples 

We will later on determine the partition functions of some simple three manifolds, 
giving results that can be compared to large k computations. For $S^2 \times S^1$, $Z = 1$,
for any G and any k. For $S^3$ and G = SU(2), we will obtain the formula^5

$$ Z(S^3) = \sqrt{\frac{2}{k+2}}\, \sin\left(\frac{\pi}{k+2}\right) . \tag{2.26} $$

Of course, on $S^3$ the only flat connection is the trivial connection, for which (2.8) is
not valid, since $H^0(M, E) \neq 0$ in this case. For G = SU(2), the behavior $Z \sim k^{-3/2}$
in (2.26) is probably the general behavior of the contribution of the flat connection
for homology spheres (on which the flat connection is isolated); it would be 
interesting to know how to obtain this behavior from path integrals. In Donaldson 
and Floer theory, the trivial connection, which has a negative formal dimension, is 
the cause of many subtleties. The vanishing of (2.26) in the classical limit of large k 
appears to be an interesting quantitative reflection of the “negative dimension” of 
the trivial connection. 
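
The large k falloff can be checked directly. The sketch below takes (2.26) to be $Z(S^3) = \sqrt{2/(k+2)}\,\sin(\pi/(k+2))$, the standard form of this result, and compares it with its leading asymptotic term $\sqrt{2}\,\pi\,k^{-3/2}$. The function names are illustrative assumptions.

```python
import math


def z_s3_su2(k):
    """Z(S^3) for G = SU(2) at level k, in the form assumed for (2.26)."""
    return math.sqrt(2.0 / (k + 2)) * math.sin(math.pi / (k + 2))


def leading_large_k(k):
    """Leading term of z_s3_su2 at large k: sqrt(2) * pi * k^(-3/2)."""
    return math.sqrt(2.0) * math.pi * k ** -1.5


if __name__ == "__main__":
    for k in (1, 10, 100, 1000, 10**6):
        exact = z_s3_su2(k)
        print(k, exact, exact / leading_large_k(k))
```

The ratio in the last column tends to 1, which is the $k^{-3/2}$ behavior referred to in the text, and the partition function itself vanishes as k goes to infinity.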


2.1. Incorporation of Knots 

We now wish to consider the large k behaviour in the presence of knots. For 
simplicity, we will limit ourselves to the case of $S^3$, and an abelian gauge group
G = U(1). Though the abelian gauge group is relatively trivial in the context of
knot theory, it gives a quick and simple way to confirm the fact that the Chern- 
Simons action really does lead to topological invariants, and it also gives a simple 
context for explaining a technicality that is crucial in all that follows. 

In the abelian theory, the gauge field is simply a one form A and the 
Lagrangian is 

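$$ \mathcal{L} = \frac{k}{8\pi} \int \epsilon^{ijk} A_i \partial_j A_k . \tag{2.27} $$

We pick some circles $C_a$ and some integers $n_a$ [corresponding to representations of
the gauge group U(1)]. As always in this paper, we assume $C_a$ does not intersect $C_b$
for $a \neq b$. We wish to calculate the expectation value of the product

$$ W = \prod_{a} \exp\left(i n_a \int_{C_a} A\right) \tag{2.28} $$

with respect to the Gaussian measure determined by $e^{i\mathcal{L}}$. As was recently discussed
by Polyakov (in a paper [23] in which he proposed to apply the Abelian Chern-
Simons theory to high temperature superconductors), the result can be written in
the form

$$ \langle W \rangle = \exp\left(\frac{i}{2k} \sum_{a,b} n_a n_b \int_{C_a} dx^i \int_{C_b} dy^j\, \epsilon_{ijk}\, \frac{(x-y)^k}{|x-y|^3}\right) . \tag{2.29} $$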

^5 The appearance of k + 2 in this formula is presumably an illustration of the $k + c_2(G)/2$ in (2.17).

Here one has identified a region U of $S^3$ containing the knots with a region of three
dimensional Euclidean space, and $x^i$, $y^j$ are the Euclidean coordinates of U
evaluated along the knots. For $a \neq b$, the integral in (2.29) is essentially the Gauss
linking number, which can be written as

$$ \Phi(C_a, C_b) = \frac{1}{4\pi} \int_{C_a} dx^i \int_{C_b} dy^j\, \epsilon_{ijk}\, \frac{(x-y)^k}{|x-y|^3} . \tag{2.30} $$

As long as $C_a$ and $C_b$ do not intersect, $\Phi(C_a, C_b)$ is a well defined integer; in fact, it
is the most classic invariant in knot theory. Thus, if we could ignore the term a = b,
we would have

$$ \langle W \rangle = \exp\left(\frac{2\pi i}{k} \sum_{a \neq b} n_a n_b\, \Phi(C_a, C_b)\right) . \tag{2.31} $$

The appearance of the Gauss linking number illustrates the fact that the Chern- 
Simons theory does lead to topological invariants as we hope. But we have to 
worry about the term with a = b. This integral is ill-defined near x = y; how do we
wish to interpret it? 
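
For two disjoint curves the Gauss integral can be evaluated numerically, which makes its integrality easy to see. The sketch below is an illustration, not part of the original argument: it approximates each curve by a closed polygon and applies a midpoint rule to the double integral $\frac{1}{4\pi}\oint\oint dx^i\, dy^j\, \epsilon_{ijk}\, (x-y)^k/|x-y|^3$. The three test curves and all function names are assumptions made here, and the overall sign depends on orientation conventions.

```python
import math


def segments(curve, n):
    """Chords of a closed curve t -> (x, y, z), t in [0, 2*pi), as (midpoint, delta) pairs."""
    pts = [curve(2.0 * math.pi * i / n) for i in range(n)]
    segs = []
    for i in range(n):
        p, q = pts[i], pts[(i + 1) % n]
        mid = tuple((p[c] + q[c]) / 2.0 for c in range(3))
        delta = tuple(q[c] - p[c] for c in range(3))
        segs.append((mid, delta))
    return segs


def gauss_linking(curve_a, curve_b, n=200):
    """Midpoint-rule estimate of the Gauss linking integral of two disjoint closed curves."""
    total = 0.0
    segs_b = segments(curve_b, n)
    for xm, dx in segments(curve_a, n):
        for ym, dy in segs_b:
            r = (xm[0] - ym[0], xm[1] - ym[1], xm[2] - ym[2])
            # eps_{ijk} dx^i dy^j r^k is the triple product (dx x dy) . r
            triple = ((dx[1] * dy[2] - dx[2] * dy[1]) * r[0]
                      + (dx[2] * dy[0] - dx[0] * dy[2]) * r[1]
                      + (dx[0] * dy[1] - dx[1] * dy[0]) * r[2])
            total += triple / (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
    return total / (4.0 * math.pi)


def ring_in_plane(t):
    """Unit circle in the xy-plane."""
    return (math.cos(t), math.sin(t), 0.0)


def ring_linked(t):
    """Unit circle in the xz-plane through the origin: forms a Hopf link with ring_in_plane."""
    return (1.0 + math.cos(t), 0.0, math.sin(t))


def ring_far(t):
    """The same circle moved away, so that it no longer links ring_in_plane."""
    return (5.0 + math.cos(t), 0.0, math.sin(t))


if __name__ == "__main__":
    print("Hopf link:", gauss_linking(ring_in_plane, ring_linked))
    print("unlinked: ", gauss_linking(ring_in_plane, ring_far))
```

The same routine cannot be applied with both arguments equal to the same curve: the integrand then has vanishing denominators at coincident points, which is the a = b problem raised in the text.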

It is well known in knot theory that there is no natural and topologically 
invariant way to regularize the self-linking number of a knot. Polyakov in [23] 
used a regularization that is not generally covariant to get an answer that is 
interesting geometrically but not a topological invariant. We need a different 
approach for our present treatment in which general covariance is a primary goal. 
Though there is no completely invariant substitute for Polyakov’s regularization, 
in the sense that there is no way to get a natural topological invariant from the 
integral in (2.29) or (2.30) with a = b, we cannot simply throw away the self-linking
term and its non-abelian generalizations (which are sketched in Fig. 3 a), since 
these terms are in fact not naturally zero. There is no reason to think that one 
could retain general covariance by dropping these terms. In the abelian theory, on 
a general three manifold M, on topological grounds the self-linking number can be 
a non-zero fraction, well-defined only modulo one. In such a case, it cannot be 
correct to set the self-linking number to zero, since it is definitely not zero. 
[Topologically, in such a situation, the self-linking number is well defined only 
modulo an integer, and this precision is definitely not good enough to evaluate 
(2.31).] In the non-abelian theory, we will get results later which amount to 
assigning definite, non-zero values to the non-abelian generalizations of the self- 
linking integral, so it would not be on the right track to try to throw these terms 
away. 

Topologically, it is clear what data are needed to make sense of the self-linking 
of a knot C. One needs to give a “framing” of C; this is a normal vector field along 
C. The idea is that by displacing C slightly in the direction of this vector field one 
gets a new knot C', and it makes sense to calculate the linking number of C and C'.
This can be defined as the self-linking number of the framed knot C. One can think
of the framing as a thickening of the knot into a tiny ribbon bounded by C and C';
this is how it is drawn in Fig. 3 b. It is clear that the self-linking number defined this 
way depends not on the actual vector field used to displace C to C' but only on the 
topological class of this vector field; and indeed by a “framing” we mean only the 
topological class. Though a choice of framing gives a definition of the self-linking 
number of a knot C, it is clear that by picking a convenient framing of C one can 






Fig. 3a-c. The self-linking integral is, in a non-abelian theory, the first in an infinite series of 
Feynman diagrams, with gauge fields emitted and absorbed by the same knot, as in a; these all 
pose similar problems. A topologically invariant but not uniquely determined regularization can 
be obtained by supposing that each knot is “framed”, as in b. In c, the framing is shifted by 2 units 
by making a 2-fold twist 

get any desired answer for its self-linking number; as illustrated in Fig. 3 c, a t-fold 
twist in the framing of C will change its self-linking by t.^6
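
The dependence of the self-linking number on the framing can be seen in a small numerical experiment. The sketch below is illustrative only: it takes C to be the unit circle, builds the displaced knot C' by pushing C a distance `width` along a normal vector field that rotates `twists` times relative to the radial direction, and evaluates the Gauss linking integral of C with C' by a midpoint rule. The construction, the parameter values and the function names are assumptions made here, and the overall sign depends on orientation conventions.

```python
import math


def chords(points):
    """Chords of a closed polygon as (midpoint, delta) pairs."""
    n = len(points)
    out = []
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        out.append((tuple((p[c] + q[c]) / 2.0 for c in range(3)),
                    tuple(q[c] - p[c] for c in range(3))))
    return out


def linking(points_a, points_b):
    """Midpoint-rule Gauss linking integral of two disjoint closed polygons."""
    total = 0.0
    chords_b = chords(points_b)
    for xm, dx in chords(points_a):
        for ym, dy in chords_b:
            r = (xm[0] - ym[0], xm[1] - ym[1], xm[2] - ym[2])
            triple = ((dx[1] * dy[2] - dx[2] * dy[1]) * r[0]
                      + (dx[2] * dy[0] - dx[0] * dy[2]) * r[1]
                      + (dx[0] * dy[1] - dx[1] * dy[0]) * r[2])
            total += triple / (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
    return total / (4.0 * math.pi)


def framed_self_linking(twists, n=300, width=0.3):
    """Linking number of the unit circle C with its push-off C' for a given number of twists."""
    knot, pushoff = [], []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        knot.append((math.cos(t), math.sin(t), 0.0))
        rho = 1.0 + width * math.cos(twists * t)
        pushoff.append((rho * math.cos(t), rho * math.sin(t),
                        width * math.sin(twists * t)))
    return linking(knot, pushoff)


if __name__ == "__main__":
    for twists in (0, 1, 2):
        print(twists, framed_self_linking(twists))
```

Up to the numerical error of the quadrature, the result has magnitude equal to the number of twists, so the self-linking number of the framed knot can be set to any integer by a choice of framing.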

Physically, the role of the framing is that it makes possible what physicists 
would call a point-splitting regularization. This is defined as follows: when one has 
to do the self-linking integral in (2.29), one lets x run on C and y on C'. This gives a
well-defined integral, though of course it depends on the framing. In this paper, we 
will assume, without proof, that the framing gives sufficient information to make 
possible a consistent point-splitting regularization of all the non-abelian generali- 
zations of the self-linking integral, without further arbitrary choices. This question 
is, perhaps, comparable to the question of whether the non-abelian Chern-Simons 
action defines a sensible quantum theory in the first place (even without 
introducing Wilson lines as observables); neither of these questions will be tackled 
here. 

Of course, if it were always possible to pick a canonical framing of knots, then 
we could pick this framing and hide the question. On $S^3$, there is a canonical
framing of every knot; it is determined by asking that the self-linking number 
should be zero. (This makes the abelian linking integral zero, but not its non- 
abelian generalizations.) On general three manifolds, this cannot be done since the 
self-linking number may be ill-defined or may differ from an integer by a definite 
fraction (so that it does not vanish with any choice of framing). Even when the 
canonical framing does exist, it is not convenient to be restricted to using it, since 
natural operations (like the surgery we study in Sect. 4) may not preserve it. 

In general, therefore, we give up on finding a natural choice, and simply pick 
some framing and proceed. It would be rather unpleasing if the “physical” results 
depended uncontrollably on the framing of knots. What saves the day is that 
although we cannot in general make a natural choice of the framing, we can state a 
general rule for how expectation values of Wilson lines change under a change of 
the framing. First of all, let us note that while, in general, there is no canonical zero 
in the set of possible framings of a knot in a three manifold, if one compares two 
framings they always differ by a definite integer, which is the relative twist in going 

^6 The discussion should make it clear that the need to frame knots is analogous to the need to
frame three manifolds, as found in the last section. This hopefully justifies the use of the same
word “framing” in each case.





around the knot (Fig. 3 c). (That is, in general there is no natural way to count how 
many times the ribbon in Fig. 3 b is twisted, but there is a natural local operation of 
adding t extra twists to this ribbon.) In the abelian theory, it is clear from (2.29) 
and (2.30) how the partition function transforms under a change of framing. If we
shift the framing of the link component $C_a$ by t units, its self-linking number is
increased by t, and the partition function is shifted by a phase

$$ \langle W \rangle \to \exp\left(2\pi i t \cdot (n_a^2/k)\right) \cdot \langle W \rangle . \tag{2.32} $$

The nonabelian analog of that result will be derived in Sect. 5.1; the transformation
law in the non-abelian case is

$$ \langle W \rangle \to \exp(2\pi i t \cdot h) \cdot \langle W \rangle , \tag{2.33} $$

where h is the conformal weight of a certain primary field in 1 + 1 dimensional
current algebra. This result, though it may seem rather technical, is a key 
ingredient enabling the Chern-Simons theory to work. It means that although we 
need to pick a framing for every link, because the self-linking integrals have no 
natural definition otherweise, there is no loss of information since we have a 
definite law for how the partition functions transform under change of framing. 
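
The abelian transformation law is simple enough to tabulate. The sketch below, with a hypothetical function name, evaluates the phase $\exp(2\pi i\, t\, n_a^2/k)$ of (2.32) and checks two consequences: successive changes of framing compose multiplicatively, and for $n_a = 1$ a shift by t = k units gives no phase at all.

```python
import cmath
import math


def abelian_framing_phase(n_a, k, t):
    """Phase exp(2*pi*i * t * n_a^2 / k) from (2.32) for a framing shift of t units."""
    return cmath.exp(2j * math.pi * t * n_a * n_a / k)


if __name__ == "__main__":
    k, n_a = 5, 1
    for t in range(0, k + 1):
        print(t, abelian_framing_phase(n_a, k, t))
```

Knowing the expectation value in one framing therefore determines it in every other framing, which is the sense in which no information is lost.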

Actually, it can be shown [13] that the structure of rational conformal field
theory requires non-trivial monodromies. In the relationship that we will develop
between the 2 + 1 dimensional Chern-Simons theory and rational conformal field
theory in 1 + 1 dimensions, the need to frame all knots is the 2 + 1 dimensional 
analog of the monodromies that arise in 1 + 1 dimensions. [This will be clear in the 
derivation of (2.33).] Were it not for the seeming nuisance that knots must be 
framed to define the Wilson lines as quantum observables, one would end up 
proving that the Jones knot invariants were trivial. 

An alternative description may make the physical interpretation of the 
framing of knots more transparent. A Wilson line can be regarded as the space- 
time trajectory of a charged particle. In 2 + 1 dimensions, it is possible for a 
particle to have fractional statistics, meaning that the quantum wave function 
changes by a phase under a $2\pi$ rotation. (See [30] for a discussion of these
issues.) If one wishes to compute a quantum amplitude with propagation of a
particle of fractional statistics, it is not enough to specify the orbit of the particle; it
is necessary to also count the number of $2\pi$ rotations that the particle undergoes in
the course of its motion. Equations (2.32) and (2.33) mean that the particles
represented by Wilson lines in the Chern-Simons theory have fractional statistics
with $\delta = n^2/2k$ in the abelian theory or $\delta = h$ in the non-abelian theory. This
fractional statistics is the phenomenon claimed by Polyakov in [23], so in essence 
we agree with his substantive claim, though we prefer to exhibit this phenomenon 
in the context of a generally covariant regularization, where it appears in the 
behavior of Wilson lines under change of framing. 

In this section, we have obtained some important evidence that the Chern- 
Simons theory can be regularized to give invariants of three manifolds and knots. 
We have also obtained the important insight that doing so requires picking a 
homotopy class of trivializations of the tangent bundle, and a “framing” of all 
knots. To actually solve the theory requires very different methods, to which we 
turn in the next section. 






Fig. 4a and b. Cutting a three manifold M on an intermediate Riemann surface $\Sigma$ is indicated in
part a. Wilson lines W on M may pierce $\Sigma$, and if so $\Sigma$ comes with certain “marked points”, with
representations attached. Locally, near $\Sigma$, M looks like $\Sigma \times R^1$, indicated in part b


3. Canonical Quantization 

The basic strategy for solving the Yang-Mills theory with Chern-Simons action on
an arbitrary three manifold M is to develop a machinery for chopping M in pieces,
solving the problem on the pieces, and gluing things back together. So to begin
with we consider a three manifold M, perhaps with Wilson lines, as in Fig. 4a. We
“cut” M along a Riemann surface $\Sigma$. Near the cut, M looks like $\Sigma \times R^1$, and our
first step in learning to understand the theory on an arbitrary three manifold is to
solve it on $\Sigma \times R^1$.

The special case of a three manifold of the form $\Sigma \times R^1$ is tractable by means of
canonical quantization. Canonical quantization on $\Sigma \times R^1$ will produce a Hilbert
space $\mathcal{H}_\Sigma$, “the physical Hilbert space of the Chern-Simons theory quantized on
$\Sigma$”. These will turn out to be finite dimensional spaces, and moreover spaces that
have already played a noted role in conformal field theory. In rational conformal
field theories, one encounters the “conformal blocks” of Belavin, Polyakov, and
Zamolodchikov. Segal has described these in terms of “modular functors” that
canonically associate a Hilbert space to a Riemann surface, and has described in
algebro-geometric terms a particular class of modular functors, which arise in
current algebra of a compact group G at level k [16]. The key observation in the
present work was really the observation that precisely those functors can be
obtained by quantization of a three dimensional quantum field theory, and that
this three dimensional aspect of conformal field theory gives the key to
understanding the Jones polynomial.


It is conventional in physics to call vector spaces obtained in this fashion “Hilbert spaces”, and
we will follow this terminology. In fact, the claim that comes most naturally from path integrals
and that we will actually use is only that $\mathcal{H}_\Sigma$ is a vector space canonically associated with $\Sigma$, and
exchanged with its dual when the orientation of $\Sigma$ is reversed. However, a Hilbert space structure
is natural in the Hamiltonian viewpoint, and in the particular problem we are considering here, an
inner product on $\mathcal{H}_\Sigma$ is important in more delicate aspects of conformal field theory; such an inner
product gives a “metric on the flat vector bundle” in the language of Friedan and Shenker [31].
According to Segal [16], $\mathcal{H}_\Sigma$ in fact has a canonical projective Hilbert space structure.





Actually, the general situation that must be studied is that in which possible
Wilson lines on M are “cut” by $\Sigma$, as in the figure. In this case $\Sigma$ is presented with
finitely many marked points $P_1, P_2, \ldots$ with a G representation $R_i$ assigned to each
$P_i$ (since each Wilson line has an associated representation). To this data - an
oriented topological surface with marked points, and for each marked point a
representation of G - we wish to associate a vector space. This is also the general
situation that arises in conformal field theory - the marked points are points at
which operators with non-vacuum quantum numbers have been inserted. If one
reverses the orientation of $\Sigma$ (and replaces the representations $R_i$ associated with
the marked points with their complex conjugates) the vector space must be
replaced with its dual.

The Canonical Formalism. At first sight, (1.3) might look like a typically
intractable nonlinear quantum field theory, but this is far from being so. Working
on $\Sigma \times R^1$, it is very natural to choose the gauge $A_0 = 0$ (with $A_0$ being the
component of the connection in the $R^1$ direction). In this gauge we immediately
see that the Lagrangian becomes quadratic. It reduces to

$\mathcal{L} = \frac{k}{8\pi} \int dt \int_\Sigma \epsilon^{ij} \, \mathrm{Tr} \, A_i \frac{d}{dt} A_j$ . (3.1)

For the time being we will ignore extra complications due to Wilson lines that may
be present on $\Sigma \times R^1$. From (3.1) we may deduce the Poisson brackets, ®

$\{A_i^a(x), A_j^b(y)\} = \frac{4\pi}{k} \cdot \epsilon_{ij} \, \delta^{ab} \, \delta^2(x - y)$ . (3.2)

Before rushing ahead to quantize these commutation relations, we should
remember that the system is subject to a “Gauss law” constraint, which is
$\delta \mathcal{L} / \delta A_0 = 0$, or (ignoring the Wilson lines)

$\epsilon^{ij} F_{ij}^a = 0$ . (3.3)

This constraint equation is nonlinear (since $F_{ij}^a$ contains a quadratic term), and - as
(3.1) is certainly a free theory - this nonlinearity is what remains of the underlying
nonlinearity of (1.3).

In quantum field theory, one very often quantizes first and then imposes the
constraints. The situation that we are considering here is a situation in which it is
far more illuminating to first impose the constraints and then quantize. For the
phase space $\mathcal{M}_0$ of connections $A_i^a(x)$ without the constraints is an infinite
dimensional phase space; imposing the constraints will reduce us to a rather subtle
but eminently finite dimensional phase space $\mathcal{M}$. The problem that faces us here,
of reducing from $\mathcal{M}_0$ to $\mathcal{M}$ by imposing the constraints (3.3), has been studied
before - and has proved to have extremely rich properties - in the work of Atiyah
and Bott on equivariant Morse theory, two dimensional Yang-Mills theory, and


® This is a typical problem in which it is not appropriate to “introduce canonical momenta”. The 
purpose of introducing such variables is to reexpress a given Lagrangian in a form which is first 
order in time derivatives, but (3.1) is already first order in time derivatives. The variables in (3.1) 
are already canonically conjugate, as indicated in the following equation 





the moduli space of holomorphic vector bundles [33]. In our present investigation, 
this familiar problem appears from a novel three dimensional vantage point. 

It is necessary to recall the nature of constraint equations in classical physics.
The constraints (3.3) are functions that should vanish, but they also generate
gauge transformations via Poisson brackets. Imposing the constraints means two
things classically: first, we restrict ourselves to values of the canonical variables
for which the constraint functions vanish; and second, we identify two solutions of
the constraint equations if they differ by a gauge transformation. In the case at
hand, the first step means that we should consider only “flat connections”, that is,
connections for which $F_{ij}^a = 0$. The second step means that we identify two flat
connections if they differ by a gauge transformation. Taking the two steps
together, we see that the physical phase space $\mathcal{M}$, obtained by imposing the
constraints (3.3), is none other than the moduli space of flat connections on $\Sigma$,
modulo gauge transformations. Such flat connections are completely characterized
by the “Wilson lines”, that is, the holonomies around non-contractible loops
on $\Sigma$. A simple count of parameters shows that on a Riemann surface of genus
g > 1, the moduli space $\mathcal{M}$ of flat connections modulo gauge transformations has
dimension $(2g - 2) \cdot d$, where d is the dimension of the group G.
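
The parameter count just quoted can be made explicit. The sketch below assumes the usual presentation of the fundamental group of a genus g surface (2g generators subject to one relation), which the text does not spell out; the function name is hypothetical.

```python
def flat_connection_moduli_dim(g, d):
    """Naive parameter count for flat G connections on a genus-g surface, g > 1.

    ASSUMPTION: the fundamental group has generators a_1, b_1, ..., a_g, b_g
    with the single relation prod_i [a_i, b_i] = 1.
    """
    if g <= 1:
        raise ValueError("the count is stated only for genus g > 1")
    holonomies = 2 * g * d   # one group element (d parameters) per generator
    relation = d             # the one relation removes d parameters
    conjugation = d          # overall gauge transformations act by conjugation
    return holonomies - relation - conjugation

# Example: G = SU(2) has d = 3, so a genus 2 surface gives dimension 6.
dim_su2_genus2 = flat_connection_moduli_dim(2, 3)
```

The result is always even, as it must be for a space that carries a symplectic structure.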

The topology of $\mathcal{M}$ is rather intricate (and this was in fact the main subject of
interest in [33]). On general grounds $\mathcal{M}$ inherits a symplectic structure (that is, a
structure of Poisson brackets) from the symplectic structure present on $\mathcal{M}_0$ before
imposing the constraints. $\mathcal{M}$ is a compact space (with some singularities), and in
particular its volume with the natural symplectic volume element is finite. Since in
quantum mechanics there is one quantum state per unit volume in classical phase
space, the finiteness of the volume of $\mathcal{M}$ means that the quantum Hilbert spaces
will be finite dimensional. We would like to determine them.

3.1. The Holomorphic Viewpoint 

Quantization of classical mechanics is usually carried out by separating the
canonical variables into “coordinates”, $q^i$, which are a maximal set of real
commuting variables, and “momenta”, $p^i$, which are conjugate to the $q^i$. The
quantum Hilbert space is then the space $\mathcal{H}$ of square integrable functions of the $q^i$.

Such a scheme definitely requires a noncompact phase space of infinite
volume, since - though the $q^i$ may take values in a compact space - the $p^i$ are
definitely unbounded. Accordingly, the space $\mathcal{H}$ is infinite dimensional.

Quantizing a compact, finite volume phase space, such as the moduli space $\mathcal{M}$
of flat connections modulo gauge transformations, is quite a different kind of
problem. It has no known general solution, but there is one important class of
cases in which there is a natural notion of quantization. This arises in the case in
which $\mathcal{M}$ is a Kähler manifold, and the symplectic structure on $\mathcal{M}$ is the curvature
form that represents the first Chern class of a holomorphic line bundle L endowed
with some metric. In this case, one carries out quantization not by separating the
variables in phase space into “coordinates” and “momenta”, q’s and p’s, but by
separating them into holomorphic and anti-holomorphic degrees of freedom,
essentially $z \sim q + ip$ and $\bar z \sim q - ip$. The quantum Hilbert space $\mathcal{H}$ is then a
suitable space of holomorphic “functions”. More exactly, $\mathcal{H}$ is the space of





holomorphic sections of the line bundle L. If $\mathcal{M}$ is compact, this latter space will be
finite dimensional. In our problem, with $\mathcal{M}$ being the moduli space of flat
connections modulo gauge transformations on an oriented smooth surface $\Sigma$, is
there a natural Kähler structure on $\mathcal{M}$? The answer is crucial for all that follows.
There is not quite a natural Kähler structure on $\mathcal{M}$, but there is a natural way to
obtain such structures. Once one picks a complex structure J on $\Sigma$, the moduli
space $\mathcal{M}$ of flat connections can be given a new interpretation - it is the moduli
space of stable holomorphic $G_C$ bundles on $\Sigma$ which are topologically trivial ($G_C$ is
the complexification of the gauge group G). Let us refer to the latter space as $\mathcal{M}_J$.
$\mathcal{M}_J$ is naturally a complex Kähler (and in fact projective algebraic) variety. Upon
picking a linear representation of G (for our purposes it is convenient to pick a
representation with the smallest value of the quadratic Casimir operator, e.g. the
N dimensional representation of SU(N) or the adjoint representation of $E_8$), and
passing from a principal $G_C$ bundle to the associated vector bundle, we can think
of $\mathcal{M}_J$ as the moduli space of a certain family of holomorphic vector bundles. For
G = SU(N), $\mathcal{M}_J$ is simply the moduli space of all stable rank N holomorphic
vector bundles of vanishing first Chern class.

The symplectic form on $\mathcal{M}$ that appears in (3.1) or (3.2) without picking a
complex structure on $\Sigma$ has a very special interpretation in holomorphic terms
once we do pick such a complex structure. Let us recall the notion [34] of the
determinant line bundle of the $\bar\partial$ operator. The $\bar\partial$ operator on $\Sigma$ can be “twisted”
by any holomorphic vector bundle. $\mathcal{M}_J$ parametrizes a family of holomorphic
vector bundles on $\Sigma$, and thus it can be regarded as parametrizing a family of $\bar\partial$
operators. Taking the determinant line gives a line bundle L over the base space
$\mathcal{M}_J$ of this family. Furthermore [34], the Dirac determinant gives a natural metric
on L, and the first Chern class of L, computed with this metric, is precisely the
symplectic form that appears in (3.1) or (3.2), provided k = 1. For general k, the
symplectic form that appears in (3.1) or (3.2) represents the first Chern class of the
k-th power of the determinant line bundle. ®

Thus, all of the conditions are met for a straightforward quantization of (3.1),
taking into account the constraints (3.3). The constraints mean that the classical
space to be quantized is the moduli space $\mathcal{M}$ of flat connections. Picking an
arbitrary complex structure J on $\Sigma$, $\mathcal{M}$ becomes a complex manifold, and the
symplectic form of interest represents the first Chern class of $L^{\otimes k}$, the k-th tensor
power of the determinant line bundle. The quantum Hilbert space $\mathcal{H}_\Sigma$ is thus the
space of global holomorphic sections of $L^{\otimes k}$.

3.2. A Flat Vector Bundle on Moduli Space 

This gives an answer to the problem of canonically quantizing the Chern-Simons
theory on $\Sigma \times R^1$, but a crucial point now requires discussion.


® This description is valid for the gauge group G = SU(N), but in general the following
modification is needed. For groups other than SU(N) the determinant line bundle L is not the
fundamental line bundle on $\mathcal{M}$ but a tensor power thereof. For instance, for G = $E_8$, there is a line
bundle L' with $(L')^{\otimes 30} = L$. It is then L' whose first Chern class corresponds to (3.1) or (3.2) with
k = 1.





Quantizing (3.1), with the constraints (3.3), is a problem that can be naturally
asked whenever one is given an oriented smooth surface $\Sigma$. Beginning with a
generally covariant Lagrangian in three dimensions, we were led to this problem in
a context in which it was not natural to assume any metric or complex structure on
$\Sigma$. However, to solve the problem and construct $\mathcal{H}_\Sigma$ it was very natural to pick a
complex structure J on $\Sigma$. Thus, our description of $\mathcal{H}_\Sigma$ depends on the choice of J,
and what we have called $\mathcal{H}_\Sigma$ might perhaps be better called $\mathcal{H}_\Sigma^{(J)}$. As J varies, the
$\mathcal{H}_\Sigma^{(J)}$ vary holomorphically with J, and thus we could interpret this object as a
holomorphic vector bundle on the moduli space of complex Riemann surfaces.
But since $\mathcal{H}_\Sigma$ is the answer to a question that depends on $\Sigma$ and not on J, we
would like to believe that likewise the $\mathcal{H}_\Sigma^{(J)}$ canonically depend only on $\Sigma$ and not
on J. The assertion that the $\mathcal{H}_\Sigma^{(J)}$ are canonically independent of J, and depend
only on $\Sigma$, is the assertion that the vector bundle on moduli space given by the
$\mathcal{H}_\Sigma^{(J)}$ has a canonical flat connection that permits one to identify the fibers. Such
“flat vector bundles on moduli space” first entered in conformal field theory
somewhat implicitly in the differential equations of Belavin, Polyakov, and
Zamolodchikov [32]. They were discussed much more explicitly by Friedan and
Shenker [31], who proposed that they would play a pivotal role in conformal field
theory, and they have been prominent in subsequent work such as [12, 13]. At least
in one important class of examples, we have just met a natural origin of “flat
vector bundles on moduli space”. The problem “quantize the Chern-Simons
action” can be posed without picking a complex structure, so the answer is
naturally independent of complex structure and thus gives a “flat bundle on
moduli space”. The particular flat bundles on moduli space that we get this way
are those that Segal has described [16] in connection with conformal field theory;
Segal also rigorously proved the flatness, which is explained somewhat heuristically
by the physical argument sketched above. (Because of the conformal
anomaly, this bundle has only a projectively flat connection, with the projective
factor being canonically odd under reversal of orientation.)

The role of these flat bundles in conformal field theory is as follows. If one
considers current algebra on a Riemann surface, with a symmetry group G, at
“level” k, then one finds that in genus zero the Ward identities uniquely determine
the correlation functions for descendants of the identity operator, but this is not so
in genus $\geq 1$. On a complex Riemann surface $\Sigma$ of genus $\geq 1$, the space of solutions
of the Ward identities for descendants of the identity is a vector space $\mathcal{H}_\Sigma$, which
might be called the “space of conformal blocks”. Segal calls the association
$\Sigma \to \mathcal{H}_\Sigma$ a “modular functor”, and has given an algebro-geometric description of
the modular functors that arise in current algebra. In quantizing the Chern-Simons
theory we have exactly reproduced this description! This is then the secret
of the relation between current algebra in 1 + 1 dimensions and Yang-Mills theory
in 2 + 1 dimensions: the spaces of conformal blocks in 1 + 1 dimensions are the
quantum Hilbert spaces obtained by quantizing a 2 + 1 dimensional theory. It
would take us too far afield to explain here the algebro-geometric description of the
space of conformal blocks. Suffice it to say that when one tries to use the Ward
identities of current algebra to uniquely determine the correlation functions of
descendants of the identity on a curve $\Sigma$ of genus $\geq 1$, one meets an obstruction
which involves the existence of non-trivial holomorphic vector bundles on $\Sigma$; the





Ward identities reduce the determination of the correlation functions to the choice
of a holomorphic section of $L^{\otimes k}$ over the moduli space of bundles.

It seems appropriate to conclude this discussion with some remarks on the
formal properties of the association $\Sigma \to \mathcal{H}_\Sigma$. It is good to first think of the functor
$\Sigma \to H^1(\Sigma, R)$ which to a Riemann surface $\Sigma$ associates its first de Rham
cohomology group. This functor is defined for every smooth surface $\Sigma$,
independent of complex structure. A diffeomorphism of $\Sigma$ induces a linear
transformation on $H^1(\Sigma, R)$, so $H^1(\Sigma, R)$ furnishes in a natural way a
representation of the mapping class group. The formal properties of the functors
$\Sigma \to \mathcal{H}_\Sigma$ that come by quantizing the Chern-Simons theory are quite analogous.
Though a complex structure J on $\Sigma$ is introduced to construct $\mathcal{H}_\Sigma$, the existence of
a natural projectively flat connection on the moduli space of complex structures
permits one locally to (projectively) identify the various $\mathcal{H}_\Sigma^{(J)}$ and forget about the
complex structure. One might think that the global monodromies of the flat
connection on moduli space would mean that globally one could not forget the
complex structure, but this is not so; these monodromies just correspond to an
action of the purely topological mapping class group, so that the formal properties
of $\mathcal{H}_\Sigma$ are just like those of $H^1(\Sigma, R)$.

3.3. Inclusion of Wilson Lines 

So far, we have discussed the quantization of the Chern-Simons theory on a
Riemann surface $\Sigma$ without Wilson lines. Now we wish to include the Wilson lines,
which, as in Fig. 4b, pierce $\Sigma$ in some points $P_i$; associated with each such point is a
representation $R_i$. Quantizing the Chern-Simons theory in the presence of the
Wilson lines should give a Hilbert space $\mathcal{H}_{\Sigma; P_i, R_i}$ that is canonically associated
with the oriented surface $\Sigma$ together with the choice of $P_i$ and $R_i$.

It is pretty clear what problem in conformal field theory this should
correspond to. Instead of simply considering correlation functions of the
descendants of the identity, we should consider in the conformal field theory
primary fields transforming in the $R_i$ representations of G. With these fields (or
their descendants) inserted at points $P_i$ on $\Sigma$, one gets in conformal field theory a
more elaborate space $\mathcal{H}_{\Sigma; P_i, R_i}$ of conformal blocks. Again, there is an algebro-geometric
description of this space [16], and this is what we should expect to
recover by quantizing the Chern-Simons theory in the presence of the Wilson lines.

I will now briefly sketch how this works out, deferring a fuller treatment for
another occasion. First of all, the Wilson lines correspond to static non-abelian
charges which show up as extra terms in the constraint equations. So (3.3) is
replaced by

$\frac{k}{8\pi} \epsilon^{ij} F_{ij}^a(x) = \sum_{s=1}^{r} \delta^2(x - P_s) \, T_{(s)}^a$ , (3.4)

where $P_s$, s = 1...r are the points at which static external charges have been
placed, and $T_{(s)}^a$, a = 1...dim G are the group generators associated with the
external charges. Now, a naive attempt to quantize (3.1) with the generalized
constraints (3.4) would run into extremely unpleasant difficulties. One could try to
quantize first and then impose the constraints, but this is difficult to see through





even in the absence of the external charges. Alternatively, one can try to impose
the constraints at the classical level and then quantize, as we did above. But it is
hard to make sense of (3.4) as constraints in the classical theory; the solution $A_i^a$ of
(3.4) cannot be an ordinary c-number connection, since non-commuting
operators appear on the right-hand side. It is clear that to solve (3.4), $A_i^a$ would
have to be some sort of “q-number connection”, whose holonomy would
presumably be an element of a “quantum group”, not an ordinary classical group.
Indeed, it seems likely that the theory of quantum groups [35] can be considered to
arise in this way.

However, there is a much better way to quantize the Chern-Simons theory with
static charges. We certainly wish to impose (3.4) at the classical level. This cannot
be done directly, since on the right-hand side there appear quantum operators. A
useful point of view is the following. A representation $R_i$ of a group G should be
seen as a quantum object. This representation should be obtained by quantizing a
classical theory. The Borel-Weil-Bott theorem gives a canonical way to exhibit for
every irreducible representation R of a compact group G a problem in classical
physics, with G symmetry, such that the quantization of this classical problem
gives back R as the quantum Hilbert space. One introduces the “flag manifold”
G/T, with T being a maximal torus in G, and for each representation R one
introduces a symplectic structure $\omega_R$ on G/T, such that the quantization of the
classical phase space G/T, with the symplectic structure $\omega_R$, gives back the
representation R. Many aspects of representation theory find natural explanations
by thus regarding representations of groups as quantum objects that are
obtained by quantization of classical phase spaces.
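
The simplest instance of this statement is G = SU(2), where G/T is the Riemann sphere. The sketch below assumes the standard identification of the spin n/2 representation with the holomorphic sections of the degree n line bundle on $CP^1$, spanned by the monomials $1, z, \ldots, z^n$; the function names are hypothetical. It checks that the torus weights of these sections reproduce the character of the spin n/2 representation.

```python
import cmath
import math
from fractions import Fraction

def section_weights(n):
    """Torus weights of the monomials 1, z, ..., z^n (ASSUMED basis of sections)."""
    return [Fraction(2 * r - n, 2) for r in range(n + 1)]

def character_from_sections(n, theta):
    """Trace of a rotation by angle theta on the space of sections."""
    return sum(cmath.exp(1j * float(m) * theta) for m in section_weights(n))

def weyl_character(n, theta):
    """Character of the spin n/2 representation: sin((n+1)theta/2) / sin(theta/2)."""
    return math.sin((n + 1) * theta / 2) / math.sin(theta / 2)

# Example: n = 1 gives the weights -1/2 and +1/2 of the two dimensional representation.
doublet_weights = section_weights(1)
```

The space of sections has dimension n + 1, which is the dimension of the spin n/2 representation, so in this case quantizing G/T does give back R.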

In the problem at hand, this point of view can be used to good effect. We
extend the phase space $\mathcal{M}_0$ of G connections on $\Sigma$ by including at each marked
point $P_i$ a copy of G/T, with the symplectic structure appropriate to the $R_i$
representation. The quantum operators $T_{(s)}^a$ that appear on the right of (3.4) can
then be replaced by the classical functions on G/T whose quantization would give
back the $T_{(s)}^a$. The constraints (3.4) then make sense as classical equations, and the
analysis can be carried out just as we did without marked points, though the
details are a bit longer. Suffice it to say that after imposing the classical
constraints, one gets a finite dimensional phase space $\mathcal{M}_{\Sigma; P_i, R_i}$ that incorporates the
static charges; a point on this space is a flat G connection on $\Sigma$, with a reduction of
structure group to T at the points $P_i$. Upon picking an arbitrary conformal
structure on $\Sigma$, this phase space can be quantized. In this way one gets exactly
Segal’s description of the space of conformal blocks in current algebra in a general
situation with primary fields in the $R_i$ representation inserted at the points $P_i$. (In
current algebra at level k, one only permits certain representations, the “integrable
ones”. If one formally tries to include other representations, the Ward identities
show that they decouple [36]. According to Segal, the analogous statement in
algebraic geometry is that the appropriate line bundle over $\mathcal{M}_{\Sigma; P_i, R_i}$ has no non-zero
holomorphic sections unless the $R_i$ all correspond to integrable representations.
For the Chern-Simons theory, this means that unless the representations $R_i$ are all
integrable, the zero vector is the only vector in the physical Hilbert space.)

Finally, let us note that the Borel-Weil-Bott theorem should not be used simply 
as a tool in quantization. It should be built into the three dimensional description. 





One should use the theorem to replace the Wilson lines (1.5) that appear in (1.6) 
with a functional integral over maps of the circle S into G/T (or actually an integral
over sections of a G/T bundle, twisted by the restriction to S of the G-bundle E). 
This gives a much more unified formalism. 


3.4. The Riemann Sphere with Marked Points 

The above description may seem a little bit dense, and we will supplement it by
giving a simple intuitive description of the physical Hilbert space in the
important case of genus zero. Let $\Sigma$ be an oriented surface of genus zero, with
static charges in the $R_i$ representation at points $P_i$. Let us consider the case of very
large k. Now, the gauge coupling in (1.3) is of order 1/k, so for large k we are
dealing with very weak coupling. Rather naively, one might believe that for
extremely weak coupling the physical Hilbert space is the same as it would be if the
charges were not coupled to gauge fields. If so, the physical Hilbert space would be
simply the tensor product $\mathcal{H}_0 = \otimes_i R_i$ of the Hilbert spaces $R_i$ of the individual
charges. However, there is a key error here. No matter how weak the gauge
coupling may be, we must remember that in a closed universe the total charge must
be zero (since the electric flux has nowhere to go). The total charge being zero
means in a nonabelian theory that all of the charges together must be coupled to
the trivial representation of G. So the physical Hilbert space, for large k, is
precisely the G-invariant subspace of $\mathcal{H}_0$, or

$\mathcal{H} = \mathrm{Inv}(\otimes_i R_i)$ . (3.5)

This is a familiar answer in conformal field theory for the space of conformal
blocks obtained, in the large k limit, in coupling representations $R_i$. Considerations
of conformal field theory also show that for finite k the correct answer is
always a subspace of (3.5). The most important modification of (3.5) that arises
for finite k (and is explained algebro-geometrically in [16]) is that $\mathcal{H}$ is zero unless
the $R_i$ correspond to integrable representations of the loop group; in what follows
a restriction to such representations is always understood.

Now we consider some important special cases. 

(i) For the Riemann sphere with no marked points, the Hilbert space is one
dimensional. This is well known in conformal field theory - for descendants of the
identity on the Riemann sphere, there is only one conformal block.

(ii) For the Riemann sphere with one marked point in a representation $R_i$, the
Hilbert space is one dimensional if $R_i$ is trivial, and zero dimensional otherwise.

(iii) For the Riemann sphere with two marked points with representations $R_i$
and $R_j$, the Hilbert space is one dimensional if $R_j$ is the dual of $R_i$ (so that there is
an invariant in $R_i \otimes R_j$) and zero dimensional otherwise. Again, this is well known
in conformal field theory.

(iv) For the Riemann sphere with three marked points in representations $R_i$,
$R_j$, and $R_k$, the dimension of $\mathcal{H}$ is the number $N_{ijk}$ for which Verlinde has
proposed [12] and Moore and Seiberg have proved [13] rather striking properties.
Here, $N_{ijk}$ may in general be less than its large k limit which is the dimension of
(3.5).





(v) From the results of Verlinde, the dimensions of the physical Hilbert spaces
for an arbitrary collection of marked points on $S^2$ can be determined from a
knowledge of the $N_{ijk}$. But let us consider a particularly important special case.

Suppose that there are four external charges, and that the representations are
$R$, $R$, $\bar R$, and $\bar R$. If the decomposition of $R \otimes R$ is

$R \otimes R = \bigoplus_{i=1}^{s} E_i$ , (3.6)

with the $E_i$ being distinct irreducible representations of G, then the physical
Hilbert space at large k will be s dimensional, since the possible invariants in
$R \otimes R \otimes \bar R \otimes \bar R$ are uniquely fixed by giving the representation $E_i$ to which $R \otimes R$ is
coupled. (For small k the dimension of $\mathcal{H}$ might be less than s.) In understanding
the knot polynomials, an important special case is that in which G is SU(N) and R
is the defining N dimensional representation. In that case, s = 2 and the physical
Hilbert space is two dimensional (except for k = 1 where it is one dimensional).
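
The special cases (i)-(v) can be checked by a short computation for G = SU(2). The sketch below is an illustration under stated assumptions, not part of the argument: representations are labelled by a = 2j, every representation is equivalent to its dual (so the four charges $R, R, \bar R, \bar R$ are four copies of the label 1), and the finite k fusion rule is assumed to be the standard truncation $|a - b| \leq c \leq \min(a + b, 2k - a - b)$. The function names are hypothetical.

```python
from collections import Counter

def fuse(a, b, k=None):
    """Labels c (with c = 2j) appearing in the product of the labels a and b.

    k=None gives the large k rule (ordinary Clebsch-Gordan series).
    Otherwise the ASSUMED standard level-k truncation c <= 2k - a - b is applied.
    """
    top = a + b if k is None else min(a + b, 2 * k - a - b)
    return list(range(abs(a - b), top + 1, 2))

def block_dimension(labels, k=None):
    """Dimension of the genus zero space with marked points carrying `labels`."""
    states = Counter({0: 1})              # start from the trivial representation
    for a in labels:
        nxt = Counter()
        for c, mult in states.items():
            for e in fuse(c, a, k):
                nxt[e] += mult            # accumulate multiplicities channel by channel
        states = nxt
    return states[0]                      # multiplicity of the trivial representation

# Four charges in the two dimensional representation of SU(2):
large_k = block_dimension([1, 1, 1, 1])         # large k limit
level_one = block_dimension([1, 1, 1, 1], k=1)  # level k = 1
```

In this example the count is 2 at large k and 1 at k = 1, in agreement with the statement above for the defining representation. Calling `block_dimension([a, b, c], k)` gives the numbers $N_{ijk}$ of case (iv) for SU(2); for instance the labels (2, 2, 2) give 1 at large k but 0 at k = 2.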


4. Calculability 

Our considerations so far may have seemed somewhat abstract, and we would
now like to show that in fact these considerations can actually be used to calculate
things. As an introduction to the requisite ideas, we will first deduce a certain
theoretical principle that is of great importance in its own right.

Consider, as in Fig. 5a, a three manifold M which is the connected sum of two
three manifolds $M_1$ and $M_2$, joined along a two sphere $S^2$. There may be knots in
$M_1$ or $M_2$, but if so they do not pass through the joining two sphere. If for every
three manifold X we denote the partition function or Feynman path integral (1.6)
as Z(X), then we wish to deduce the formula

$Z(M) \cdot Z(S^3) = Z(M_1) \cdot Z(M_2)$ (4.1)

[it being understood that $Z(S^3)$ denotes the partition function of a three sphere
that contains no knots]. This can be rewritten

$\frac{Z(M)}{Z(S^3)} = \frac{Z(M_1)}{Z(S^3)} \cdot \frac{Z(M_2)}{Z(S^3)}$ . (4.2)

In some special cases, (4.2) is equivalent or closely related to known formulas. If
$M_1$ and $M_2$ are copies of $S^3$ with knots in them, then the ratios appearing in (4.2)
turn out to be the knot invariants that appear in the Jones theory, and (4.2)
expresses the fact that these invariants are multiplicative when one takes the
disjoint sum of knots. If $M_1$ and $M_2$ are arbitrary three manifolds without knots,
then (in view of our discussion in Sect. 2) (4.2) is closely related to the
multiplicativity of Reidemeister and Ray-Singer torsion under connected sums.

So let us study Fig. 5a using the general ideas of quantum field theory. On the
left of this figure, we see a three manifold $M_1$ with boundary $S^2$. According to the
general ideas of quantum field theory, one associates a “physical Hilbert space” $\mathcal{H}$
with this $S^2$; as we have seen in the last section, it is one dimensional. The
Feynman path integral on $M_1$ determines a vector $\chi$ in $\mathcal{H}$. Likewise, on the right of







Fig. 5a-c. In a is sketched a three manifold M which is the connected sum of two pieces $M_1$ and
$M_2$, joined along a sphere $S^2$. Similarly, a three sphere $S^3$ can be cut along its equator, as in b.
Cutting both M and $S^3$ as indicated in a and b, the pieces can be rearranged into the disconnected
sum of $M_1$ and $M_2$, as in c


Fig. 5a we see a three manifold $M_2$ whose boundary is the same $S^2$ with opposite 
orientation; its Hilbert space $\mathcal{H}'$ is canonically the dual of $\mathcal{H}$. The path integral 
on $M_2$ determines a vector $\psi$ in $\mathcal{H}'$, and according to the general ideas of quantum 
field theory, the partition function of the connected sum $M$ is 

$$Z(M) = (\chi, \psi). \tag{4.3}$$

The symbol $(\chi, \psi)$ denotes the natural pairing of vectors $\chi \in \mathcal{H}$, $\psi \in \mathcal{H}'$. We cannot 
evaluate (4.3), since we do not know $\chi$ or $\psi$. Instead, let us consider some 
variations on this theme. The two sphere that separates the two parts of Fig. 5a 
could be embedded in $S^3$ in such a way as to separate $S^3$ into two three balls $B_L$ 
and $B_R$. The path integrals on $B_L$ and $B_R$ would give vectors $v$ and $v'$ in $\mathcal{H}$ and $\mathcal{H}'$, 
and the same reasoning as led to (4.3) gives 

$$Z(S^3) = (v, v'). \tag{4.4}$$

Again, we do not know $v$ or $v'$ and cannot evaluate (4.4). But we can say the 
following. As $\mathcal{H}$ is one dimensional, $v$ is a multiple of $\chi$; likewise, since $\mathcal{H}'$ is one 
dimensional, $v'$ is a multiple of $\psi$. It is then a fact of one dimensional linear algebra 
that 

$$(\chi, \psi) \cdot (v, v') = (\chi, v') \cdot (v, \psi). \tag{4.5}$$

The two terms on the right-hand side of (4.5) are respectively $Z(M_1)$ and $Z(M_2)$, 
as we see in Fig. 5c. So (4.5) is equivalent to the desired result (4.1). 
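
The one dimensional linear algebra behind (4.5) can be spelled out in a line. The following is a sketch only: the proportionality constants $\lambda$ and $\mu$ are names introduced here for illustration, and the step assumes that $\chi$ and $\psi$ are nonzero, so that $v$ and $v'$ can be written as multiples of them.

```latex
% Sketch: \lambda and \mu are illustrative names, not notation from the text.
% Assumes \chi \neq 0 and \psi \neq 0 in the one dimensional spaces H and H'.
\begin{align*}
v &= \lambda\,\chi, \qquad v' = \mu\,\psi \qquad (\lambda,\mu \in \mathbb{C}), \\
(\chi,\psi)\cdot(v,v') &= \lambda\mu\,(\chi,\psi)^2
  = \bigl(\mu\,(\chi,\psi)\bigr)\cdot\bigl(\lambda\,(\chi,\psi)\bigr)
  = (\chi,v')\cdot(v,\psi).
\end{align*}
```

Both sides are bilinear in the pairing, so the identity reduces to commutativity of multiplication of the complex numbers $\lambda$ and $\mu$.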

One may wonder what is the mysterious object $Z(S^3)$ that is so prominent in 
(4.1). Can it be set to one? Actually, the axioms of quantum field theory are strong 
enough so that the value of $Z(S^3)$ is uniquely determined and cannot be 
postulated arbitrarily; as we will see later it can be calculated from the theory of 
affine Lie algebras. For $G = SU(2)$ the formula has been given in (2.26). 

As a special case of (4.1), pick $s$ irreducible representations of $G$, say $R_1, \dots, R_s$, 
and consider a link in $S^3$ that consists of $s$ unlinked and unknotted circles $C_i$, with 






Fig. 6. A three sphere with $s$ unlinked and unknotted circles $C_i$, associated with 
representations $R_1 \dots R_s$. The figure can be cut in various ways to 
separate the circles 


one of the $R_i$ associated with each circle. This is indicated in Fig. 6. Denote the 
partition function of $S^3$ with this collection of Wilson lines as $Z(S^3; C_1, \dots, C_s)$ 
(the representations $R_i$ being understood). Then by cutting the figure to separate 
the circles, and repeatedly using (4.1), we learn that 

$$\frac{Z(S^3; C_1, \dots, C_s)}{Z(S^3)} = \prod_{k=1}^{s} \frac{Z(S^3; C_k)}{Z(S^3)}. \tag{4.6}$$


If we introduce the normalized expectation value of a link $L$, defined by 
$\langle L \rangle = Z(S^3; L)/Z(S^3)$, then (4.6) becomes 

$$\langle C_1 \dots C_s \rangle = \prod_k \langle C_k \rangle \tag{4.7}$$

for an arbitrary collection of unlinked, unknotted Wilson lines on $S^3$. 

In knot theory there is another notion of connected sum, the “connected sum 
of links”. The Jones invariants also have a simple multiplicative behavior under 
this operation, as we will sketch briefly at the end of Sect. 4.5. 


4.1. Knots in $S^3$ 

We will now describe the origin of the “skein relation” which can be taken as the 
definition of the knot polynomials for knots on $S^3$. (A special case of the skein 
relation was first used by Conway in connection with the Alexander polynomial.) 

Consider a link L on a general three manifold M, as indicated in Fig. 7 a. The 
components of the link are associated with certain representations of G, and we 
wish to calculate the Feynman path integral (1.6), which we will denote as Z (L) 
(with the representations understood). We will evaluate it by deducing an 
algorithm for unknotting knots. If the lines in Fig. 7 could pass through each other 
unimpeded, all knots could be unknotted. As it is, this is prevented by some 
unfortuitous crossings, such as the one circled in the figure. Let us draw a small 
sphere about this crossing, cut it out, and study it more closely. This cuts M into 
two pieces, which after rearrangement are shown in Fig. 7b as a complicated piece 
$M_L$ shown on the left of the figure and a simple piece $M_R$ shown on the right. $M_R$ 
consists of a three ball with boundary $S^2$; on this boundary there are four marked 
points that are connected by two lines in the interior of the ball. 

To make the discussion concrete, let us suppose that the gauge group is 
$G = SU(N)$ and that the Wilson lines are all in the defining $N$ dimensional 
representation of $SU(N)$, which we will call $R$. Then, as we saw at the end of the 






Fig. 7a-c. A link $C$ on a general three manifold $M$ is sketched in a. A small sphere $S$ has been 
drawn about an inconvenient crossing; it cuts $M$ into a simple piece (the interior of $S$) and a 
complicated piece. In b, the picture is rearranged to exhibit the cutting of $M$ more explicitly; the 
two pieces now appear on the left and right as $M_L$ (the complicated piece whose details are not 
drawn) and $M_R$ (the interior of $S$). The key to the skein relation is to consider replacing $M_R$ with 
some substitutes, as shown in c 


last section, the physical Hilbert spaces $\mathcal{H}_L$ and $\mathcal{H}_R$ associated with the 
boundaries of $M_L$ and $M_R$ are two dimensional. 

The strategy is now the same as the strategy which led to the multiplicativity 
relation (4.1). The Feynman path integral on $M_L$ determines a vector $\chi$ in $\mathcal{H}_L$. The 
Feynman path integral on $M_R$ determines a vector $\psi$ in $\mathcal{H}_R$. The vector spaces 
$\mathcal{H}_L$ and $\mathcal{H}_R$ (which are associated with the same Riemann surface with opposite 
orientation) are canonically dual, and the partition function or Feynman path 
integral $Z(L)$ is equal to the natural pairing 

$$Z(L) = (\chi, \psi). \tag{4.8}$$

We cannot evaluate (4.8), since we know neither $\chi$ nor $\psi$. The one thing that we do 
know, at present, is that (for the groups and representations we are considering) 
this pairing is occurring in a two dimensional vector space. A two dimensional 
vector space has the marvelous property that any three vectors obey a relation of 
linear dependence. Thus, given any two other vectors $\psi_1$ and $\psi_2$ in $\mathcal{H}_R$, there 
would be a linear relation 

$$\alpha\psi + \beta\psi_1 + \gamma\psi_2 = 0, \tag{4.9}$$

where $\alpha$, $\beta$, and $\gamma$ are complex numbers. Physically, there is a very natural way to 
get additional vectors in $\mathcal{H}_R$. If one replaces $M_R$ in Fig. 7b by any other three 
manifold $X$ with the same boundary (and with suitable strings in $X$ connecting the 
marked points on the boundary of $M_R$), then the Feynman path integral on $X$ 
gives rise to a new vector in $\mathcal{H}_R$. Picking any two convenient three manifolds $X_1$ 
and $X_2$ for this computation gives vectors $\psi_1$ and $\psi_2$ that can be used in (4.9). We 
will consider the case in which $X_1$ and $X_2$ are the same manifolds as $M_R$ but with 
different “braids” connecting the points on the boundary; this is indicated in 
Fig. 7c. 

Once $\psi_1$ and $\psi_2$ are obtained in this way, (4.9) has the obvious consequence 
that 

$$\alpha(\chi, \psi) + \beta(\chi, \psi_1) + \gamma(\chi, \psi_2) = 0. \tag{4.10}$$

The three terms in (4.10) have a “physical” interpretation, evident in Fig. 7c. By 
gluing $M_L$ back together with $M_R$ or one of its substitutes $X_1$ and $X_2$, one gets back 
the original three manifold $M$, but with the original link $L$ replaced by some new 
links $L_1$ and $L_2$. Thus, (4.10) amounts to a relation among the link expectation 
values of interest, namely 

$$\alpha Z(L) + \beta Z(L_1) + \gamma Z(L_2) = 0. \tag{4.11}$$

This recursion relation is often drawn as in Fig. 8. The meaning of this figure is 
as follows. If one considers three links whose plane projections are identical 
outside a disc, and look inside this disc like the three drawings in the figure, then 
the expectation values of those links, weighted with coefficients $\alpha$, $\beta$, and $\gamma$, add to 
zero. 

It is well known in knot theory that (4.11) uniquely determines the expectation 
values of all knots in $S^3$. For convenience we include a brief explanation of this. 
One starts with a plane projection of a knot, indicated in Fig. 9. The number $p$ of 
crossings is finite. Inductively, suppose that all knot expectation values for knots 
with at most $p - 1$ crossings have already been computed. One wishes to study 
knots with $p$ crossings. If one had $\beta = 0$ in (4.11), one could at each crossing pass 
the two strands through each other with a factor of $-\gamma/\alpha$ in replacing an 
over-crossing by an under-crossing. If this were possible, the lines would be effectively 
transparent, and one could untie all knots. As it is, $\beta \neq 0$, but the term 
proportional to $\beta$ reduces the number of crossings, giving rise to a new link whose 
expectation value is already known by the induction hypothesis. 

This process reduces the discussion to the case $p = 0$ where there are no 
crossings, and therefore we are dealing only with a certain number of unlinked and 
unknotted circles. For practice with (4.11), let us discuss this case explicitly. In 
Fig. 10, we sketch a useful special case of Fig. 8. The first and third links in 
Fig. 10 consist of a single unknotted circle, and the second consists of two 
unlinked and unknotted circles. If we denote the partition function for $s$ unlinked 






Fig. 8. A recursion relation for links 



Fig. 9. A plane projection of a knot, with four crossings 



Fig. 10. A special case of the use of Fig. 8. The idea is that the three pictures are identical outside 
of the dotted lines, and look like figure (4.11) inside them 


and unknotted circles in the $N$ dimensional representation of $SU(N)$ as $Z(S^3; C^s)$, 
then (4.11) amounts in this case to the assertion that 

$$(\alpha + \gamma)\, Z(S^3; C) + \beta\, Z(S^3; C^2) = 0. \tag{4.12}$$

Together with (4.7), this implies that the expectation value of an unknotted 
Wilson line in the $N$ dimensional representation of $SU(N)$ is 

$$\langle C \rangle = -\frac{\alpha + \gamma}{\beta}. \tag{4.13}$$

Presently we will make this formula completely explicit by computing $\alpha$, $\beta$, and $\gamma$ in 
terms of the fundamental quantum field theory parameters $N$ and $k$. 

The induction sketched above expresses any knot expectation value as a 
rational function of $\alpha$, $\beta$, and $\gamma$ (a ratio of polynomials), after finitely many steps. It 
is in this sense that the Jones knot invariants and their generalizations are 
“polynomials”. While it is, as we have seen, comparatively elementary to prove 
that (4.11) uniquely determines the knot invariants, the converse is far less 
obvious. Equation (4.11) can be used in many different ways to obtain the 
expectation value of a given link, and one must show that one does not run into 
any inconsistency. While this has been proved in a variety of ways, the proofs have 
not been intrinsically three dimensional; (4.11) has not previously been derived 
from a manifestly invariant three dimensional framework. This is the novelty of 
the present discussion. 

[Footnote to the use of (4.7) in deriving (4.13): This is the only point at which (4.7) has to be used. The induction sketched 
above reduces all computations for knots in $S^3$ to this special case without using (4.7)] 

Change of Framing. We want to compute $\alpha$, $\beta$, and $\gamma$, but as a prelude we must 
discuss a certain technical point. At the end of Sect. 2, we learned that choosing a 
circle $C$ and a representation $R$ is not enough to give a well defined quantum 
holonomy operator $W_R(C) = \mathrm{Tr}_R\, P \exp \int_C A_i\, dx^i$. It is also necessary to pick a 
“framing” of the circle $C$, which enters when one has to calculate the self-linking 
number of $C$ and its non-abelian and quantum generalizations. At the end of 
Sect. 2, we promised to derive a formula (2.33) showing how any partition 
function with an insertion of $W_R(C)$ transforms under a change of framing. Now 
it is time to deliver on this promise. 

As in Fig. 7b, let us cut the three manifold $M$ on a Riemann surface $\Sigma$ that 
intersects $C$ in a point $P$ (and perhaps in some other points that will not be 
material). In our previous argument, we used the fact that associated with the 
boundaries of $M_L$ and $M_R$ are Hilbert spaces $\mathcal{H}_L$ and $\mathcal{H}_R$. Moreover, $\mathcal{H}_R$ (for 
example) is “a flat bundle on moduli space” so the mapping class group of the 
boundary $\Sigma$ acts naturally on $\mathcal{H}_R$. We wish to act on the boundary of $M_R$ with a 
very particular diffeomorphism before gluing the pieces of Fig. 7b back together 
again. The diffeomorphism that we want to pick is a $t$-fold “Dehn twist” about the 
point $P$ on $\Sigma$. Making this diffeomorphism and then gluing the pieces of Fig. 7b 
back together again, one gets an identical looking picture, but the framing of the 
circle $C$ has been shifted by $t$ units. On the other hand, one knows in conformal 
field theory how the Dehn twist acts on $\mathcal{H}_R$. Associated with the representation $R$ 
is a number $h_R$, the “conformal weight of the primary field in the $R$ 
representation”. The $t$-fold Dehn twist acts on $\mathcal{H}_R$ as multiplication by $e^{2\pi i t h_R}$. So we have 
obtained (2.33) with $h = h_R$. 

Explicit Evaluation. We will now determine the parameters $\alpha$, $\beta$, and $\gamma$ that appear 
in the crucial equation (4.10). We need to determine the explicit relation among the 
three vectors $\psi$, $\psi_1$, and $\psi_2$ that appear in Fig. 7c. This requires a further study of 
the two dimensional Hilbert space $\mathcal{H}_R$, which arises as the space of conformal blocks 
for the $R, R, \bar{R}, \bar{R}$ four point function on $S^2$ [$R$ being in this case the defining $N$ 
dimensional representation of $SU(N)$ and $\bar{R}$ its dual]. The three configurations in 
Fig. 7c can be regarded as differing from each other by a certain diffeomorphism 
of $S^2$; the diffeomorphism in question is the “half-monodromy” under which the 
two copies of $R$ change places by taking a half-step around one another, as 
indicated in Fig. 11. Moore and Seiberg call this operation $B$ and study it 
extensively. The states $\psi_1$ and $\psi_2$ are none other than 

$$\psi_1 = B\psi, \qquad \psi_2 = B^2\psi. \tag{4.14}$$

The matrix $B$, since it acts in a two dimensional space, obeys a characteristic 
equation 

$$B^2 - yB + z = 0, \tag{4.15}$$






Fig. 11a and b. The half-monodromy operation exchanging two equivalent points on $S^2$ is 
sketched in a; the arrows are meant to suggest a process in which the first two points change places 
by executing a half-twist about one another. The idea in b is that if the two points on the left in the 
first picture undergo a half-twist about one another, the first picture becomes the second, and if 
this is done again, the second picture becomes the third. In this way the three pictures on the right 
of Fig. 7c differ by a succession of half-monodromies 


where 

$$y = \mathrm{Tr}\, B, \qquad z = \det B. \tag{4.16}$$

In view of (4.14), the linear relation among $\psi$, $\psi_1$, and $\psi_2$ is (up to an irrelevant 
common factor) just 

$$z \cdot \psi - y \cdot \psi_1 + \psi_2 = 0, \tag{4.17}$$

and according to (4.16), to make this explicit we need only to know the eigenvalues 
(and thus the determinant and trace) of $B$. 
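
The passage from the characteristic equation (4.15) to the linear relation (4.17) is the Cayley-Hamilton theorem for a 2x2 matrix. A minimal numerical sketch of that step follows; the matrix `B` and vector `psi` used here are arbitrary stand-ins chosen for illustration, not the actual half-monodromy matrix, and the helper names are hypothetical.

```python
# Sketch: check z*psi - y*(B psi) + (B^2 psi) = 0 with y = Tr B, z = det B.
# B and psi below are illustrative values only, not the physical half-monodromy.

def mat_vec(m, v):
    """Apply a 2x2 matrix (given as a list of rows) to a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def linear_relation_residual(B, psi):
    """Return z*psi - y*psi1 + psi2, where psi1 = B psi and psi2 = B^2 psi."""
    y = B[0][0] + B[1][1]                      # trace, as in (4.16)
    z = B[0][0] * B[1][1] - B[0][1] * B[1][0]  # determinant, as in (4.16)
    psi1 = mat_vec(B, psi)                     # first half-monodromy, (4.14)
    psi2 = mat_vec(B, psi1)                    # second half-monodromy, (4.14)
    return [z * psi[i] - y * psi1[i] + psi2[i] for i in range(2)]

B = [[2, 1], [3, 5]]
psi = [1, -4]
print(linear_relation_residual(B, psi))
```

The residual vanishes for any 2x2 matrix, which is why only the trace and determinant of $B$, and hence its two eigenvalues, are needed.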

These can be obtained from [13], but before describing the formulas, I would 
like to point out an important subtlety. As we have discussed in the last subsection, 
all concrete results such as the values of $\alpha$, $\beta$, and $\gamma$ depend on the framing of knots. 
The convention that is most natural in working on an arbitrary three manifold is 
not the convention usually used in discussing knots on $S^3$. 

In studying Fig. 7 c, to describe the relative framings, the task is to specify the 
relative framing of the three pictures on the right, since the picture on the left is 
being held fixed. If one just looks at these three pictures and ignores the fact that 
the lines cannot pass through each other, there is an obvious sense in which one 
would like to pick “the same” framing for each picture; for instance, a unit vector 
coming out of the page defines a normal vector field on each link in the picture. 

This is equivalent to the convention of Moore and Seiberg in defining the 
eigenvalues of $B$, so we can now quote their results. Let $h_R$ be the conformal weight 
of a primary conformal field transforming as $R$, let $E_i$ be the irreducible 
representations of $SU(N)$ appearing in the decomposition of $R \otimes R$, and let $h_{E_i}$ be 
the weights of the corresponding primary fields. Then the eigenvalues of $B$ are 

$$\pm \exp\left(i\pi(2h_R - h_{E_i})\right), \tag{4.18}$$

where the $+$ or $-$ sign corresponds to whether $E_i$ appears symmetrically or 
antisymmetrically in $R \otimes R$. If $R$ is the $N$ dimensional representation of $SU(N)$, 
then one finds that the eigenvalues of $B$ are 

$$\lambda_1 = \exp\left(\frac{i\pi(-N+1)}{N(N+k)}\right), \qquad \lambda_2 = -\exp\left(\frac{i\pi(N+1)}{N(N+k)}\right). \tag{4.19}$$

[Footnote to (4.19): For this representation, $h_R = (N^2 - 1)/(2N(N+k))$. In the decomposition of $R \otimes R$, the 
symmetric piece is an irreducible representation with $h_{E_1} = (N^2 + N - 2)/(N(N+k))$, and the 
antisymmetric piece is an irreducible representation with $h_{E_2} = (N^2 - N - 2)/(N(N+k))$] 





It is straightforward to put these formulas in (4.16), (4.17) and thus make our 
previous results completely explicit. 

Before comparing to the knot theory literature, it is necessary to make a 
correction in these results. For a link in $S^3$, there is always a standard framing in 
which the self-intersection number of each component of the link is zero. Values of 
the knot polynomials for knots in $S^3$ are usually quoted without specifying a 
framing; these are the values for the link with standard framing. However, if on the 
right of Fig. 7c we use the “same” framing for each picture, then when the right of 
Fig. 7c is glued to the left, one does not have the canonical framing for each link. If 
the first knot is framed in the standard fashion, then the second is in error by one 
unit and the third by two units. So after using (4.19) to compute $\alpha$, $\beta$, $\gamma$, we must, if 
we wish to agree with the knot theory literature, multiply $\beta$ by $\exp(-2\pi i h_R)$ and $\gamma$ 
by $\exp(-4\pi i h_R)$. After these corrections, one gets 

$$\begin{aligned}
\alpha &= -\exp\left(\frac{2\pi i}{N(N+k)}\right), \\
\beta &= -\exp\left(\frac{i\pi(2 - N - N^2)}{N(N+k)}\right) + \exp\left(\frac{i\pi(2 + N - N^2)}{N(N+k)}\right), \\
\gamma &= \exp\left(\frac{2\pi i(1 - N^2)}{N(N+k)}\right).
\end{aligned} \tag{4.20}$$


If one multiplies $\alpha$, $\beta$, $\gamma$ by an irrelevant common factor 
$\exp\left(i\pi(N^2 - 2)/N(N+k)\right)$ and introduces the variable 

$$q = \exp\left(2\pi i/(N+k)\right), \tag{4.21}$$

then the skein relation can be written more elegantly as 

$$-q^{N/2} L_+ + \left(q^{1/2} - q^{-1/2}\right) L_0 + q^{-N/2} L_- = 0. \tag{4.22}$$


Here $L_+$, $L_0$, and $L_-$ [equivalent to $L$, $L_1$, and $L_2$ in (4.11)] are standard notation 
for overcrossing, zero crossing, and undercrossing; and for $i = +, 0, -$, we now 
write simply $L_i$ instead of $Z(L_i)$. Equation (4.22) is correctly normalized to give 
the right answers for knots on $S^3$ with their standard framing, and if one is only 
interested in knots on $S^3$ one can use it without ever thinking about the framings. 
Finally, comparing (4.13) and (4.22), we see that the expectation value of an 
unknotted Wilson line on $S^3$, with its standard framing, is 

$$\langle C \rangle = \frac{q^{N/2} - q^{-N/2}}{q^{1/2} - q^{-1/2}}. \tag{4.23}$$

This formula can be subjected to several interesting checks. First of all, the 
right-hand side of (4.23) is positive for all values of the positive integers $N$ and $k$. This is 
required by reflection positivity of the Chern-Simons gauge theory in three 
dimensions. Second, in the weak coupling limit of $k \to \infty$, we have $\langle C \rangle \to N$. This 
is easily interpreted; in the weak coupling limit, the fluctuations in the connection 
$A_i$ on $S^3$ are irrelevant, and the expectation value of the Wilson line approaches its 
value for $A_i = 0$, which is the dimension of the representation, or in this case $N$. 
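
The chain from the eigenvalues (4.19) through the framing correction to (4.20), (4.13) and (4.23) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the last function applies the skein relation (4.11) once to a Hopf link under the assumption that its crossing is of the type $L_+$, so that $L_0$ is a single unknot and $L_-$ is a pair of unlinked unknots evaluated with (4.7).

```python
import cmath
import math

def skein_coefficients(N, k):
    """alpha, beta, gamma as in (4.20), built from the eigenvalues (4.19)."""
    d = N * (N + k)
    h_R = (N * N - 1) / (2 * d)                    # conformal weight of R
    lam1 = cmath.exp(1j * math.pi * (1 - N) / d)   # eigenvalues of B, (4.19)
    lam2 = -cmath.exp(1j * math.pi * (1 + N) / d)
    y, z = lam1 + lam2, lam1 * lam2                # trace and determinant, (4.16)
    alpha = z                                      # coefficients read off from (4.17)
    beta = -y * cmath.exp(-2j * math.pi * h_R)     # framing correction, one unit
    gamma = cmath.exp(-4j * math.pi * h_R)         # framing correction, two units
    return alpha, beta, gamma

def unknot(N, k):
    """<C> from (4.13)."""
    alpha, beta, gamma = skein_coefficients(N, k)
    return -(alpha + gamma) / beta

def unknot_closed_form(N, k):
    """<C> from (4.23), with q = exp(2 pi i / (N + k))."""
    q_half = cmath.exp(1j * math.pi / (N + k))
    return (q_half ** N - q_half ** (-N)) / (q_half - 1 / q_half)

def hopf_link(N, k):
    """One skein step (assumed crossing type L_+): solve (4.11) for <L>."""
    alpha, beta, gamma = skein_coefficients(N, k)
    c = unknot(N, k)
    return -(beta * c + gamma * c * c) / alpha

print(unknot(2, 3), unknot_closed_form(2, 3))
print(unknot(3, 10**6))  # weak coupling: should be close to N = 3
```

Only ratios of $\alpha$, $\beta$, $\gamma$ enter, so the irrelevant common factor mentioned before (4.21) drops out of every quantity computed here.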




4.2. Surgery on Links 

We have seen that it is possible to effectively calculate the expectation value of an 
arbitrary link in $S^3$. We would now like to generalize this to computations on an 
arbitrary three manifold. The basic idea is that by the operation of “surgery on 
links” any three manifold can be reduced to $S^3$, so it is enough to understand how 
the invariants that we are studying transform under surgery. The operation of 
surgery can be described as follows. One begins with a three manifold $M$ and an 
arbitrarily selected embedded circle $C$. Note that there is, to begin with, no Wilson 
line associated with $C$; $C$ is simply a mathematical line on which we are going to 
carry out “surgery”. To do so we first thicken $C$ to a “tubular neighborhood”, a 
solid torus centered on $C$. Removing this solid torus, $M$ is split into two pieces; the 
solid torus is called $M_R$ in Fig. 12b, and the remainder is called $M_L$. One then 
makes a diffeomorphism on the boundary of $M_R$ and glues $M_L$ and $M_R$ back 
together to get a new three manifold $\tilde{M}$. 

It is a not too deep result that every three manifold can be obtained from or 
reduced to $S^3$ (or any other desired three manifold) by repeated surgeries on 
knots. However, such a description is far from unique and it is often difficult to use 
a description of a three manifold in terms of surgery to compute the invariants of 
interest. We will now see that the invariants studied in this paper can be effectively 
computed from a surgery presentation. 




Fig. 12a-c. Surgery on a circle $C$ in a three manifold $M$ is carried out by removing a tubular 
neighborhood of $C$, depicted in a. At this point $M$ has been separated into two pieces, $M_L$ and $M_R$, 
with a torus $\Sigma$ for their boundaries, as sketched in b. $M_R$ is simply a solid torus. Surgery is 
completed by gluing the two pieces back together after making a diffeomorphism of the boundary 
of $M_R$. At the end of this process, $M$ has been replaced by a new three manifold $\tilde{M}$. As we will 
eventually see, computations on $\tilde{M}$ are equivalent to computations on $M$ with a physical Wilson 
line where the surgery was made, as in c. The difference between a and c is that in a the circle $C$ is 
just a locale for surgery, but in c it is a Wilson line 





We study Fig. 12b by the standard arguments. Hilbert spaces $\mathcal{H}_L$ and $\mathcal{H}_R$, 
canonically dual to one another, are associated with the boundaries of $M_L$ and 
$M_R$. The path integrals on $M_L$ and $M_R$ give vectors $\psi$ and $\chi$ in $\mathcal{H}_L$ and $\mathcal{H}_R$, and the 
partition function on $M$ is just the natural pairing $(\psi, \chi)$. If we act on the boundary 
of $M_R$ with a diffeomorphism $K$ before gluing $M_L$ and $M_R$ back together, then $\chi$ is 
replaced by $K\chi$ so $(\psi, \chi)$ is replaced by $(\psi, K\chi)$. 

This potentially gives a way to determine how the partition function of the 
quantum field theory transforms under surgery. Upon gaining a suitable 
understanding of $K\chi$, we will be able to reduce calculations on $\tilde{M}$ to calculations 
on $M$. 


4.3. The Physical Hilbert Space in Genus One 

At this point we need a description of the physical Hilbert space in genus one. A 
beautiful description, perfectly adapted for our needs, appears in the work of 
Verlinde [12]. 

First of all, the loop group $LG$ has at level $k$ finitely many integrable highest 
weight representations. Let $t$ be the number of these. For each such highest weight 
representation of the loop group, the highest weight space is an irreducible 
representation of the finite dimensional group $G$. In this way there appear $t$ 
distinguished representations of $G$; we label these as $R_0, R_1, \dots, R_{t-1}$, with $R_0$ 
denoting the trivial representation (which is always one of those on this list). 
Verlinde showed that if $\Sigma$ is a Riemann surface of genus one, then the dimension of 
the physical Hilbert space $\mathcal{H}_\Sigma$ is $t$. Moreover, though there is no canonical basis for 
$\mathcal{H}_\Sigma$, Verlinde showed that every choice of a homology basis for $H_1(\Sigma; \mathbb{Z})$, 
consisting of two cycles $a$ and $b$, gives a canonical choice of basis in $\mathcal{H}_\Sigma$. For our 
purposes, this can be described as follows. Topologically, there are many 
inequivalent ways to identify a torus $\Sigma$ as the boundary of a solid torus $U$. The 
choice of $U$ can be fixed by requiring that the cycle $a$ is contractible in $U$. This is 
indicated in Fig. 13a. Next, for every $i = 0 \dots t-1$, one defines a state $v_i$ in $\mathcal{H}_\Sigma$ as 
follows. One places a Wilson line in the $R_i$ representation in the interior of $U$, 


Fig. 13a and b. A Riemann surface $\Sigma$ of genus one is shown in a as the boundary of a solid torus 
$U$; the indicated $a$-cycle is contractible in $U$. In b, a basis of the physical Hilbert space $\mathcal{H}_\Sigma$ is indicated 
consisting of states obtained by placing a Wilson line, in the $R_i$ representation, in the interior of $U$, 
parallel to the cycle $b$, and performing the path integral to get a vector $v_i$ in $\mathcal{H}_\Sigma$ 




running in the $b$ direction, and one performs the Feynman path integral in $U$ to 
define a vector $v_i$ in $\mathcal{H}_\Sigma$. The $v_i$ make up the Verlinde basis in $\mathcal{H}_\Sigma$. It must be 
understood that a Wilson line in the trivial representation is equal to 1, so the 
vector $v_0$ obtained by this definition is the same as the vector $\chi$ which in the last 
subsection was obtained by a path integral on $U$ with no Wilson lines: 

$$\chi = v_0. \tag{4.24}$$

A diffeomorphism $K$ of $\Sigma$ is represented in the Verlinde basis by an explicit 
matrix $K_i{}^j$, defined by the formula 

$$K \cdot v_i = \sum_j K_i{}^j v_j. \tag{4.25}$$

In the space spanned by the $v_i$, there is a natural inner product, defined by the 
tensor $g_{ij}$ which is one if $R_i$ is the dual of $R_j$ and zero otherwise. We may sometimes 
use this metric to raise and lower indices, letting $K_{ij} = \sum_m K_i{}^m g_{mj}$. 

We can now get a much more concrete description of the behavior of the 
quantum field theory partition function under surgery. In discussing surgery, we 
began with a three manifold $M$ and a knot $C$. Cutting out a tubular neighborhood 
of this knot, whose boundary we call $\Sigma$, we separated $M$ into $M_L$ and $M_R$, with $M_R$ 
being a solid torus. The path integrals on $M_L$ and $M_R$ gave vectors $\psi$ and $\chi$ in the 
Hilbert spaces $\mathcal{H}_L$ and $\mathcal{H}_R$. As we have just noted, $\chi$ is the same as $v_0$, so the 
partition function on $M$ is $(\psi, v_0)$. Now we want to make a diffeomorphism $K$ on 
the boundary of $M_R$, and then glue together $M_L$ and $M_R$ to make a new three 
manifold $\tilde{M}$. The partition function of $\tilde{M}$ is $Z(\tilde{M}) = (\psi, K v_0)$, as we saw in the last 
section. To say that $Z(\tilde{M})$ is computable from a surgery presentation means that 
the evaluation of this invariant of $\tilde{M}$ can be reduced to a tractable calculation on $M$. 
We will now show this. From (4.25), we can write 

$$Z(\tilde{M}) = \sum_j K_0{}^j (\psi, v_j). \tag{4.26}$$

But each term $(\psi, v_j)$ has an interpretation in terms of path integrals on $M$! Indeed, 
it is the very definition of the $v_j$ that $v_j$ differs from $v_0$ just by an insertion of an 
extra Wilson line in the $R_j$ representation at the center of $M_R$. So just as $(\psi, v_0)$ 
represents the original partition function of $M$, $(\psi, v_j)$ represents a modified 
partition function with an extra Wilson line in the $R_j$ representation placed on $C$. 
So we rewrite (4.26) in the form 

$$Z(\tilde{M}) = \sum_j K_0{}^j \cdot Z(M; R_j), \tag{4.27}$$

where $Z(M; R_j)$ is the partition function of $M$ with an extra Wilson line in the $R_j$ 
representation included on the circle $C$ (in addition to whatever Wilson lines are 
already present on $M$). This is indicated in Fig. 12c. To use (4.27), one needs to 
know the matrix $K_i{}^j$, which is precisely the matrix by which the diffeomorphism $K$ 
of the torus is represented on the characters of the irreducible level $k$ 
representations of $LG$; these matrices appear in [37] and have remarkable properties 
recently investigated in [12,13]. Given a knowledge of the $K_i{}^j$, (4.27) is a 
completely explicit formula expressing computations on $\tilde{M}$ in terms of 
computations on $M$. By repeated use of this formula, computations on any three manifold 
can be reduced to computations on $S^3$ with appropriate Wilson lines. Of 
course, the surgery will generate Wilson lines on $S^3$ in representations of $G$ 
corresponding to arbitrary integrable representations of $LG$. 

[Footnote to the definition of the $v_i$: The $b$ cycle on the boundary of $U$ gives a framing of this Wilson line] 

Generalized Surgery. The surgery law (4.27) has a useful generalization. While so 
far we have only considered surgery on a purely imaginary circle $C$, as in Fig. 12, 
there is no reason not to generalize this to a situation in which before the surgery a 
Wilson line in the $R_i$ representation was already present on $C$. Surgery amounts to 
cutting out a neighborhood of $C$ and then gluing it back in, and after this process 
the $R_i$ Wilson line will still be present in $\tilde{M}$. So the left-hand side of (4.27) is 
replaced by $Z(\tilde{M}; R_i)$, where the notation schematically indicates the presence of 
the $R_i$ Wilson line. What about the right-hand side of (4.27)? Before surgery, with a 
Wilson line $R_i$ on $C$, the path integral on a tubular neighborhood $U$ of $C$ gives on 
the boundary a state $\chi = v_i$; this is the generalization of (4.24). If we cut out $C$ and 
glue it back in with a diffeomorphism $K$ of the boundary, then $v_i$ is replaced 
according to (4.25) with $\sum_j K_i{}^j v_j$. If we remember that $v_j$ could have been obtained by 
putting a Wilson line on $C$ in the $R_j$ representation, we see that the right-hand side 
of (4.27) becomes $\sum_j K_i{}^j Z(M; R_j)$, so we get the generalized surgery formula 

$$Z(\tilde{M}; R_i) = \sum_j K_i{}^j \cdot Z(M; R_j). \tag{4.28}$$

This formula will be used later in a new proof of Verlinde’s conjecture. 
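
Read as linear algebra, (4.28) is a matrix acting on the vector of partition functions $Z(M; R_j)$. The sketch below illustrates only that contraction. It assumes, for the sake of a concrete matrix, the standard modular S matrix of SU(2) at level $k$ in the basis $R_0 \dots R_k$; neither that matrix nor the input vector is taken from the text at this point, and the function names are hypothetical.

```python
import math

def su2_modular_s(k):
    """Assumed standard SU(2) level-k modular S matrix (illustrative choice of K)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (i + 1) * (j + 1) / (k + 2))
             for j in range(k + 1)] for i in range(k + 1)]

def surgery(K, z_before):
    """Generalized surgery (4.28): Z~_i = sum_j K_i^j * Z(M; R_j)."""
    return [sum(K[i][j] * z_before[j] for j in range(len(z_before)))
            for i in range(len(K))]

k = 3
S = su2_modular_s(k)
# Made-up input: only the trivial representation R_0 contributes before surgery.
z_before = [1.0, 0.0, 0.0, 0.0]
z_after = surgery(S, z_before)
print(z_after)                # the i = 0 entry is the special case (4.27)
print(surgery(S, z_after))    # this particular matrix squares to the identity
```

With this input the $i = 0$ entry is just the matrix element $K_0{}^0$, which shows how (4.27) sits inside (4.28) as the row labelled by the trivial representation.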

Cabling of Knots; Satellites. Finally, let us note that similar methods can be used 
to determine the behavior of the knot invariants under “cabling”, and more 
generally to relate the invariants of the “satellites” of a knot to invariants of the 
original knot. Any knot $C$ in any three manifold $M$ has a neighborhood that looks 
like a solid torus $U$. If we replace $C$ by an arbitrary satellite $\hat{C}$ of itself (an arbitrary 
knot that can be placed in $U$), with representations $R_\alpha$ associated with the 
connected components of $\hat{C}$, then the path integral on $U$ will define a vector $\psi$ in 
the physical Hilbert space $\mathcal{H}_\Sigma$ associated with the boundary $\Sigma$ of $U$. Like any 
vector in $\mathcal{H}_\Sigma$, $\psi$ can be expanded in the Verlinde basis, 

$$\psi = \sum_i \alpha_i v_i. \tag{4.29}$$

The $\alpha_i$ are complex numbers that depend on the choice of satellite $\hat{C}$ and on the 
choice of representations $R_\alpha$, but they do not depend on what three manifold $M$ 
the solid torus $U$ has been extracted from or on what other knots are present on $M$. 
The vectors $v_i \in \mathcal{H}_\Sigma$ are the vectors that would be produced by the path integral on 
$U$ with a Wilson line in the $R_i$ representation placed on the original knot $C$ (and 
not a satellite of $C$). Thus, a knowledge of the invariants of $C$ in arbitrary 
representations together with a knowledge of the universal coefficients $\alpha_i$ is 
enough to determine the invariants of arbitrary satellites of $C$. 

[Footnote to the reduction of computations to $S^3$ by repeated surgery: Or on any desired three manifold; we will see that $S^2 \times S^1$ is more tractable than $S^3$] 






Fig. 14a-c. Beginning with $X \times I$, shown in a, one makes $X \times S^1$ by identifying $X \times \{0\}$ with 
$X \times \{1\}$. If $X$ is $S^2$ with some marked points $P_i$, then this construction gives the picture of b. If 
$S^2 \times \{0\}$ is joined to $S^2 \times \{1\}$ via a non-trivial diffeomorphism $B$, one makes in this way a braid, as 
in c 


4.4. Path Integrals on $S^1 \times X$ 


In this subsection we will describe a few facts which are useful in their own right 
and will enable us to carry out some concrete surgeries. 

First of all, we have not so far determined the partition function of $S^3$ without 
Wilson lines. It may come as a surprise to topologists that we cannot trivially 
assert that this is 1. In quantum field theory there is no particularly strong axiom 
governing the partition function of $S^3$. The three manifolds whose partition 
functions can be computed in a particularly simple way, from the axioms of 
quantum field theory, are those of the form $X \times S^1$, for various $X$. $X \times S^1$ can 
conveniently be studied in a “Hamiltonian” formalism, as indicated in Fig. 14a. 
One constructs the Hilbert space $\mathcal{H}_X$ of $X$. Then one introduces a “time” direction, 
represented by a unit interval $I = [0, 1]$, and one propagates the vectors in $\mathcal{H}_X$ from 
“time” 0 to “time” 1. This operation is trivial, since the Chern-Simons theory, like 
any generally covariant theory, has a vanishing Hamiltonian. Finally, one forms 
$X \times S^1$ by gluing $X \times \{0\}$ to $X \times \{1\}$; this identifies the initial and final states, giving 
a trace: 

$$Z(X \times S^1) = \mathrm{Tr}_{\mathcal{H}_X}(1) = \dim \mathcal{H}_X. \tag{4.30}$$

For example, the physical Hilbert space of $S^2$ is one dimensional, for any $G$ and $k$, 
so one has 

$$Z(S^2 \times S^1) = 1. \tag{4.31}$$


It is possible to generalize (4.30) as follows. If we are given a diffeomorphism 
$K: X \to X$, then one can form the mapping cylinder $X \times_K S^1$ by identifying $x \times \{1\}$ 
with $K(x) \times \{0\}$ for every $x \in X$. At the level of quantum field theory, when one 
goes from $X \times I$ to $X \times_K S^1$, the initial and final states are identified via $K$, so the 
generalization of (4.30) is 

$$Z(X \times_K S^1) = \mathrm{Tr}_{\mathcal{H}_X}(K). \tag{4.32}$$


The situation that we actually wish to apply this to is the case in which $X$ is $S^2$
with some marked points $P_a$, $a = 1 \ldots s$, to which representations $R_{i(a)}$ are assigned.






[For $a = 1 \ldots s$, $i(a)$ is one of the values $0 \ldots t - 1$ corresponding to integrable level
$k$ representations of the loop group.] In this case, the simple product $X \times S^1$ is just
$S^2 \times S^1$ with some Wilson lines which are unknotted, parallel circles of the form
$P_a \times S^1$, as sketched in Fig. 14b. To determine the path integral on $S^2 \times S^1$ in
the presence of these Wilson lines, which we will denote as $Z(S^2 \times S^1; \langle R \rangle)$, one
needs to study the Hilbert space of $S^2$ with charges in the representations $R_{i(a)}$; we
will denote this as $\mathcal{H}_{S^2; \langle R \rangle}$. The analog of (4.31) is then

$$Z(S^2 \times S^1; \langle R \rangle) = \dim \mathcal{H}_{S^2; \langle R \rangle}. \tag{4.33}$$

The dimensions of these spaces were discussed at the end of Sect. 3. Thus, if the
collection of representations $\langle R \rangle$ consists of a single representation $R_a$, we get

$$Z(S^2 \times S^1; R_a) = \delta_{a,0}, \tag{4.34}$$

since the physical Hilbert space with a single charge in the $R_a$ representation is one
dimensional if $R_a$ is the trivial representation ($a = 0$) and zero dimensional
otherwise. For two charges in the representations $R_a$ and $R_b$, we get

$$Z(S^2 \times S^1; R_a, R_b) = g_{ab}, \tag{4.35}$$

where $g_{ab}$, introduced earlier, is 1 if $R_b$ is the dual of $R_a$ and zero otherwise. The
formula (4.35) follows from the result of Sect. (4.4) for the Hilbert space on $S^2$
with two charges. Finally, if there are three charges in the representations $R_a$, $R_b$,
$R_c$, we get

$$Z(S^2 \times S^1; R_a, R_b, R_c) = N_{abc}, \tag{4.36}$$

with $N_{abc}$ the trilinear “coupling” of Verlinde, since this is the dimension of the
physical Hilbert space.
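
As an illustration of (4.33)-(4.36) for $G = SU(2)$, the following Python sketch counts the dimension of the physical Hilbert space of $S^2$ with a list of charges by fusing the charges one at a time. It assumes the standard level-$k$ fusion rule of $SU(2)$ (a truncated Clebsch-Gordan rule, which is not derived in this section). The names `fusion_su2` and `dim_sphere` are hypothetical and chosen only for this example. Representations are labeled by $n = 0 \ldots k$ (spin $n/2$); every $SU(2)$ representation is self-dual, so $g_{ab} = \delta_{ab}$ here.

```python
def fusion_su2(a, b, c, k):
    """Assumed SU(2) level-k fusion coefficient N_abc (labels are twice the spin).

    N_abc = 1 when |a - b| <= c <= min(a + b, 2k - a - b) and a + b + c is even,
    and 0 otherwise.
    """
    in_range = abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if in_range and (a + b + c) % 2 == 0 else 0


def dim_sphere(charges, k):
    """Dimension of the Hilbert space of S^2 with the given charges, as in (4.33)."""
    # state[c] counts the ways the charges processed so far fuse into label c.
    state = [1] + [0] * k
    for a in charges:
        state = [sum(state[b] * fusion_su2(b, a, c, k) for b in range(k + 1))
                 for c in range(k + 1)]
    # On a closed surface the total charge must be trivial.
    return state[0]


k = 3
print(dim_sphere([], k))                              # no charges, cf. (4.31)
print([dim_sphere([a], k) for a in range(k + 1)])     # one charge, cf. (4.34)
print(dim_sphere([1, 1], k), dim_sphere([1, 2], k))   # two charges, cf. (4.35)
print(dim_sphere([1, 1, 2], k))                       # three charges, cf. (4.36)
```

With three charges the count reduces to `fusion_su2` itself, and with more charges the same loop gives the dimensions that enter (4.33) for longer collections $\langle R \rangle$.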

4.5. Some Concrete Surgeries 

Now we would like to describe some useful results that can be obtained from
concrete surgeries. The first goal is to compute the partition function of $S^3$. Since we
already know the partition function of $S^2 \times S^1$, we will try to interpret $S^3$ as a
manifold obtained by surgery on $S^2 \times S^1$. This is readily done. We consider the
circle $C$ in $S^2 \times S^1$ indicated in Fig. 15a. A tubular neighborhood of $C$ is a torus $\Sigma$;
we pick a basis of $H_1(\Sigma; \mathbb{Z})$ consisting of cycles $a$ and $b$ indicated in the figure.
Now we wish to make a particular surgery associated with a very special
diffeomorphism $S: \Sigma \to \Sigma$. We pick $S$ to map $a$ to $b$ and $b$ to $-a$. This surgery -
removing the interior of $\Sigma$ from $S^2 \times S^1$ and gluing it back after acting with $S$ -
produces a three manifold that is none other than $S^3$ (Fig. 16). Since this point is
crucial in what follows, we pause to explain it. We regard $S^3$ as $\mathbb{R}^3$ plus a point at
infinity. In Fig. 16a a torus $\Sigma$ has been embedded in $\mathbb{R}^3$. Obviously, $\Sigma$ with its


Implicit in (4.34) and subsequent formulas is the use of the standard framing of the Wilson line
which is invariant under rotations of $S^1$; it is for this choice that the path integral on $S^2 \times S^1$
computes the trace of the identity operator in the physical Hilbert space.

This transformation, which acts on the upper half plane as $\tau \to -1/\tau$, is indeed usually called $S$
in the theory of the modular group $SL(2, \mathbb{Z})$ (which can be identified, via the basis $a$, $b$, with the
mapping class group of $\Sigma$).






Fig. 15a-c. In part a, we consider surgery on a circle $C$ in $S^2 \times S^1$. A tubular neighborhood of
this circle is a torus $\Sigma$; a useful basis of $H_1(\Sigma, \mathbb{Z})$ is indicated. In b, in addition to the circle $C$ on
which we perform surgery, there is a parallel circle $C'$ on which we place a Wilson line in the $R_j$
representation. In c, there are two parallel circles $C'$ and $C''$ with Wilson lines in representations
$R_j$ and $R_k$



Fig. 16 a and b. The purpose of this figure is to indicate how $S^3$ can be made by surgery starting
with $S^2 \times S^1$. In a we show a torus $\Sigma$, sitting in $\mathbb{R}^3$, and in b a pair of identical solid tori


interior make up a solid torus $T$. It is also relatively easy to see that the figure (a) is
invariant under inversion, so that the exterior of $\Sigma$ (including the point at infinity)
is a second solid torus $T'$. Thus, $S^3$ can be made by gluing two solid tori along their
boundaries. Now in (b) we sketch two identical solid tori $T$ and $T'$; $T'$ has been
obtained by simply translating $T$ in Euclidean space. If one glues together the
boundaries pointwise with the identification that is indicated by saying that “$T'$ is
a translate of $T$”, one gets $S^2 \times S^1$. (In fact, the solid torus $T$ is $D \times S^1$, with $D$ a
two dimensional disc, and $T'$ is $D' \times S^1$ with $D'$ a second disc. Just as two discs $D$
and $D'$ glued on their boundary make $S^2$, $D \times S^1$ naturally glues to $D' \times S^1$ to
make $S^2 \times S^1$.) On the other hand, we know from part (a) that $S^3$ can be obtained
by gluing two solid tori. A little mental gymnastics, comparing the argument we
gave in connection with (a) to that in (b), shows that to make $S^3$ we must glue
together $T$ and $T'$ after making the modular transformation $S$ on the boundary
of $T$.

Now we can use (4.27), with $S^3$ playing the role of $\tilde{M}$, $S^2 \times S^1$ playing the role
of $M$, and the arbitrary diffeomorphism $K$ replaced by $S$. So we learn

$$Z(S^3) = \sum_j S_0{}^j \, Z(S^2 \times S^1; R_j). \tag{4.37}$$

We have learned in the last section that $Z(S^2 \times S^1; R_j)$ is 1 for $j = 0$ and 0
otherwise, so

$$Z(S^3) = S_{0,0}. \tag{4.38}$$




Here $S_{0,0}$ can be determined from the theory of affine Lie algebras; for $G = SU(2)$
one gets the formula stated earlier in (2.26). In fact, the whole matrix $S_{ij}$ can be
written very explicitly for $G = SU(2)$. The integrable representations of level $k$ are
those of spin $n/2$ for $n = 0 \ldots k$, and the matrix elements of $S$ are

$$S_{mn} = \sqrt{\frac{2}{k+2}} \, \sin\left(\frac{(m+1)(n+1)\pi}{k+2}\right). \tag{4.39}$$
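
The matrix (4.39) is easy to tabulate numerically. The sketch below, in plain Python, builds it for a chosen level, checks that it is unitary (it is real and symmetric, so this reduces to $S \cdot S = 1$), and reads off $Z(S^3) = S_{0,0}$ from (4.38). The helper names `s_matrix`, `matmul`, and `max_dev_from_identity` are hypothetical and used only for this illustration.

```python
import math


def s_matrix(k):
    """Modular S matrix of SU(2) at level k, following (4.39)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin((a + 1) * (b + 1) * math.pi / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def matmul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def max_dev_from_identity(M):
    """Largest absolute entry of M - 1."""
    n = len(M)
    return max(abs(M[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


k = 3
S = s_matrix(k)
# S is real and symmetric, so unitarity is the statement S * S = 1.
unitarity_error = max_dev_from_identity(matmul(S, S))
z_s3 = S[0][0]  # partition function of S^3, by (4.38)
print(unitarity_error, z_s3)
```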


The Phase of the Partition Function. Now let us re-examine in the light of these 
methods a thorny question that appeared in Sect. 2 - the framing of three 
manifolds, and the phase of the partition function. 

We have obtained $S^3$ from $S^2 \times S^1$ by performing surgery on a certain circle
$C$, using the modular transformation $S: \tau \to -1/\tau$. Apart from $S$, there are other
modular transformations that could be used to build $S^3$ by surgery on the same
knot $C$ in $S^2 \times S^1$. The general choice would be $T^n S T^m$, with $n$ and $m$ being
arbitrary integers, and $T$ being the modular transformation $T: \tau \to \tau + 1$. [$S$ and $T$
are the standard generators of the modular group, obeying $S^2 = (ST)^3 = 1$.] Had
we used $T^n S T^m$, we would have gotten not (4.38) but

$$Z(S^3) = (T^n S T^m)_{0,0}. \tag{4.40}$$

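This may readily be evaluated. In the Verlinde basis, $T$ is a diagonal matrix with
$T \cdot v_i = \exp\left(2\pi i (h_i - c/24)\right) \cdot v_i$; $h_i$ is the conformal weight of the primary field in the
representation $R_i$ and $c$ is the central charge for current algebra with symmetry
group $G$ at level $k$. Since $h_0 = 0$, each factor of $T$ acting on $v_0$ contributes $\exp(-2\pi i \cdot c/24)$,
so if we replace (4.38) by (4.40) the partition function transforms as

$$Z \to Z \cdot \exp\left(-2\pi i (n + m) \cdot \frac{c}{24}\right). \tag{4.41}$$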


Though we have obtained this formula in the example of a particular surgery
(giving $S^3$ from $S^2 \times S^1$), the same ambiguity arises in any process of surgery.
Whenever one makes surgery on a circle $C$, in a three manifold $M$, with the surgery
being determined by an $SL(2, \mathbb{Z})$ element $u$, one could instead consider surgery on
the same circle $C$, using the $SL(2, \mathbb{Z})$ element $u \cdot T^m$. This would have the same
effect topologically, but our surgery law would give a partition function
containing an extra phase $\exp(-2\pi i m \cdot c/24)$.
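
The framing phase can be checked numerically for $G = SU(2)$. The sketch below assumes the standard values $c = 3k/(k+2)$ and $h_j = j(j+2)/(4(k+2))$ for the level-$k$ current algebra (neither is stated in this section), builds $S$ from (4.39) and the diagonal $T$, confirms the relation $(ST)^3 = 1$, and compares $(T^n S T^m)_{0,0}/S_{0,0}$ with the phase $\exp(-2\pi i (n+m) c/24)$. All function names are hypothetical.

```python
import cmath
import math


def s_matrix(k):
    """Modular S matrix of SU(2) at level k, following (4.39)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin((a + 1) * (b + 1) * math.pi / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def central_charge(k):
    """Assumed central charge c = 3k/(k+2) of SU(2) current algebra at level k."""
    return 3.0 * k / (k + 2)


def t_eigenvalues(k):
    """Diagonal entries exp(2 pi i (h_j - c/24)) of T, with h_j = j(j+2)/(4(k+2)) assumed."""
    c = central_charge(k)
    return [cmath.exp(2j * math.pi * (j * (j + 2) / (4.0 * (k + 2)) - c / 24.0))
            for j in range(k + 1)]


def matmul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def st_cubed_error(k):
    """Largest absolute entry of (ST)^3 - 1, expected to vanish up to rounding."""
    S, t = s_matrix(k), t_eigenvalues(k)
    ST = [[S[i][j] * t[j] for j in range(k + 1)] for i in range(k + 1)]
    cube = matmul(ST, matmul(ST, ST))
    return max(abs(cube[i][j] - (1.0 if i == j else 0.0))
               for i in range(k + 1) for j in range(k + 1))


def framed_partition_function(k, n, m):
    """(T^n S T^m)_{0,0} as in (4.40); T is diagonal, so only its (0,0) entry enters."""
    t0 = t_eigenvalues(k)[0]
    return (t0 ** n) * s_matrix(k)[0][0] * (t0 ** m)


k, n, m = 4, 2, -5
ratio = framed_partition_function(k, n, m) / s_matrix(k)[0][0]
expected = cmath.exp(-2j * math.pi * (n + m) * central_charge(k) / 24.0)
print(st_cubed_error(k), abs(ratio - expected))
```

Because $T$ is diagonal, the phase depends only on the total power $n + m$ of $T$, which is the sense in which the choice of surgery matrix shifts the framing.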

This phase ambiguity was already encountered, in the large $k$ limit, in formula
(2.24). What is more, from the discussion in Sect. 2, we know what topological
structure on three manifolds must be considered in order to keep track of the
factors of $\exp(2\pi i \cdot c/24)$. One must consider “framed” three manifolds. Two
surgeries that have the same effect on the topology of a three manifold may have
different effects on the framing. I will discuss elsewhere how to systematically keep
track of the factors of $\exp(2\pi i \cdot c/24)$ under surgery. In the simple applications in
this paper, this will not be necessary. All of our applications will involve
considering the standard surgery (by the modular transformation $S$) that was used
in the last subsection to obtain $S^3$ from $S^2 \times S^1$.


Some Expectation Values. Now let us see if we can go farther and determine the
path integral $Z(S^3; R_j)$ on $S^3$ with an unknotted Wilson line on $S^3$ in an arbitrary





representation $R_j$. To do this, we start on $S^2 \times S^1$ with a Wilson line in the $R_j$
representation running parallel to the circle $C$ on which we are doing surgery, as in
Fig. 15b. Carrying out the same surgery as before turns $S^2 \times S^1$ into $S^3$, with a
Wilson line in the $R_j$ representation on $S^3$. Application of (4.27) now gives

$$Z(S^3; R_j) = \sum_i S_0{}^i \, Z(S^2 \times S^1; R_i, R_j). \tag{4.42}$$

Using (4.35), we can evaluate this and determine the partition function for a
Wilson line in an arbitrary representation $R_j$; it is

$$Z(S^3; R_j) = \sum_i S_0{}^i \, g_{ij} = S_{0,j}. \tag{4.43}$$


Let us compare this to our previous evaluation (4.23) of the expectation value of
an unknotted Wilson line in $S^3$. We must recall that the symbol $\langle C \rangle$ in (4.23)
represented a ratio $\langle C \rangle = Z(S^3; R)/Z(S^3)$. Let us take $G = SU(2)$, so that we
can use the explicit formulas (4.39), and take $R$ to be the two dimensional
representation of $SU(2)$, so that we can compare to (4.23). Using (4.38), (4.43),
and (4.39), we get

$$\langle C \rangle = \frac{S_{0,1}}{S_{0,0}} = \frac{\sin(2\pi/(k+2))}{\sin(\pi/(k+2))}. \tag{4.44}$$
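
The comparison with (4.23) can be made concrete with a few lines of Python. The sketch assumes that (4.23) has the form $\langle C \rangle = (q^{N/2} - q^{-N/2})/(q^{1/2} - q^{-1/2})$ with $q = \exp(2\pi i/(N+k))$; that formula lies outside this section, so its form here is an assumption. The function names are hypothetical.

```python
import cmath
import math


def unknot_from_s_matrix(k):
    """<C> = S_{0,1}/S_{0,0} for the spin-1/2 representation, as in (4.44)."""
    return math.sin(2 * math.pi / (k + 2)) / math.sin(math.pi / (k + 2))


def unknot_from_q(k, N=2):
    """Assumed form of (4.23), evaluated with q^{1/2} = exp(i pi/(N + k))."""
    half = cmath.exp(1j * math.pi / (N + k))  # this is q^{1/2}
    return (half ** N - half ** (-N)) / (half - 1 / half)


for k in (1, 2, 10, 1000):
    print(k, unknot_from_s_matrix(k), unknot_from_q(k).real)
```

Both expressions reduce to $2\cos(\pi/(k+2))$, which tends to 2, the dimension of the representation, as $k$ becomes large.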


It is easy to see that setting $N = 2$ in (4.23) gives the same formula. Let us take this
one step further and try to calculate by these methods the partition function
$Z(S^3; R_j, R_k)$ for $S^3$ with two unknotted, unlinked Wilson lines in representations
$R_j$ and $R_k$. In Fig. 15c, we start on $S^2 \times S^1$ with two Wilson lines, in
representations $R_j$ and $R_k$, parallel to the circle $C$ on which surgery is to be
performed. Carrying out the surgery, we get to $S^3$ with the desired unlinked,
unknotted circles. In this case, the surgery formula (4.27) tells us that

$$Z(S^3; R_j, R_k) = \sum_i S_0{}^i \, Z(S^2 \times S^1; R_i, R_j, R_k). \tag{4.45}$$

The right-hand side can be evaluated with (4.36), while the left-hand side can be
reduced to (4.43) using (4.6). We get

$$\frac{S_{0,j} \, S_{0,k}}{S_{0,0}} = \sum_i S_0{}^i \, N_{ijk}. \tag{4.46}$$


Proof of Verlinde’s Conjecture. The last equation is a special case of a celebrated
conjecture by Verlinde, which has been proved by Moore and Seiberg [13]. We can
use these methods to give a new proof of Verlinde’s conjecture, in the case of
current algebra. We will have to use the generalized surgery relation (4.28). We
return to Fig. 15b but now instead of treating $C$ as a purely imaginary contour on
which surgery is to be performed, we suppose that there is a Wilson line on $C$ in the
$R_i$ representation. In this case, the standard surgery on $C$ will still turn $S^2 \times S^1$ into
$S^3$, but now on $S^3$ we will have two Wilson lines, in the $R_i$ and $R_j$ representations.
Some mental gymnastics shows that they are linked, as in Fig. 17a; schematically,


We give this Wilson line the framing described in the footnote after (4.34); after surgery this
turns into the standard framing on $S^3$.







Fig. 17a-c. In a, we sketch linked but unknotted Wilson lines on $S^3$, in the $R_i$ and $R_j$
representations. In b, a Wilson line $R_i$ is linked with two Wilson lines $R_j$ and $R_k$ on $S^3$. In c, we
sketch how two crucial amplitudes can be factored through the same one dimensional space


we refer to this linked pair of Wilson lines as $L(R_i; R_j)$. The use of (4.28) therefore
determines the partition function of $S^3$ with a pair of linked Wilson lines:

$$Z(S^3; L(R_i; R_j)) = \sum_k S_i{}^k \, Z(S^2 \times S^1; R_k, R_j) = S_{ij}. \tag{4.47}$$

In the second step, we have used the fact that the partition function of $S^2 \times S^1$ with
static charges $R_k$ and $R_j$ is the metric that we have called $g_{kj}$ - one if $R_k$ is dual to $R_j$
and otherwise zero.

Now let us go back to Fig. 15c, and again on what was previously the purely
imaginary circle $C$ we put a Wilson line in the $R_i$ representation. The standard
surgery on this link will now produce a picture sketched in Fig. 17b, with a Wilson
line on $S^3$ in the $R_i$ representation that links a pair of Wilson lines $R_j$, $R_k$ that are
themselves unlinked and unknotted. We call this configuration $L(R_i; R_j, R_k)$. The
evaluation of (4.28) now gives

$$Z(S^3; L(R_i; R_j, R_k)) = \sum_m S_i{}^m \, Z(S^2 \times S^1; R_m, R_j, R_k) = \sum_m S_i{}^m \, N_{mjk}. \tag{4.48}$$

To obtain Verlinde’s formula, it is now necessary to find an independent way to
evaluate the left-hand side.

Such a method is provided by the following generalization of the multiplicativity
formula (4.1). The key point in the derivation of (4.1) was that the physical
Hilbert space for $S^2$ with no charges was one dimensional. It is likewise true
that the physical Hilbert space for $S^2$ with a pair of charges in the dual





representations $R_i$ and $R_{\bar{i}}$ is one dimensional. Using this and otherwise repeating
the derivation of (4.1) gives the following formula:

$$Z(S^3; L(R_i; R_j, R_k)) \cdot Z(S^3; R_i) = Z(S^3; L(R_i; R_j)) \cdot Z(S^3; L(R_i; R_k)). \tag{4.49}$$

The idea in (4.49) is that, as in Fig. 17c, the evaluation of $Z(S^3; L(R_i; R_j, R_k))$ can
be expressed as a pairing $(\psi, \chi)$ where $\psi$ and $\chi$ are certain vectors in $\mathcal{H}$ and its dual.
Likewise the evaluation of $Z(S^3; R_i)$ is a pairing $(v', v)$, where $v'$ and $v$ are vectors
in $\mathcal{H}$ and its dual. Using the wonderful fact of one dimensional linear algebra
$(\psi, \chi) \cdot (v', v) = (\psi, v) \cdot (v', \chi)$, we arrive at (4.49). Since all factors in (4.49) are
known except the first, we arrive at the result

$$Z(S^3; L(R_i; R_j, R_k)) = S_{ij} S_{ik} / S_{0,i}. \tag{4.50}$$

Combining this with (4.48), we have

$$\sum_m S_i{}^m \, N_{mjk} = S_{ij} S_{ik} / S_{0,i}. \tag{4.51}$$

This is equivalent to Verlinde’s statement that “the matrix $S$ diagonalizes the
fusion rules”. In other words, in the basis $v_i$ indicated in Fig. 13, the structure
constants of the Verlinde algebra are by definition $v_i v_j = \sum_k N_{ij}{}^k v_k$, where
$N_{ij}{}^k = \sum_r g^{kr} N_{ijr}$. If we introduce a new basis $w_i = S_{0,i} \cdot \sum_m S_i{}^m v_m$, then the Verlinde
algebra reduces to $w_i w_j = \delta_{ij} w_j$. To verify this, we compute

$$w_i w_j = \sum_{k,l} S_i{}^k S_j{}^l \, v_k v_l \cdot S_{0,i} S_{0,j}. \tag{4.52}$$

Using $v_k v_l = \sum_m N_{kl}{}^m v_m$ and (4.51), this becomes

$$w_i w_j = \sum_{l,m} S_j{}^l \, S_{il} \, S_i{}^m v_m \cdot S_{0,j}. \tag{4.53}$$

Using the unitarity of $S$, in the form $\sum_l S_j{}^l S_{il} = \delta_{ij}$, we see that

$$w_i w_j = \delta_{ij} \sum_m S_i{}^m v_m \cdot S_{0,j} = \delta_{ij} \cdot w_j, \tag{4.54}$$

showing that the Verlinde algebra has been diagonalized and that the $w_i$ are
idempotents.
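
For $G = SU(2)$ both statements can be tested numerically. The following Python sketch checks the relation $\sum_m S_i{}^m N_{mjk} = S_{ij} S_{ik}/S_{0,i}$ and the idempotent property $w_i w_j = \delta_{ij} w_j$, using the matrix (4.39) together with the standard level-$k$ fusion rule of $SU(2)$, which is assumed here rather than derived. Since every $SU(2)$ representation is self-dual, upper and lower indices are not distinguished. All function names are hypothetical.

```python
import math


def s_matrix(k):
    """Modular S matrix of SU(2) at level k, following (4.39)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin((a + 1) * (b + 1) * math.pi / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def fusion_su2(a, b, c, k):
    """Assumed SU(2) level-k fusion coefficient N_abc (labels are twice the spin)."""
    in_range = abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if in_range and (a + b + c) % 2 == 0 else 0


def verlinde_error(k):
    """Largest violation of sum_m S_im N_mjl = S_ij S_il / S_0i."""
    S = s_matrix(k)
    labels = range(k + 1)
    return max(abs(sum(S[i][m] * fusion_su2(m, j, l, k) for m in labels)
                   - S[i][j] * S[i][l] / S[0][i])
               for i in labels for j in labels for l in labels)


def multiply(x, y, k):
    """Product in the Verlinde algebra, using v_a v_b = sum_c N_abc v_c."""
    labels = range(k + 1)
    return [sum(x[a] * y[b] * fusion_su2(a, b, c, k) for a in labels for b in labels)
            for c in labels]


def idempotent(i, k):
    """Coefficients of w_i = S_0i * sum_m S_im v_m in the basis v_m."""
    S = s_matrix(k)
    return [S[0][i] * S[i][m] for m in range(k + 1)]


def idempotent_error(k):
    """Largest violation of w_i w_j = delta_ij w_j."""
    labels = range(k + 1)
    w = [idempotent(i, k) for i in labels]
    worst = 0.0
    for i in labels:
        for j in labels:
            product = multiply(w[i], w[j], k)
            target = w[j] if i == j else [0.0] * (k + 1)
            worst = max(worst, max(abs(p - t) for p, t in zip(product, target)))
    return worst


print(verlinde_error(5), idempotent_error(5))
```

Both printed numbers should be at the level of floating point rounding, for any level $k$ passed to the two checking functions.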

Connected Sum of Links. At the beginning of Sect. 4, we have seen that the
quantum partition function has a multiplicative behavior under connected sum
of three manifolds. From (4.1), if $M = M_1 + M_2$, then
$Z(M) \cdot Z(S^3) = Z(M_1) \cdot Z(M_2)$. In the special case that $M_1$ and $M_2$ are copies of $S^3$ with links
in them, the connected sum of $M_1$ and $M_2$ is a copy of $S^3$ containing the
disconnected sum of the two links.

In knot theory there is also an operation of taking the connected sum of two
links. This operation has appeared in the above discussion. The link that we have
called $L(R_i; R_j, R_k)$ is the connected sum of the two links that we have called
$L(R_i; R_j)$ and $L(R_i; R_k)$. In fact, in Fig. 17c, $L(R_i; R_j, R_k)$ is “cut” into two pieces,
which are respectively $L(R_i; R_j)$ and $L(R_i; R_k)$ with in each case a connected
segment removed. This is the defining configuration for the “connected sum of
links”. Accordingly, the reasoning that led to (4.49) has the following more






general consequence. If $L_1$ and $L_2$ are two links, and $L_1 + L_2$ is their connected
sum, then

$$Z(S^3; L_1 + L_2) \cdot Z(S^3; C) = Z(S^3; L_1) \cdot Z(S^3; L_2). \tag{4.55}$$

Here it is understood that [as in (4.49)] representations have been assigned to the
connected components of $L_1$, $L_2$, and $L_1 + L_2$ in a compatible fashion; $C$ is an
unknot placed in whatever representation is carried by the strand “cut” in the
generalization of Fig. 17c. Equation (4.55) has a generalization in which $L_1$ and
$L_2$ are links in arbitrary three manifolds $M_1$ and $M_2$; then the connected sum of
links $L_1 + L_2$ is a link in the connected sum of manifolds $M = M_1 + M_2$, and
(4.55) is replaced by

$$Z(M_1 + M_2; L_1 + L_2) \cdot Z(S^3; C) = Z(M_1; L_1) \cdot Z(M_2; L_2). \tag{4.56}$$


4.6. The Knot Polynomials and the Braid Group 

The results in the last subsection are nice enough so that one may wonder if the
partition function for an arbitrary link on $S^3$ can be evaluated in this way. This
can indeed be done, and in a way that is closely related to the original route by
which the Jones polynomial was discovered, though we cannot expect such explicit
formulas as in the simple cases treated above. An arbitrary link $L$ on $S^3$, whose
partition function we will call $Z(S^3; L)$, can be arranged in the form of a braid, as
indicated in Fig. 18a. One can imagine “lifting” this braid $B$ out of $S^3$ and putting
it on $S^2 \times S^1$. To get back to $S^3$ one would have to do surgery on a circle running
parallel to the braid, as suggested in Fig. 18b. The general surgery formula (4.27)
then tells us

$$Z(S^3; L) = \sum_j S_0{}^j \, Z(S^2 \times S^1; R_j, B), \tag{4.57}$$

j 

where $Z(S^2 \times S^1; R_j, B)$ is the partition function on $S^2 \times S^1$ in the presence of
both the braid $B$ and a parallel Wilson line in the $R_j$ representation. We want to
rewrite this in the spirit of (4.32). Suppose that the braid $B$ contains $n$ strands
making up a collection of representations $\langle R \rangle$. Then $B$ can be regarded as defining
an element of the Artin braid group on $n$ letters. The braid group is closely related
to the mapping class group for $S^2$ with marked points. The reason for that is that if
in Fig. 18b we “cut” $S^2 \times S^1$ to get back to $S^2 \times I$ (this amounts to undoing what
was done in Fig. 14), then the braid can be unbraided. Thus, the complete
information about the braid is in the choice of a diffeomorphism of $S^2$
(constrained to preserve the marking of the points) by which the top and bottom of
Fig. 18c are to be identified. This, however, does not quite mean that the braid
group is the same as the mapping class group. In Fig. 18b, there are $n + 1$ strands,
one of which arose from the surgery and does not participate in the braid, while the
other $n$ strands make up the braid. The braid group on $n$ letters is the subgroup of
the mapping class group on $n + 1$ letters which fixes one of the (framed) strands.
There are a number of invariant traces on the braid group that can be naturally
defined with the data at our disposal. They are

$$\tau_i(B) = Z(S^2 \times S^1; R_i, B). \tag{4.58}$$








Fig. 18a-d. Any link on $S^3$ can be shaped into the form of a braid, as in a; putting the same braid
on $S^2 \times S^1$, and doing surgery on the circle $C$ indicated with the dotted line in part b, one gets back
to the original link on $S^3$. If one “cuts” $S^2 \times S^1$ to get to $S^2 \times I$, the braid can be “unbraided”; the
braid is recovered by prescribing a diffeomorphism of $S^2$, via which the top and bottom of part c
are to be identified. In d, we sketch the origin of the key property of the braid traces. In the
presence of an arbitrary Wilson line on the dotted contour (reflecting the results of surgery), two
braids $B_1$ and $B_2$ are joined end to end in $S^2 \times S^1$. There is no way to tell which comes first, so the
partition function is invariant under exchange of $B_1$ and $B_2$


This has the key property of a trace,

$$\tau_i(B_1 B_2) = \tau_i(B_2 B_1), \tag{4.59}$$

since the two sides of (4.59) have the same path integral representation, in which
the two braids are glued end to end in $S^2 \times S^1$ as in Fig. 18d. Not only does (4.58)
obey (4.59); it is actually equal to the trace of the operator $B$ in a certain
representation of the braid group, namely the representation furnished by the
physical Hilbert space for $S^2$ with the $n + 1$ charges, it being understood that
the braid group is acting on the first $n$ charges, and the $(n + 1)^{\mathrm{st}}$ is fixed in the $R_i$
representation. That $\tau_i(B)$ is the trace of $B$ in this Hilbert space is a statement just
along the lines of (4.32).

So we can rewrite (4.57) in the form

$$Z(S^3; L) = \sum_j S_0{}^j \, \tau_j(B). \tag{4.60}$$

j 

This shows that the link invariants on $S^3$ may be written as linear combinations of
braid traces. This is very close to how the knot polynomials were originally






discovered. It is clear from the work of Tsuchiya and Kanie [11] and Segal [16] 
along with what has been said above that the braid traces that arise from the 
Chern-Simons theory are precisely those that first appeared in the work of Jones. 


5. Applications To Physics 

Finally, I would like to comment on the likely implications of these results for 
physics. We have been exploring a three dimensional viewpoint about conformal 
field theory, at least for the important special case of current algebra on Riemann 
surfaces. Many aspects of rational conformal field theory have emerged as natural
consequences of general covariance in three dimensions. It seems likely that the 
marvelous hexagons and pentagons of [13], and the other consistency conditions 
of rational conformal field theories, can be synthesized by saying that such 
theories come from generally covariant theories in three dimensions. If so, general 
covariance in three dimensions may well emerge as one of the main unifying 
themes governing two dimensional conformal field theory. Such considerations 
have motivated a study of 2 + 1 dimensional gravity which will appear elsewhere 
[38]. 

The basic connection that we have so far stated between general covariance in 
2 + 1 dimensions and conformally invariant theories in 1 + 1 dimensions is that 
the physical Hilbert spaces obtained by quantization in 2 + 1 dimensions can be 
interpreted as the spaces of conformal blocks in 1 + 1 dimensions. This connection 
may seem rather abstract, and I will now make a few remarks aimed at making a 
more concrete connection. Starting from three dimensions we were led to the 
problem of quantizing the Lagrangian (3.1), which we repeat for convenience: 

(5.1)

This is to be quantized with constraints $\varepsilon^{ij} F_{ij} = 0$, these constraints being the
generators of the infinitesimal gauge transformations

(5.2)

So far we have quantized (5.1) on Riemann surfaces without boundary, but now
let us relax this requirement. Quantization of (5.2) in this more general case
amounts to studying the three dimensional Chern-Simons theory on a three
manifold with boundary, namely $\Sigma \times \mathbb{R}^1$, where $\Sigma$ is a Riemann surface with
boundary. For instance, let $\Sigma$ be a disc $D$. To quantize (5.1) on the disc, we must
impose the constraints, which generate the gauge transformations (5.2). As $D$ has
a boundary, we must choose boundary conditions on $A_i$ and $\varepsilon$ (for closed surfaces
this question did not arise). We will adopt free boundary conditions for $A_i$, but
require $\varepsilon = 0$ on the boundary of $D$. (A rationale for this choice is that the
Chern-Simons action is not invariant under gauge transformations that do not vanish on
the boundary.) With this condition, $\varepsilon$ generates in (5.2) the group of gauge
transformations which are the identity on the boundary of the disc. An element of
this group is an arbitrary continuous map $V: D \to G$ whose restriction to $S$ is the





identity. Now we impose the constraints. The first step is to require that $F_{ij} = 0$.
Since the disc is simply connected, this implies that $A_i = -\partial_i U \cdot U^{-1}$ for some
map $U: D \to G$. $U$ is uniquely defined up to

$$U \to U \cdot W, \tag{5.3}$$

for a constant element $W \in G$. Then we must identify two $U$'s that differ by an
element of the restricted gauge group $G_1$. This means that we must impose the
equivalence relation $U \simeq VU$ for any $V$ such that $V = 1$ on the boundary $S$. The
equivalence relation means that only the restriction of $U$ to $S$ is relevant. This
restriction defines an element of the loop group $LG$, but because of the freedom
(5.3), we should actually regard $U$ as an element of $LG/G$. Geometrically, we have
learned that the homogeneous space $LG/G$ can be regarded as the symplectic
quotient of the space of $G$ connections on the disc $D$, by the group $G_1$ of symplectic
diffeomorphisms. Now we wish to quantize the theory, which means doing
quantum mechanics on $LG/G$, which inherits a natural symplectic structure from
(5.1). Clearly, the group $LG$ of gauge transformations on the boundary of $\Sigma$ acts
on $LG/G$, so the quantum Hilbert space will be at least a projective representation
of the loop group. In fact, according to Segal [39], the quantization of $LG/G$, with
this symplectic structure, gives rise to the basic irreducible highest weight
representation of the loop group. This makes the connection between 2 + 1 dimensions
and 1 + 1 dimensions far more direct, since the irreducible representations
of the loop group are a basic ingredient in the 1 + 1 dimensional theory. It is
obvious at this point that by considering more complicated Riemann surfaces with
various boundary components, we can generate the whole 1 + 1 dimensional
conformal field theory, essentially by studying the generally covariant 2 + 1
dimensional theory on various three manifolds with boundary.


Acknowledgements. This work originated with the realization that some results about conformal 
field theory described by G. Segal could be given a three dimensional interpretation by 
considering a gauge theory with Chern-Simons action. I am grateful to Segal for explaining his 
results, and to M. Atiyah for interesting me in and educating me about the Jones polynomial. 
V.F.R. Jones and L. Kauffman, and other participants at the IAMP Congress, raised many
relevant questions. Finally, I must thank S. Deser and D.J. Gross for pointing out Polyakov’s
paper, G. Moore and N. Seiberg for explanations of their work, and the organizers of the IAMP
Congress for their hospitality.


References 


1. Atiyah, M.F.: New invariants of three and four dimensional manifolds. In: The mathemati- 
cal heritage of Hermann Weyl. Proc. Symp. Pure Math., vol. 48. Wells R. (ed.). Providence, 
RI: American Mathematical Society 1988, pp. 285-299 

2. Donaldson, S.: An application of gauge theory to the topology of four manifolds. J. Diff. 
Geom. 18, 269 (1983), Polynomial invariants for smooth four-manifolds. Oxford preprint 

3. Floer, A.: An instanton invariant for three manifolds. Courant Institute preprint (1987). 
Morse theory for fixed points of symplectic diffeomorphisms. Bull. AMS 16, 279 (1987) 

4. Witten, E.: Topological quantum field theory. Commun. Math. Phys. 117, 353 (1988) 





5. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1 (1983). A polynomial invariant for 
links via von Neumann algebras. Bull. AMS 12, 103 (1985), Hecke algebra representations of 
braid groups and link polynomials. Ann. Math. 126, 335 (1987) 

6. Freyd, P., Yetter, D., Hoste, J., Lickorish, W.B.R., Millett, K., Ocneanu, A.: A new
polynomial invariant of knots and links. Bull. AMS 12, 239 (1985)

7. Kauffman, L.: State models and the Jones polynomial. Topology 26, 395 (1987). Statistical
mechanics and the Jones polynomial, to appear in the Proceedings of the July, 1986,
conference on Artin’s braid group, Santa Cruz, California; An invariant of regular isotopy,
preprint

8. Turaev, V.G.: The Yang-Baxter equation and invariants of links. LOMI preprint E-3-87, 
Inv. Math. 92, 527 (1988) 

9. Przytycki, J.H., Traczyk, P.: Invariants of links of Conway type. Kobe J. Math. 4, 115 (1988)

10. Birman, J.: On the Jones polynomial of closed 3-braids. Invent. Math. 81, 287 (1985). 
Birman, J., Wenzl, H. Link polynomials and a new algebra, preprint 

11. Tsuchiya, A., Kanie, Y.: In: Conformal field theory and solvable lattice models. Adv. Stud.
Pure Math. 16, 297 (1988); Lett. Math. Phys. 13, 303 (1987)

12. Verlinde, E.: Fusion rules and modular transformations in 2d conformal field theory. Nucl. 
Phys. B300, 360 (1988) 

13. Moore, G., Seiberg, N.: Polynomial equations for rational conformal field theories. To 
appear in Phys. Lett. B, Naturality in conformal field theory. To appear in Nucl. Phys. B, 
Classical and quantum conformal field theory. IAS preprint HEP-88/35 

14. Schroer, B.: Nucl. Phys. 295, 4 (1988) K.-H. Rehren, Schroer, B.: Einstein causality and Artin 
braids. FU preprint (1987) 

15. Frohlich, J.: Statistics of fields, the Yang-Baxter equation, and the theory of links and knots.
1987 Cargese lectures, to appear in: Nonperturbative quantum field theory. New York:
Plenum Press

16. Segal, G.: Conformal field theory. Oxford preprint; and lecture at the IAMP Congress,
Swansea, July, 1988

17. Gromov, M.: Pseudo-holomorphic curves in symplectic manifolds. Invent. Math. 82, 307 
(1985) 

18. Schwarz, A.: The partition function of degenerate quadratic functional and Ray-Singer 
invariants. Lett. Math. Phys. 2, 247 (1978) 

19. Schonfeld, J.: A mass term for three dimensional gauge fields. Nucl. Phys. B 185, 157 (1981) 

20. Jackiw, R., Templeton, S.: How superrenormalizable theories cure their infrared divergences. 
Phys. Rev. D23, 2291 (1981) 

21. Deser, S., Jackiw, R., Templeton, S.: Three dimensional massive gauge theories. Phys. Rev. 
Lett. 48, 975 (1983). Topologically massive gauge theory. Ann. Phys. NY 140, 372 (1984) 

22. Zuckerman, G.: Action principles and global geometry. In: The proceedings of the 1986 San 
Diego Summer Workshop, Yau, S.-T. (ed.) 

23. Polyakov, A. M.: Fermi-bose transmutations induced by gauge fields. Mod. Phys. Lett. A3, 
325 (1988) 

24. Hagen, C.R.: Ann. Phys. 157, 342 (1984) 

25. Arovas, D., Schrieffer, R., Wilczek, F., Zee, A.: Statistical mechanics of anyons. Nucl. Phys. 
B251[FS13], 117 (1985) 

26. Witten, E.: Non-Abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455 
(1984) 

27. Ray, D., Singer, I.: Adv. Math. 7, 145 (1971); Ann. Math. 98, 154 (1973)

28. Deser, S., Jackiw, R., Templeton, S.: In [21]; Affleck, I., Harvey, J., Witten, E.: Nucl. Phys.
B 206, 413 (1982)

Redlich, A.N.: Gauge invariance and parity conservation of three-dimensional fermions. 
Phys. Rev. Lett. 52, 18 (1984) 

Alvarez-Gaume, L., Witten, E.: Gravitational anomalies. Nucl. Phys. B234, 269 (1983) 
Alvarez-Gaume, Della Pietra, S., Moore, G.: Anomalies and odd dimensions. Ann. Phys. 
(NY) 163, 288 (1985) 





Atiyah, M.F.: A note on the eta invariant (unpublished) 

Singer, I.M.: Families of Dirac operators with applications to Physics. Asterisque, 1985,
p. 323

29. Atiyah, M.F., Patodi, V., Singer, I.: Math. Proc. Camb. Phil. Soc. 77, 43 (1975), 78, 405
(1975), 79, 71 (1976)

30. Wilczek, F., Zee, A.: Linking numbers, spin, and statistics of solitons. Phys. Rev. Lett. 51, 
2250 (1983) 

31. Friedan, D., Shenker, S.: Nucl. Phys. B281, 509 (1987) 

32. Belavin, A., Polyakov, A.M., Zamolodchikov, A.: Nucl. Phys. B (1984) 

33. Atiyah, M.F., Bott, R.: The Yang-Mills equations over Riemann surfaces. Phil. Trans. R.
Soc. Lond. A 308, 523 (1982)

34. Quillen, D.: Determinants of Cauchy-Riemann operators over a Riemann surface. Funct. 
Anal. Appl. 19, 31 (1986) 

35. Drinfeld, V.: Quantum groups. In: The Proceedings of the International Congress of 
Mathematicians, Berkeley 1986, Vol. 1, pp. 798-820 

36. Gepner, D., Witten, E.: String theory on group manifolds. Nucl. Phys. B 278, 493 (1986) 

37. Kac, V.G.: Infinite dimensional Lie algebras. Cambridge: Cambridge University Press (1985)
Kac, V.G., Peterson, D.H.: Adv. Math. 53, 125 (1984)

Kac, V.G., Wakimoto, M.: Adv. Math. 70, 156 (1988) 

38. Witten, E.: 2+1 dimensional gravity as an exactly soluble system. IAS preprint HEP-88/32 

39. Segal, G.: Unitary representation of some infinite dimensional groups. Commun. Math. 
Phys. 80, 301 (1981) 

Communicated by A. Jaffe 

Received September 12, 1988; in revised form September 27, 1988 



10. Chiral Anomalies In Field Theories 


H. Banerjee * 

S. N. Bose National Centre for Basic Sciences, 
Salt Lake, Calcutta - 700 091, India


Abstract 

The role of the contribution from the fermion mass term in the axial vector Ward identity
in generating the U(1) axial anomaly, both local and global, is elucidated. Gauge invariance
requires the fermion to decouple from the gauge field if it is very heavy. This identifies the
Adler-Bell-Jackiw (ABJ) anomaly with the asymptotic limit of the sign-reversed mass term.
In an instanton background, the chiral limit (m = 0) of the mass term does not vanish but
consists of contributions from fermion zero modes. The space-time integral of these zero mode
contributions exactly cancels, thanks to the Atiyah-Singer index theorem, the integral of the
ABJ anomaly and suggests that the Jacobian for a global U(1) chiral transformation is trivial
even in an instanton background. This can be realised in the representation of the fermion
partition function in a Weyl basis. The resolution of the strong CP problem is thus achieved
in an axionless physical world.

In chiral gauge theories the fermion partition function admits of a gauge invariant repre- 
sentation but only at the cost of locality. Implementation of fermion averaging of the gauge 
current with the invariant partition function yields the current whose covariant derivative is 
the covariant anomaly. With the covariant current as input one can derive an integrable cur- 
rent whose covariant derivative is the minimal consistent anomaly obeying the Wess-Zumino 
consistency condition. The distinction between the two currents disappears if either the co- 
variant or the consistent anomaly vanishes. This is realised only if the fermion belongs to an 
anomaly-free representation of the gauge group. 


1 Introduction 

In classical field theories there is a correspondence between a global symmetry of the action and a 
conserved Noether current. Presence of short distance singularities which need to be regularised for 
mathematical consistency complicates matters in quantum field theory (QFT). It may so happen 
that a regularisation scheme with mandatory attributes, like gauge invariance in a gauge theory 
of fermions, and at the same time consistent with the global symmetry cannot be formulated or 
simply does not exist. Traces of violation of the global symmetry in the form of non-conservation 
of the Noether current may survive as the regulator is removed at the end of calculation. This is 
the genesis of anomalies and anomalous Ward identities in QFT. 

The topic of anomalies, in particular the axial anomaly, came to the centre stage of particle physics
research through the studies of neutral pion decay into two photons. The decay rate $1.2 \times 10^{16}$ per
sec. was explained satisfactorily by Steinberger^ in 1949 in terms of triangle diagrams (Fig.l) with 
proton circulating in the fermion loop. The linear divergence of the amplitude was regulated by 
the Pauli-Villars method. Problem arose sixteen years later^, when decay rates obtained within 
the framework of current algebra and partial conservation of axial vector current (PCAC) were 
invariably smaller than the data by three orders of magnitude. 

A popular working hypothesis, PCAC derives its dynamical basis in gauge theories of fermions
like quantum chromodynamics (QCD) from the ‘naive’ operator relation (or, equivalently, naive
Ward identity)

$$\partial_\mu\left(\bar q\gamma_5\gamma_\mu\tau_3 q\right) = 2m\left(\bar q\gamma_5\tau_3 q\right) \qquad \ldots (1.1)$$

*Email: banerjee@boson.bose.res.in 





which follows from field equations, with $q$ the quark doublet $(u, d)$ and $\tau_3$ the isospin matrix. One
recognises in the left hand side the divergence of the Noether current corresponding to the chiral symmetry
$q \to e^{i\alpha\gamma_5\tau_3}q$, $\bar q \to \bar q\,e^{i\alpha\gamma_5\tau_3}$, which should be conserved at the classical level in the chiral limit $m = 0$
of QCD. PCAC is just the statement that the mass term on the right hand side of (1.1) can be
replaced by the neutral pion field

$$\partial_\mu\left(\bar q\gamma_5\gamma_\mu\tau_3 q\right) = f_\pi m_\pi^2\,\pi^0 \qquad \ldots (1.2)$$

where $f_\pi$ is the pion decay constant, $m_\pi$ the pion mass, and $\pi^0$ the pion field. This is an unexceptionable
step since the mass term has the right quantum numbers of a neutral pion and, therefore,
can be regarded as the definition of the pion field in terms of quark constituents.

The problem with PCAC in $\pi^0 \to 2\gamma$ stemmed from the Sutherland-Veltman theorem, which
states that substitution of the divergence of the isospin axial current for the neutral pion field in
the matrix element yields a null result for the decay rate. Coupled with the positive result of
Steinberger, the unambiguous conclusion that emerges from the theorem is that the inadequacy
of the PCAC relation stems really from the naive relation (1.1), which is flawed if quarks participate
in electromagnetic interactions. The missing element was diagnosed as an anomaly, the Adler-Bell-
Jackiw (ABJ) anomaly, in the Noether current for chiral symmetry

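$$\partial_\mu\left(\bar q\gamma_5\gamma_\mu\tau_3 q\right) = 2m\left(\bar q\gamma_5\tau_3 q\right) - \left(\frac{N_c}{3}\right)\frac{e^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}F_{\mu\nu}F_{\lambda\rho} \qquad \ldots (1.3)$$

where $N_c$ is the colour degree of freedom of quarks and $F_{\mu\nu}$ the electromagnetic field tensor. The
ABJ anomaly, therefore, modifies the ‘naive’ PCAC relation (1.2) to

$$f_\pi m_\pi^2\,\pi^0 = \partial_\mu\left(\bar q\gamma_5\gamma_\mu\tau_3 q\right) + \left(\frac{N_c}{3}\right)\frac{e^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}F_{\mu\nu}F_{\lambda\rho}$$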

The decay rate now calculated by substituting the anomaly term for the pion field in the matrix
element for $\pi^0 \to 2\gamma$ is given by

$$\Gamma(\pi^0 \to 2\gamma) = \left(\frac{N_c}{3}\right)^2 \times 1.11 \times 10^{16}\ \mathrm{sec}^{-1} \qquad \ldots (1.4)$$

Depending on how one looks at the result (1.4), it may be regarded as either a signal success of
the diagnosis of the problem in $\pi^0 \to 2\gamma$ as due to the anomaly or, in the light of later developments,
a prediction of the number of colour degrees of freedom $N_c = 3$ in QCD.
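
As a rough numerical cross-check of the order of magnitude in (1.4), the sketch below evaluates the standard textbook anomaly expression $\Gamma = (N_c/3)^2\,\alpha^2 m_\pi^3/(64\pi^3 f_\pi^2)$. This closed form, the input values ($f_\pi \approx 92.4$ MeV, $m_\pi \approx 135$ MeV) and the function name are assumptions brought in for illustration; they are not quoted in the text.

```python
import math

ALPHA = 1.0 / 137.036          # fine structure constant
M_PI0_MEV = 134.977            # neutral pion mass (assumed input)
F_PI_MEV = 92.4                # pion decay constant, f_pi ~ 92 MeV convention (assumed input)
HBAR_MEV_S = 6.582119569e-22   # hbar in MeV s


def pi0_to_two_photon_rate(n_c: int = 3) -> float:
    """Decay rate in 1/sec from the anomaly term, scaled by (N_c/3)^2."""
    width_mev = (ALPHA**2 * M_PI0_MEV**3) / (64.0 * math.pi**3 * F_PI_MEV**2)
    return (n_c / 3.0) ** 2 * width_mev / HBAR_MEV_S


if __name__ == "__main__":
    for n_c in (1, 3):
        print(n_c, f"{pi0_to_two_photon_rate(n_c):.3e} per sec")
```

With these inputs the rate for $N_c = 3$ comes out at the $10^{16}$ per sec level, while a colourless quark model ($N_c = 1$) falls short by a factor of nine.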

Success in the $\pi^0 \to 2\gamma$ problem brought into the limelight the scenario of breaking symmetries at the
classical level through anomalies in quantum field theories. Gauge theories become inconsistent
if gauge symmetry is violated through anomaly. Cancellation of anomalies, therefore, constitutes 
an important constraint in building models for physical gauge theories with chiral coupling to 
fermions. Global chiral anomaly seems to play a key role in discussing physical effects associated 
with topologically nontrivial gauge field configurations. 



2 Axial Anomaly and Fermion Decoupling 

In a gauge theory of fermions there is a contradiction at the quantum level between chiral invariance
and gauge symmetry. The ABJ anomaly or, in the more general context of non-Abelian gauge theories
of fermions, the anomaly in the U(1) axial vector current, arises because gauge invariance is
to be preserved for consistency of the theory. The contradiction is transparent in the condition for
decoupling of the fermion from the background gauge field when it is very heavy. For the divergence
of the U(1) axial vector current the decoupling condition assumes the form of an anomalous
Ward identity

$$\left\langle\partial_\lambda\left(\bar\psi(x)\gamma_5\gamma_\lambda\psi(x)\right)\right\rangle = \left[2m\left\langle\bar\psi(x)\gamma_5\psi(x)\right\rangle\right] - \lim_{m\to\infty}\left[2m\left\langle\bar\psi(x)\gamma_5\psi(x)\right\rangle\right] \qquad \ldots (2.1)$$



Figure 1: Triangle diagrams (i) and (ii)


where $\langle\ \rangle$ denotes that the fermion degrees of freedom are integrated out. As we shall see below,
(2.1) follows directly from gauge invariance and is known as the Adler condition in QED. Eq. (2.1) would
still be compatible with chiral symmetry, and a conserved U(1) axial vector current would emerge
in the chiral limit $m = 0$, if the second term on the right hand side were to vanish. But this is not
to be. The asymptotics in field theory gives in the infinite mass limit the ABJ anomaly

$$\lim_{m\to\infty}\left[2m\left\langle\bar\psi(x)\gamma_5\psi(x)\right\rangle\right] = \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}(x)F_{\lambda\rho}(x) \qquad \ldots (2.2)$$

where $F_{\mu\nu} = F^a_{\mu\nu}t_a$ are the field tensors with $t_a$ the generators of the gauge group.

To motivate the decoupling condition (2.1) we consider in QED the amplitude for creating
two photons with momenta and polarisations $(p, \mu)$ and $(k, \nu)$ by the axial vector current

$$\langle p,\mu;\,k,\nu\,|\,J_{\lambda 5}(0)\,|\,0\rangle = \epsilon_\mu(p)\,\epsilon_\nu(k)\,M_{\lambda\mu\nu}(p,k,m)$$

The key to the analysis is the Rosenberg tensor decomposition (see Fig. 1)

$$M_{\lambda\mu\nu} = \epsilon_{\lambda\mu\nu\alpha}k_\alpha A(p,k,m) + \epsilon_{\lambda\nu\alpha\beta}\,p_\alpha k_\beta\left[p_\mu B(p,k,m) + k_\mu C(p,k,m)\right] + \left[(k,\nu)\leftrightarrow(p,\mu)\right] \qquad \ldots (2.3)$$

which follows from parity and Lorentz invariance.

Not all the form factors are independent. The form factor $A$, which gives the divergence of the
axial vector current

$$(p+k)_\lambda M_{\lambda\mu\nu} = \epsilon_{\mu\nu\alpha\beta}\,p_\alpha k_\beta\left[A(p,k,m) + A(k,p,m)\right] \qquad \ldots (2.4)$$

is determined through gauge invariance by the form factors $B$ and $C$

$$A(p,k,m) = p^2 B(p,k,m) + p\cdot k\,C(p,k,m) \qquad \ldots (2.5)$$

The form factors $B$ and $C$ are of dimensions $[\mathrm{mass}]^{-2}$ and, therefore, in perturbation theory
they are represented by highly convergent amplitudes which vanish as $m^{-2}$ for large fermion mass

$$\lim_{m\to\infty} B(p,k,m) = \lim_{m\to\infty} C(p,k,m) = 0$$





Thus gauge invariance (2.5) guarantees that the divergence of the amplitude for the axial vector
current given in (2.4) vanishes in the asymptotic $m \to \infty$ limit

$$\lim_{m\to\infty}(p+k)_\lambda M_{\lambda\mu\nu} = 0 \qquad \ldots (2.6)$$

In perturbation theory the amplitude $M_{\lambda\mu\nu}$ for the triangle diagram is linearly divergent. The
leading divergence, however, drops out due to symmetric integration of loop momentum leaving
a potential logarithmic divergence, which can appear only in the form factor $A$ in (2.3). Gauge
invariance (2.5) rules out even this residual logarithmic divergence.

The above observations suggest that the potential anomaly represented by the second term in
(2.1) must be finite and independent of regularisation scheme. To verify this we start by calculating

$$\left[2m\,\langle x|\,\mathrm{Tr}\,\gamma_5\left(i\not{D} + m\right)^{-1}|x\rangle\right], \qquad \ldots (2.7)$$

where, to conform to our discussions in the subsequent sections, we work in Euclidean metric
and write for the hermitian Dirac operator

... (2.8)

with $A_\mu = A^a_\mu t_a$ the gauge potential.

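Our strategy is to develop the Green function appearing in (2.7) in a perturbative series

$$\left(i\not{D} + m\right)^{-1} = \left(-i\not{D} + m\right)G$$

with

$$G = \frac{1}{\not{D}^2 + m^2} = G_0 - gG_0VG_0 + g^2G_0VG_0VG_0 + \ldots \qquad \ldots (2.9)$$

The ‘free’ part is $G_0 = (p^2 + m^2)^{-1}$ with $p_\mu = i\partial_\mu$ the ‘momentum’ operator. The ‘potential’ $gV$
has two pieces

$$gV = gV_0 + gV_1$$

with $V_1$ proportional to $(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)F_{\mu\nu}$. The first piece $gV_0$ is at most linear in $p$ and independent of $\gamma$-matrices.

The trace with $\gamma_5$ in (2.7) starts to be nonvanishing only from terms of order $g^2$ onwards
in the perturbative expansion (2.9) and one obtains (2.2)

$$\lim_{m\to\infty}\left[2m\left\langle\bar\psi(x)\gamma_5\psi(x)\right\rangle\right] = \lim_{m\to\infty}\left[2m^2\,\langle x|\,\mathrm{Tr}\left(\gamma_5 G\right)|x\rangle\right] = \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}(x)F_{\lambda\rho}(x)$$

Note that the final result is local. All nonlocalities as well as contributions from higher order terms
in the perturbative series (2.9) drop out in the infinite mass limit $m \to \infty$.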
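
The statement that the $\gamma_5$ trace first survives at second order can be checked numerically: a trace of $\gamma_5$ with fewer than four $\gamma$-matrices vanishes, and with four it is proportional to $\epsilon_{\mu\nu\lambda\rho}$, so two insertions of the $\gamma$-matrix dependent piece of the potential are needed. The sketch below is only an illustration; the particular hermitian Euclidean representation, the choice $\gamma_5 = \gamma_1\gamma_2\gamma_3\gamma_4$ and the helper names are assumptions, not taken from the text.

```python
import itertools
import numpy as np

# Pauli matrices
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# One hermitian Euclidean representation (assumed choice): {g_mu, g_nu} = 2 delta_mu_nu
gamma = [np.block([[Z2, -1j * sk], [1j * sk, Z2]]) for sk in s]
gamma.append(np.block([[Z2, I2], [I2, Z2]]))
gamma5 = gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]


def levi_civita(idx):
    """Sign of the permutation idx of (0, 1, 2, 3), or 0 if an index repeats."""
    if len(set(idx)) < 4:
        return 0
    sign, perm = 1, list(idx)
    for i in range(4):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def trace_with_gamma5(*indices):
    """Tr(gamma5 gamma_mu gamma_nu ...) for the given (0-based) indices."""
    m = gamma5.copy()
    for mu in indices:
        m = m @ gamma[mu]
    return np.trace(m)


if __name__ == "__main__":
    print(trace_with_gamma5(), trace_with_gamma5(0, 1), trace_with_gamma5(0, 1, 2, 3))
```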

In the decoupling condition (2.1) one can set the mass term on the right hand side to zero in
the chiral limit $m = 0$. The anomalous Ward identity thus obtained

$$\left\langle\partial_\lambda\left(\bar\psi\gamma_5\gamma_\lambda\psi\right)\right\rangle = -\frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho} \qquad \ldots (2.10)$$

shows that the U(1) axial vector current, i.e. the Noether current corresponding to the global chiral
symmetry

$$\psi \to e^{i\alpha\gamma_5}\psi, \qquad \bar\psi \to \bar\psi\,e^{i\alpha\gamma_5} \qquad \ldots (2.11)$$

of the massless Dirac operator (2.8), is not conserved. The divergence of the current is just the ABJ
anomaly which is responsible for the two photon decay of the neutral pion discussed in the preceding
section.

We note that in renormalisable theories in a perturbative framework the decoupling condition
(2.1), which is a special example of the decoupling theorem of Appelquist and Carazzone, is correct
to all orders of perturbation, just as the Adler-Bardeen theorem assures us that the anomalous
axial vector Ward identity (2.10) is not affected by radiative corrections in QED.





It should be remarked that setting the mass term to zero in (2.1), as was done in obtaining 
(2.10), may not always be legitimate in the chiral limit m = 0 if the gauge field is treated nonper- 
turbatively. The Euclidean Dirac operator (2.8) has zero modes if the background gauge field has 
a nontrivial topology. In this scenario the chiral limit of the mass term does not vanish and, as we 
shall see in the next section, consists precisely of the zero modes of the Dirac operator. 


3 Path Integral Approach to Anomaly 


In a seminal work Fujikawa interpreted the ABJ anomaly within the path integral framework
as arising from the nontrivial Jacobian of the fermion measure under chiral transformation. In
Euclidean metric, considered by Fujikawa, the partition function which generates fermion Green
functions in a background gauge field configuration may be written as

$$Z_f[A] = \int d\mu\,\exp\left[\int\bar\psi\left(i\not{D} + m\right)\psi\,d^4x\right] \qquad \ldots (3.1)$$


where $d\mu$ is the integration measure for fermions and $\not{D}$ is the Euclidean Dirac operator defined
in (2.8). Fermion Green functions, which are normalised expectation values of any product $O$ of
fermion fields, are obtained from the functional integral (3.1)

$$\langle O\rangle = \frac{1}{Z_f[A]}\int d\mu\;O\,\exp\left[\int\bar\psi\left(i\not{D} + m\right)\psi\,d^4x\right] \qquad \ldots (3.2)$$

With $\gamma$-matrices chosen hermitian, the Dirac operator (2.8) is also hermitian with real eigenvalues
$\lambda_n$ and orthonormal eigenfunctions $\phi_n(x)$

$$\not{D}\phi_n(x) = \lambda_n\phi_n(x), \qquad \int\phi_m^\dagger(x)\phi_n(x)\,d^4x = \delta_{mn} \qquad \ldots (3.3)$$

Each nonzero eigenvalue $\lambda_n$ has its chirally conjugate partner $-\lambda_n$ with eigenfunction $\phi_{-n}$

$$\not{D}\phi_{-n} = -\lambda_n\phi_{-n}, \qquad \phi_{-n} \equiv \gamma_5\phi_n \qquad \ldots (3.4)$$

In perturbative field theories one is interested in gauge field configurations with only trivial topology.
For such configurations the kernel space of the Euclidean Dirac operator (2.8) is of dimension
zero. This means that the set $\{\phi_n(x)\}$ with nonzero eigenvalues constitutes a complete basis in
function space. The Dirac field $\psi(x)$ can be expanded in this basis as

$$\psi(x) = \sum_n\left(a_n + a_{-n}\gamma_5\right)\phi_n(x) \qquad \ldots (3.5)$$

where $a_{\pm n}$ are complex-valued Grassmann generators. The four degrees of freedom corresponding
to each mode of the Dirac field are accounted for if we split $a_{\pm n}$ as

$$a_{\pm n} = \alpha_{\pm n} + i\beta_{\pm n}$$

with $\alpha$, $\beta$ real valued.

There are ambiguities on the issue whether in Euclidean metric $\bar\psi$ should be treated
as independent of $\psi$. For the present, we follow the popular ansatz and expand $\bar\psi(x)$ with an
independent set of Grassmann generators $\{b_{\pm n}\}$,

$$\bar\psi(x) = \sum_n\phi_n^\dagger(x)\left(b_n + b_{-n}\gamma_5\right) \qquad \ldots (3.6)$$

The representations for $\psi(x)$ and $\bar\psi(x)$ together with orthonormality of the eigenfunctions (3.3)
yield for the fermion action

$$S_f(m) = \int\bar\psi(x)\left(i\not{D} + m\right)\psi(x)\,d^4x = \sum_n\left[\left(i\lambda_n + m\right)b_na_n + \left(-i\lambda_n + m\right)b_{-n}a_{-n}\right] \qquad \ldots (3.7)$$





The integration measure for the fermion fields in the basis $\{\phi_n(x)\}$ is

$$d\mu = \prod_n db_n\,da_n\,db_{-n}\,da_{-n} \qquad \ldots (3.8)$$

The standard rules of integration of Grassmann generators now yield for the partition function
(3.1) the desired result

$$Z_f[A] = \prod_n\left(\lambda_n^2 + m^2\right) = \det\left(i\not{D} + m\right) \qquad \ldots (3.9)$$

This confirms the correctness of the choice of the measure (3.8).
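
The pairing (3.4) and the product formula (3.9) can be illustrated on a finite matrix. The sketch below replaces the Dirac operator by a random hermitian matrix that anticommutes with a diagonal $\gamma_5$; this toy operator, its size and the function names are assumptions made purely for illustration and carry no gauge field content.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # size of each chirality block (toy choice)
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Hermitian toy "Dirac operator" that anticommutes with gamma5 = diag(1, -1)
D = np.block([[np.zeros((n, n)), M], [M.conj().T, np.zeros((n, n))]])
gamma5 = np.diag([1.0] * n + [-1.0] * n)

eigvals, eigvecs = np.linalg.eigh(D)


def paired_spectrum_ok(tol=1e-10):
    """Check that gamma5 @ phi is an eigenvector with eigenvalue -lambda, as in (3.4)."""
    for lam, phi in zip(eigvals, eigvecs.T):
        partner = gamma5 @ phi
        if np.linalg.norm(D @ partner + lam * partner) > tol:
            return False
    return True


def determinant_both_ways(m):
    """Product over eigenvalue pairs, as in (3.9), versus a direct determinant."""
    positive = eigvals[eigvals > 0]
    product = np.prod(positive**2 + m**2)
    direct = np.linalg.det(1j * D + m * np.eye(2 * n))
    return product, direct


if __name__ == "__main__":
    print(paired_spectrum_ok(), determinant_both_ways(0.7))
```

Each pair $\pm\lambda_n$ contributes $(i\lambda_n + m)(-i\lambda_n + m) = \lambda_n^2 + m^2$, which is why the direct determinant is real and positive in this toy setting.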

Ward identities, whether normal or anomalous, are obtained in the path integral framework from
the requirement that the partition function is invariant under infinitesimal symmetry transformations
of the variables of integration. To derive the Ward identity corresponding to chiral symmetry
one implements a ‘local’ chiral transformation of the variables of integration $\psi(x)$, $\bar\psi(x)$ in the
partition function (3.1)

$$\psi(x) \to \psi'(x) = \left(1 + i\alpha(x)\gamma_5\right)\psi(x), \qquad \bar\psi(x) \to \bar\psi'(x) = \bar\psi(x)\left(1 + i\alpha(x)\gamma_5\right) \qquad \ldots (3.10)$$

The fermion measure (3.8) changes and the new measure corresponding to the transformed variables
of integration is given by $d\mu'$

$$d\mu' = \prod_n db'_n\,da'_n\,db'_{-n}\,da'_{-n} = d\mu\,J[\alpha], \qquad \ldots (3.11)$$

where $a'_n$ ($b'_n$) are the Grassmann generators in the expansion of $\psi'(x)$ ($\bar\psi'(x)$) in the
basis $\{\phi_n(x)\}$. The Jacobian $J[\alpha]$ can be calculated following standard procedure


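$$J[\alpha] = \exp\left[-2i\int d^4x\,\alpha(x)\sum_n\left(\phi_n^\dagger(x)\gamma_5\phi_n(x) + \phi_{-n}^\dagger(x)\gamma_5\phi_{-n}(x)\right)\right] \qquad \ldots (3.12)$$

The fermion action also changes and the new action is given by

$$S_f(m) \to S'_f(m) = \int d^4x\left[\bar\psi\left(i\not{D} + m\right)\psi + i\alpha(x)\left\{-\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right) + 2m\,\bar\psi\gamma_5\psi\right\}\right] \qquad \ldots (3.13)$$

Invariance of the partition function (3.1) under the infinitesimal local chiral transformation (3.10)
now gives the anomalous axial Ward identity

$$\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle = 2m\left\langle\bar\psi\gamma_5\psi\right\rangle - 2\sum_n\left(\phi_n^\dagger\gamma_5\phi_n + \phi_{-n}^\dagger\gamma_5\phi_{-n}\right) \qquad \ldots (3.14)$$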


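It is easy to recognise (3.14) as the decoupling condition (2.1) of the preceding section. Indeed

$$\lim_{m\to\infty}\left[2m\left\langle\bar\psi\gamma_5\psi\right\rangle\right] = \lim_{m\to\infty}2m\sum_n\left(\frac{1}{m + i\lambda_n}\,\phi_n^\dagger\gamma_5\phi_n + \frac{1}{m - i\lambda_n}\,\phi_{-n}^\dagger\gamma_5\phi_{-n}\right) = 2\sum_n\left(\phi_n^\dagger\gamma_5\phi_n + \phi_{-n}^\dagger\gamma_5\phi_{-n}\right) \qquad \ldots (3.15)$$

where the infinite sum on the right hand side of (3.14) is to be cut off gauge invariantly, $|\lambda_n| < M$
for $M$ large. Fujikawa used the gauge invariant cut off $\exp(-\lambda_n^2/M^2)$ with large $M$ to evaluate
the infinite sum

$$2\sum_n\left(\phi_n^\dagger(x)\gamma_5\phi_n(x) + \phi_{-n}^\dagger(x)\gamma_5\phi_{-n}(x)\right) = \lim_{M\to\infty}2\,\langle x|\,\mathrm{Tr}\left(\gamma_5\,e^{-\not{D}^2/M^2}\right)|x\rangle = \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}(x)F_{\lambda\rho}(x) \qquad \ldots (3.16)$$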


It is clear that the left hand side of (3.16) should be augmented by zero modes if the Dirac operator
admits of them. Zero modes always appear with definite chiralities, $\epsilon_i = \pm 1$

$$\not{D}\phi_{0i} = 0, \qquad \gamma_5\phi_{0i} = \epsilon_i\phi_{0i} \qquad \ldots (3.17)$$





This is because, in its kernel space, the Dirac operator commutes with $\gamma_5$. In the presence of zero
modes the left hand side of (3.16) needs to be augmented by their contributions, i.e.,

$$\frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}(x)F_{\lambda\rho}(x) = 2\sum_n\left(\phi_n^\dagger\gamma_5\phi_n + \phi_{-n}^\dagger\gamma_5\phi_{-n}\right) + 2\sum_i\epsilon_i\,\phi_{0i}^\dagger\phi_{0i} \qquad \ldots (3.18)$$

The space-time integral of (3.18) gives the Atiyah-Singer index theorem

$$\nu \equiv \frac{g^2}{32\pi^2}\int d^4x\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}(x)F_{\lambda\rho}(x) = n_+ - n_- \qquad \ldots (3.19)$$

where $\nu$ is the winding number (Pontryagin index) of the gauge field and $n_+$ ($n_-$) is the number of
positive (negative) chirality zero modes. Eigenmodes corresponding to nonzero eigenvalues do not
contribute to the space-time integral (3.19) because $\phi_n$ is orthogonal to $\gamma_5\phi_n$. Note that a nontrivial
winding number, $\nu \neq 0$, is realised through instanton-like configurations of the gauge field.
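
The mechanism behind (3.16)-(3.19), that nonzero modes drop out of the regulated $\gamma_5$ trace and only zero modes count, can be seen in a finite-dimensional toy model. In the sketch below a rectangular off-diagonal block forces zero modes of one chirality; the block sizes, the random matrix and the function names are hypothetical and stand in for the Dirac operator in an instanton-like background.

```python
import numpy as np

rng = np.random.default_rng(1)
n_plus_dim, n_minus_dim = 5, 3          # toy block sizes (hypothetical)
M = rng.normal(size=(n_plus_dim, n_minus_dim))

# Hermitian, anticommutes with gamma5; the rectangular block forces zero modes
D = np.block([[np.zeros((n_plus_dim, n_plus_dim)), M],
              [M.T, np.zeros((n_minus_dim, n_minus_dim))]])
gamma5 = np.diag([1.0] * n_plus_dim + [-1.0] * n_minus_dim)

eigvals, eigvecs = np.linalg.eigh(D)
# phi_n^dagger gamma5 phi_n for every eigenmode
chirality = np.einsum("in,ij,jn->n", eigvecs.conj(), gamma5, eigvecs).real


def regulated_trace(cutoff):
    """Sum over all modes of exp(-lambda_n^2 / cutoff^2) * phi_n^dagger gamma5 phi_n."""
    return float(np.sum(np.exp(-(eigvals / cutoff) ** 2) * chirality))


def index_from_zero_modes(tol=1e-10):
    """n_plus - n_minus counted directly on the kernel of D."""
    zero = np.abs(eigvals) < tol
    return int(round(np.sum(chirality[zero])))


if __name__ == "__main__":
    print(index_from_zero_modes(), [regulated_trace(c) for c in (0.5, 2.0, 50.0)])
```

The regulated trace is independent of the cut off because each nonzero mode has vanishing $\gamma_5$ expectation value, which is the finite-dimensional counterpart of the orthogonality of $\phi_n$ and $\gamma_5\phi_n$.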

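The presence of zero modes has a profound impact on the chiral limit of the fermion mass term
on the right hand side of the anomalous axial Ward identity

$$\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle = 2m\left\langle\bar\psi\gamma_5\psi\right\rangle - \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho} \qquad \ldots (3.20)$$

The zero modes can be isolated from the mass term

$$2m\left\langle\bar\psi\gamma_5\psi\right\rangle = 2m\left\langle\bar\psi\gamma_5\psi\right\rangle' + 2\sum_i\epsilon_i\,\phi_{0i}^\dagger\phi_{0i} \qquad \ldots (3.21)$$

where the first term on the right hand side is bereft of the zero modes and vanishes in the chiral
limit

$$2m\left\langle\bar\psi\gamma_5\psi\right\rangle' = 4m^2\sum_{\lambda_n > 0}\frac{\phi_n^\dagger\gamma_5\phi_n}{\lambda_n^2 + m^2} \qquad \ldots (3.22)$$

The mass term, therefore, has now a nontrivial chiral limit consisting precisely of the zero modes,
and the chiral limit of the axial Ward identity is not exactly what was obtained in the perturbative
framework of the preceding section

$$\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle_{m=0} = 2\sum_i\epsilon_i\,\phi_{0i}^\dagger\phi_{0i} - \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho} \qquad \ldots (3.23)$$

The zero modes in the extra piece appearing on the right hand side arise from instanton-like
configurations of the gauge field and, therefore, could not have been accessed in a perturbative
framework.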

It is of great interest to note that the zero mode terms which appear explicitly on the right
hand side of (3.23) are exactly cancelled by similar terms contained now (see (3.18)) in the ABJ
anomaly. Thus, irrespective of whether or not the gauge field configuration gives rise to zero
modes, the chiral limit of the local axial anomaly comprises only nonzero eigenmodes of the
Dirac operator

$$\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle_{m=0} = -2\sum_{|\lambda_n| \neq 0}\left(\phi_n^\dagger\gamma_5\phi_n + \phi_{-n}^\dagger\gamma_5\phi_{-n}\right) \qquad \ldots (3.24)$$

We, therefore, conclude that the space-time integral of the chiral limit of the divergence of the
axial vector current always vanishes. This follows from the orthogonality of $\phi_n$ and $\gamma_5\phi_n$ if one
uses (3.24) or from the Atiyah-Singer index theorem if instead one uses (3.23)

$$\int d^4x\,\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle_{m=0} = 0 \qquad \ldots (3.25)$$

This, as we shall see later, has a profound impact on issues of physics related to global chiral
anomaly.





4 Chiral Gauge Theories and the Covariant and Consistent 
Anomalies 


The ABJ anomaly in the U(l) axial vector Ward identity constitutes an unambiguous evidence of 
a fundamental incompatibility of chiral invariance and gauge symmetry in regularisation scheme 
in perturbative framework of quantum field theory. In a vector-like gauge theory, such as QCD, 
chiral invariance is an expendable attribute and the ABJ anomaly results from strict adherence 
to gauge symmetry. In chiral gauge theories where gauge fields are coupled chirally to fermions in 
the Dirac operator 


...(4.1) 


loss of chiral invariance jeopardises gauge symmetry and hence the consistency of the theory. The
fermion action

$$S_f = \int\bar\psi\,\mathcal{D}\,\psi\,d^4x \qquad \ldots (4.2)$$

is invariant under the local chiral gauge transformations

$$\psi(x) \to e^{i\alpha(x)\frac{(1-\gamma_5)}{2}}\psi(x), \qquad \bar\psi(x) \to \bar\psi(x)\,e^{-i\alpha(x)\frac{(1+\gamma_5)}{2}}, \qquad A_\mu(x) \to e^{i\alpha(x)}\left(A_\mu(x) + i\partial_\mu\right)e^{-i\alpha(x)} \qquad \ldots (4.3)$$

with $\alpha(x) = t_a\alpha_a(x)$ the gauge function. Dimensional regularisation, popular in perturbative
gauge theories, has a serious problem with $\gamma_5 = \frac{1}{4!}\epsilon_{\mu\nu\lambda\rho}\gamma_\mu\gamma_\nu\gamma_\lambda\gamma_\rho$. The totally antisymmetric tensor
$\epsilon_{\mu\nu\lambda\rho}$ of rank four does not admit of a suitable generalisation to arbitrary space-time dimensions.
Thus, one is yet to find a consistent and systematic scheme for regulating divergences in chiral
gauge theories in weak coupling perturbation in the continuum.

On the lattice, the finite spacing $a$ between lattice sites provides a built-in regularisation of all
short distance singularities in field theories. Here too, the prospects for a consistent formulation of
chiral gauge theory are not really bright. The major problem on the lattice is the species doublers of
the fermion and their removal. The doublers appear as unwanted zeros of the Fourier transform of the
‘free’ Dirac operator on the lattice, over and above the zero at the origin of momentum space which
corresponds to the physical fermion. In the ‘naive’ Dirac operator $\gamma_\mu\sin(p_\mu a)/a$ the doublers are
located at the edges of the Brillouin zone $-(\pi/a) \le p_\mu \le (\pi/a)$. The doublers are not specific
to the naive Dirac operator. According to the celebrated theorem of Nielsen and Ninomiya
these are generic and can be avoided only at a price, by breaking explicitly locality and/or chiral
symmetry in the Dirac operator. The most popular model for lattice fermions, the Wilson model,
removes the doublers by giving them masses of the order of the lattice cut-off $O(1/a)$

$$D_W(p) = \gamma_\mu\sin(p_\mu a)/a + ir\left(1 - \cos(p_\mu a)\right)/a \qquad \ldots (4.3)$$
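
The counting of doublers and the effect of the Wilson term can be made concrete by scanning the corners of the Brillouin zone. The sketch below is a minimal illustration in four dimensions; the function names, the default values $a = 1$, $r = 1$ and the restriction to zone corners are assumptions for the example.

```python
import itertools
import math


def naive_zeros(a=1.0, tol=1e-12):
    """Corners of the 4D Brillouin zone where every sin(p_mu a)/a vanishes."""
    corners = itertools.product((0.0, math.pi / a), repeat=4)
    return [p for p in corners if all(abs(math.sin(pm * a)) / a < tol for pm in p)]


def wilson_mass(p, a=1.0, r=1.0):
    """Momentum-dependent mass from the Wilson term, summed over the four directions."""
    return sum(r * (1.0 - math.cos(pm * a)) / a for pm in p)


if __name__ == "__main__":
    for p in naive_zeros():
        print([round(pm, 3) for pm in p], wilson_mass(p))
```

Of the sixteen zeros of the naive operator only the one at the origin keeps a vanishing Wilson mass; every other corner acquires a mass that is a multiple of $2r/a$ and so decouples as $a \to 0$.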


Gauge invariance is implemented simply through link variables as in all lattice models. But the
explicit breaking of chiral symmetry for nonzero $r$ makes the model patently inappropriate for
chiral gauge theories. The current spurt of interest in the subject stems mainly from the realisation
that for lattice Dirac operators $D$ obeying the Ginsparg-Wilson relation

$$\gamma_5 D + D\gamma_5 = aD\gamma_5 D, \qquad \ldots (4.4)$$

chiral symmetry is restored and species doublers are removed in the continuum limit. The issue
of nonlocality implied in the Ginsparg-Wilson relation, particularly in the context of chiral gauge
theories, is yet to be resolved.
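
A concrete solution of the Ginsparg-Wilson relation helps to see what (4.4) demands. The sketch below checks the relation for an operator of the overlap type, $D = (1 + \gamma_5\,\mathrm{sign}(H))/a$; this specific form is an assumption brought in for illustration and is not discussed in the text, and the hermitian kernel $H$ is a random matrix with no physical content.

```python
import numpy as np

rng = np.random.default_rng(2)
n, a = 4, 0.5                              # toy block size and lattice spacing
gamma5 = np.diag([1.0] * n + [-1.0] * n).astype(complex)

# Random hermitian stand-in for the kernel H (no physics intended)
X = rng.normal(size=(2 * n, 2 * n)) + 1j * rng.normal(size=(2 * n, 2 * n))
H = (X + X.conj().T) / 2

w, v = np.linalg.eigh(H)
sign_H = (v * np.sign(w)) @ v.conj().T     # matrix sign function, sign_H @ sign_H = 1

D = (np.eye(2 * n) + gamma5 @ sign_H) / a  # overlap-type operator (assumed form)


def ginsparg_wilson_residual():
    """Norm of gamma5 D + D gamma5 - a D gamma5 D, which should vanish."""
    lhs = gamma5 @ D + D @ gamma5
    rhs = a * D @ gamma5 @ D
    return np.linalg.norm(lhs - rhs)


if __name__ == "__main__":
    print(ginsparg_wilson_residual())
```

The anticommutator of $\gamma_5$ with this $D$ is not zero; it is only required to equal the term on the right of (4.4), which is of order $a$ and disappears in the continuum limit.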

Covariant Anomaly: Apart from the absence of a consistent and systematic regularisation
scheme, chiral gauge theories are, in general, afflicted with anomalies in the gauge current. The
Dirac operator (4.1) in chiral gauge theory is non-hermitian. A fallout of this is that Fujikawa's
recipe for constructing a gauge invariant partition function, which assumes a hermitian Dirac
operator, needs to be modified. The Dirac operator $\mathcal{D}$ in (4.1) maps $\psi$ into the space of spinors in
the domain of $\mathcal{D}^\dagger$. The eigenvalue equations (3.3) are, therefore, replaced by

$$\mathcal{D}\phi_n = \lambda_n\chi_n, \qquad \mathcal{D}^\dagger\chi_n = \lambda_n\phi_n, \qquad \ldots (4.5)$$

where $\lambda_n^2$ are real, nonnegative and constitute the eigenvalue spectrum of $\mathcal{D}\mathcal{D}^\dagger$ and $\mathcal{D}^\dagger\mathcal{D}$. The
sets of eigenfunctions $\{\phi_n\}$ and $\{\chi_n\}$ of $\mathcal{D}^\dagger\mathcal{D}$ and $\mathcal{D}\mathcal{D}^\dagger$ respectively constitute orthonormal
bases for expanding $\psi$ and $\bar\psi$

$$\psi = \sum_n a_n\phi_n, \qquad \bar\psi = \sum_n b_n\chi_n^\dagger \qquad \ldots (4.6)$$

in terms of the Grassmann generators $a_n$, $b_n$. The fermion measure defined as

$$d\mu[A] = \prod_n db_n\,da_n \qquad \ldots (4.7)$$

is a gauge invariant functional of $A_\mu$ and yields the partition function

$$Z_{inv}[A] = \int d\mu[A]\,\exp\left[\int\bar\psi\,\mathcal{D}\,\psi\,d^4x\right] = \left(\det\mathcal{D}^\dagger\mathcal{D}\right)^{1/2} = \left(\det\mathcal{D}\mathcal{D}^\dagger\right)^{1/2} \qquad \ldots (4.8)$$

Both $\mathcal{D}^\dagger\mathcal{D}$ and $\mathcal{D}\mathcal{D}^\dagger$ change by a similarity transformation under gauge transformation. The
representation (4.8) is thus formally gauge invariant.

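The chiral gauge current

$$J^a_\mu(x) = \bar\psi(x)\,t_a\gamma_\mu\tfrac{1}{2}(1 - \gamma_5)\,\psi(x) \qquad \ldots (4.9)$$

transforms covariantly under the gauge transformation (4.3). Fermion averaging of the current with
the gauge invariant measure (4.7) yields

$$\left\langle J^a_\mu(x)\right\rangle = \frac{\int d\mu[A]\left(\bar\psi\,t_a\gamma_\mu\frac{1}{2}(1 - \gamma_5)\psi\right)\exp\left[\int\bar\psi\,\mathcal{D}\,\psi\,d^4x\right]}{\int d\mu[A]\,\exp\left[\int\bar\psi\,\mathcal{D}\,\psi\,d^4x\right]} = \sum_n\frac{1}{\lambda_n}\,\chi_n^\dagger(x)\,t_a\gamma_\mu\tfrac{1}{2}(1 - \gamma_5)\,\phi_n(x) \qquad \ldots (4.10)$$

Gauge invariant regularisation can be implemented by suppressing large eigenvalues and the current
thus obtained transforms covariantly and is called the covariant current.

Formal application of field equations suggests that the gauge current should be covariantly
conserved. This, however, may not be true for the fermion averaged current $J^a_\mu(x)$ if it is anomalous.

$$G^a(x) = \partial_\mu J^a_\mu(x) - f^{abc}A^b_\mu(x)J^c_\mu(x) = \sum_n\left\{\chi_n^\dagger\,t_a\tfrac{1}{2}(1 + \gamma_5)\chi_n - \phi_n^\dagger\,t_a\tfrac{1}{2}(1 - \gamma_5)\phi_n\right\} \qquad \ldots (4.11)$$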


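Following Fujikawa's recipe for gauge invariant regularisation one obtains the covariant anomaly

$$G^a(x) = \lim_{M\to\infty}\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\,t_a\left[\tfrac{1}{2}(1 + \gamma_5)\,e^{ik\cdot x}e^{-\mathcal{D}\mathcal{D}^\dagger/M^2}e^{-ik\cdot x} - \tfrac{1}{2}(1 - \gamma_5)\,e^{ik\cdot x}e^{-\mathcal{D}^\dagger\mathcal{D}/M^2}e^{-ik\cdot x}\right] = \frac{1}{32\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,t_aF_{\mu\nu}F_{\lambda\rho} \qquad \ldots (4.12)$$

where $F_{\mu\nu} = t_aF^a_{\mu\nu}$ are the field tensors.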

Consistent Anomaly: In perturbative treatment of chiral gauge theories the fermion
measure in the partition function is independent of the gauge field. A fallout of this is that, unlike
$Z_{inv}[A]$ in (4.8), the perturbative partition function

$$Z_{pert}[A] = \int d\mu\,\exp\left[\int\bar\psi\,\mathcal{D}\,\psi\,d^4x\right], \qquad \ldots (4.13)$$

and hence the effective action $W[A]$ need not be gauge invariant. The gauge current with fermion
averaging implemented through this perturbative partition function

$$\tilde J^a_\mu(x) = \frac{\delta W[A]}{\delta A^a_\mu(x)} \qquad \ldots (4.14)$$

will, in general, not transform covariantly. However, it must obey the integrability condition

$$\frac{\delta\tilde J^a_\mu(x)}{\delta A^b_\nu(x')} = \frac{\delta\tilde J^b_\nu(x')}{\delta A^a_\mu(x)} \qquad \ldots (4.15)$$

since it is defined in (4.14) through the functional derivative of the effective action $W[A]$. The
current $\tilde J^a_\mu(x)$ is called the consistent current and its covariant derivative

$$\tilde G^a(x) = \partial_\mu\tilde J^a_\mu(x) - f^{abc}A^b_\mu(x)\tilde J^c_\mu(x) \qquad \ldots (4.16)$$

is the consistent anomaly.

Gauge transformation properties of an arbitrary functional of gauge fields are best discussed
with the help of the generators

$$L^a(x) = \partial_\mu\frac{\delta}{\delta A^a_\mu(x)} - f^{abc}A^b_\mu(x)\frac{\delta}{\delta A^c_\mu(x)} \qquad \ldots (4.17)$$

Thus the consistent anomaly $\tilde G^a(x)$, representing as it does the gauge variation of the effective
action $W[A]$, is given by

$$\tilde G^a(x) = L^a(x)W[A] \qquad \ldots (4.18)$$

The algebra of the generators

$$\left[L^a(x), L^b(x')\right] = f^{abc}\,\delta^4(x - x')\,L^c(x) \qquad \ldots (4.19)$$

shows that the consistent anomaly must obey the Wess-Zumino consistency condition

$$L^a(x)\tilde G^b(x') - L^b(x')\tilde G^a(x) = f^{abc}\,\delta^4(x - x')\,\tilde G^c(x) \qquad \ldots (4.20)$$

On the other hand, the anomaly $\tilde G^a(x)$ is a measure of the non-covariance of the consistent current

$$L^b(x')\tilde J^a_\mu(x) = -f^{abc}\,\delta^4(x - x')\,\tilde J^c_\mu(x) + \frac{\delta\tilde G^b(x')}{\delta A^a_\mu(x)} \qquad \ldots (4.21)$$

As for the covariant anomaly (4.12), one finds, as expected, an incompatibility with the Wess-Zumino
consistency condition

$$L^a(x)G^b(x') - L^b(x')G^a(x) = 2f^{abc}\,\delta^4(x - x')\,G^c(x), \qquad \ldots (4.22)$$


where the factor 2 on the right hand side spoils consistency. Thus, the anomaly itself is a measure
of the ‘inconsistency’. The origin of the ‘inconsistency’ may be traced to the fermion measure
$d\mu[A]$ given by (4.7) for averaging of the gauge current in the definition (4.10) of the covariant
current $J^a_\mu(x)$. A nontrivial covariant anomaly $G^a(x)$ corresponds to a nontrivial dependence of the
measure $d\mu[A]$ on the gauge field. This is suggested also from the observation that the definition

$$\mathcal{J}^a_\mu(x) = \frac{\delta}{\delta A^a_\mu(x)}\ln Z_{inv}[A] \qquad \ldots (4.23)$$

where $Z_{inv}[A]$ is the gauge invariant partition function (4.8), has all the attributes: it is covariant,
consistent and anomaly free. The price that one pays for this ‘perfect’ current is a high degree of
nonlinearity.





It can be shown^® that the consistent current coincides with the covariant current if the func- 
tional curl of the latter vanishes 


dg j d‘^x'At{x') 


SA^^ix) 


I&IX 

6Al{x') J 


. . . (4.24) 


where (a:) is the covariant current corresponding to the Dirac operator -f- (1 - 75)) 

with coupling constant g. One can obtain from (4.24) an explicit representation of the consistent 
anomaly using the expression (4.12) for the covariant anomaly^® 


1^' dgg{l - g)tr {[t,, A^] + A^F^)) . . . (4.25) 

The above analysis shows that the distinction between covariant and consistent currents disappears
if and only if the anomaly in either current vanishes. The fundamental requirement that the
chiral gauge theory is free of either anomaly imposes the unique constraint on the group generators
of the chiral fermions

$$\mathrm{tr}\left(t_a\{t_b, t_c\}\right) = 0 \qquad \ldots (4.26)$$

which is symmetric in all the indices. An interesting application in the Standard Model is to take
$t_a = Q$, the matrix of electric charge, and $t_b$, $t_c$ the isospin matrices. The constraint $\mathrm{tr}\,Q = 0$ is
obeyed in the Standard Model since each generation of quark doublet of three colours is paired
with a lepton doublet.
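
Because the anticommutator of two isospin matrices is proportional to the unit matrix, the constraint reduces to a sum of electric charges over the doublets, which can be tallied exactly. The sketch below does this for one generation; the variable and function names are illustrative only.

```python
from fractions import Fraction

N_COLOURS = 3

# Electric charges of one generation of doublets
quark_doublet = [Fraction(2, 3), Fraction(-1, 3)]   # (u, d)
lepton_doublet = [Fraction(0), Fraction(-1)]        # (nu_e, e)


def trace_of_charge(n_colours: int = N_COLOURS) -> Fraction:
    """tr Q over one generation: each quark is counted once per colour."""
    return n_colours * sum(quark_doublet) + sum(lepton_doublet)


if __name__ == "__main__":
    for n_c in (2, 3, 4):
        print(n_c, trace_of_charge(n_c))
```

The sum vanishes only for three colours, so the cancellation ties the colour count to the pairing of quark and lepton doublets.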


5 Global Chiral Anomaly and the Strong CP Problem. 

Global U(1) axial anomaly is the sine qua non for the strong CP problem. The problem consists
in the gross disagreement in the experimental data for the CP violating electric dipole moment of 
neutron (EDMN) which are consistent with a null result and theoretical estimates that invariably 
give a large value. Strong CP problem provides the unique arena where the concept of a global 
chiral anomaly is confronted with direct experimental data. 

The two possible sources for CP violation in QCD action 

Sqcd == Sg + J q{iiP)qd^x J 4- Oqcd^S . . . (5.1) 

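are the chiral phase $\alpha_{ew}$ in the quark mass, which arises from the electroweak sector of the Standard
Model, and the QCD vacuum term with parameter $\theta_{QCD}$

$$\theta_{QCD}\,\Delta S = \theta_{QCD}\,\frac{g^2}{32\pi^2}\int\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho}\,d^4x \qquad \ldots (5.2)$$

In (5.1) $S_g$ represents the contributions from the gauge fields. For gauge fields with nontrivial
topology the coefficient of $\theta_{QCD}$ in (5.2) gives precisely the winding number $\nu \neq 0$,

$$\nu = \frac{g^2}{32\pi^2}\int\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho}\,d^4x \qquad \ldots (5.3)$$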

The chiral phase in the mass term in (5.1) can be transformed away by relabelling the quark fields 

q -4 q -> . . . (5.4) 

The relabelling, however, introduces a Jacobian

■^(Oeu;) = exp / trF^^Fxpd^x 


...(5.5) 





where the coefficient of $\alpha_{ew}$ in the exponent is $2\nu$, i.e. twice the winding number of the gauge
field configuration, which is nontrivial precisely in sectors where instantons live. The relabelling,
therefore, merely shifts $\alpha_{ew}$ to $\theta_{QCD}$ giving an effective $\bar\theta$

$$\bar\theta = \theta_{QCD} - 2N_f\,\alpha_{ew} \qquad \ldots (5.6)$$

where $N_f$ is the number of quark flavours. All physical quantities in this scenario, therefore,
depend on $\bar\theta$ and not on $\theta_{QCD}$ or $\alpha_{ew}$ individually. Theoretical estimates for the CP-violating
EDMN are all in the range

0 X ... (5.7) 

Experimental data < 10”^^e.cm, therefore, suggests 6 < 10^^. Such a small value requires near 
cancellation of two parameters $\theta_{QCD}$ and $\alpha_{ew}$ as in (5.6), which arise from completely different
sectors of the Standard Model. This is the strong CP problem, which is essentially a problem of
fine tuning.

Attempts to remedy the strong CP problem by invoking a spontaneously broken global chiral
U(1) symmetry, the Peccei-Quinn symmetry, have been pursued vigorously. The idea essentially
is that the effective $\bar\theta$ becomes a dynamical variable in this scenario, involving the field of the
pseudoscalar Goldstone boson, the axion, associated with the broken Peccei-Quinn symmetry. The dynamical
$\bar\theta$ could then settle down to a minimum consistent with the conservation of P and CP. The axion
has been virtually ruled out by experiments and the strong CP problem in its original formulation
is no closer to a resolution now than it was at the time of its conception.

Question of Global Chiral Anomaly: In view of the prevailing impasse, with the axion window
virtually closed, it is worthwhile to reexamine critically the basic premises that lead up to the strong
CP problem. The question of a nontrivial global chiral anomaly clearly stands out as the most
vulnerable among these basic premises.

The chiral limit of the axial vector Ward identity in an instanton-like background gauge field
was given in (3.23)

$$\left\langle\partial_\mu\left(\bar\psi\gamma_5\gamma_\mu\psi\right)\right\rangle_{m=0} = 2\sum_i\epsilon_i\,\phi_{0i}^\dagger\phi_{0i} - \frac{g^2}{16\pi^2}\,\epsilon_{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho} \qquad \ldots (3.23)$$

where the zero modes $\phi_{0i}(x)$ are a fallout of the nontrivial winding number $\nu$ of the gauge field.
It is natural to identify the right hand side of (3.23) as the density of global chiral anomaly in
an instanton-like background. Its space-time integral, the global chiral anomaly, vanishes by the
Atiyah-Singer index theorem (3.19). This patently contradicts a nontrivial Jacobian as in (5.5),
the cornerstone of the strong CP problem. The popular perception of a nontrivial global chiral
anomaly and hence a nontrivial Jacobian (5.5) not only leads to the strong CP problem but is
afflicted with contradictions in the chiral limit.

The source of these afflictions is easily traced to the popular identification of the partition
function with the determinant of the Dirac operator

$$Z_f[A]_{\nu\neq 0} = \det\left(i\not{D} + m\right) \qquad \ldots (5.8)$$

which is unphysical in the chiral limit because of zero modes. A key to the problem is provided
by the theorem which states that there are no wrong chirality zero modes of the Dirac operator
$\not{D}$, i.e., in the Atiyah-Singer index theorem (3.19) positive (negative) chirality zero modes $n_+$ ($n_-$)
are associated with positive (negative) winding number $\nu$. Thus

$$\dim\ker\left(D_RD_L\right) = 0, \quad \nu > 0; \qquad \dim\ker\left(D_LD_R\right) = 0, \quad \nu < 0 \qquad \ldots (5.9)$$

where $D_L$, $D_R = D_L^\dagger$ are the Weyl components of the Dirac operator $\not{D}$





The theorem (5.9), therefore, assures that the partition functions defined as 

$$ Z_f[A]_{\nu > 0} = \det(D_R D_L + m^2) $$

$$ Z_f[A]_{\nu < 0} = \det(D_L D_R + m^2) $$


in the respective gauge field sectors, are not afflicted, unlike (5.8), with zero modes and hence have 
smooth chiral limits. In the trivial sector $\nu = 0$ the two representations coincide.
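The role of the zero modes can be illustrated with a finite-dimensional toy model. The sketch below is not the gauge-theory calculation: it assumes a random rectangular matrix `D_L` (with `D_R` its adjoint) as a stand-in for the Weyl operators, with the mismatch `nu` of the two dimensions playing the part of the winding number. Under these assumptions the full block determinant carries a factor $m^{\nu}$ that vanishes in the chiral limit, while $\det(D_R D_L + m^2)$ stays finite.

```python
import numpy as np

# Toy, finite-dimensional stand-in for the Weyl operators (an assumption made
# only for illustration): D_L maps an n-dimensional space into an
# (n + nu)-dimensional one, and D_R is its Hermitian adjoint.
rng = np.random.default_rng(0)
n, nu = 4, 2
D_L = rng.normal(size=(n + nu, n)) + 1j * rng.normal(size=(n + nu, n))
D_R = D_L.conj().T

# D_R D_L is n x n and positive definite; D_L D_R is (n+nu) x (n+nu) and
# carries nu exact zero modes.
eig_RL = np.linalg.eigvalsh(D_R @ D_L)
eig_LR = np.linalg.eigvalsh(D_L @ D_R)
zero_modes_LR = int(np.sum(eig_LR < 1e-10))

def det_dirac(m):
    """Determinant of the full block operator [[m, -D_R], [D_L, m]]."""
    top = np.hstack([m * np.eye(n), -D_R])
    bottom = np.hstack([D_L, m * np.eye(n + nu)])
    return np.linalg.det(np.vstack([top, bottom]))

def det_weyl(m):
    """Determinant of D_R D_L + m^2, which has no zero modes."""
    return np.linalg.det(D_R @ D_L + m**2 * np.eye(n))

# The block determinant equals m**nu * det_weyl(m), so it vanishes as m -> 0
# while det_weyl(m) approaches a nonzero constant.
for m in (1.0, 0.1, 0.01):
    print(m, abs(det_dirac(m)), abs(det_weyl(m)))
```

The factorization used here follows from the Schur complement of the block matrix and holds for any rectangular `D_L`; the random entries and the sizes `n`, `nu` are arbitrary choices.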

The representations in (5.11) require that instead of the Dirac basis $\{\phi_n(x)\}$ of Sec. 3 we use
eigenfunction sets of the Weyl operators $D_R D_L$ and $D_L D_R$ appropriate respectively for positive and
negative $\nu$. Thus for $\nu > 0$, one writes



^nL {x) J 


<f>-n{x) 


1 /'-^DL<pnL{x)\ 
-Kiix) ) 


...(5.12) 


where $\phi_{nL}(x)$ are orthonormal eigenfunctions of the positive definite hermitian operator $D_R D_L$

$$ D_R D_L \phi_{nL}(x) = \lambda_n^2 \phi_{nL}(x) \tag{5.13} $$

The set $\{\phi_{nL}(x)\}$ with $\lambda_n^2 > 0$ provides a complete set of functions in the Weyl basis in the $\nu > 0$
sector. In the resulting axial vector Ward identity^^ 

)i^>o = 2Tn{'ipy^'i/j) — ^ 157^2 \p ~ ^ • • • (5.14) 


the contribution from the mass term on the right hand side now vanishes smoothly in the chiral 
limit. The global chiral anomaly given by the space-time integral of the chiral limit of the four 
divergence of the axial vector current, therefore, vanishes and instead of (5.5), we now have 


J{a 

ew = exp 

= 1 


-lOCe 




...(5.15) 


The vanishing of the global chiral anomaly means that the chiral phase $\alpha_{EW}$ in the quark mass in
(5.1) is unphysical and can be transformed away trivially by a global chiral rotation (5.4) without
affecting $\theta_{QCD}$ in any way. The vacuum parameter $\theta_{QCD}$ remains invariant. The crux of the
strong CP problem, the problem of fine tuning, therefore melts away. CP symmetry is ensured
simply through the natural choice $\theta_{QCD} = 0$.


6 Concluding Remarks 

Ever since its conception in the context of the problem of neutral pion decay into two photons, 
chiral anomaly has been a topic of abiding interest and challenge in particle physics. The interest 
stems in a large measure from the need to couple fermions chirally to gauge fields in building models 
in particle physics. The challenge consists in formulating a consistent and systematic regularisation 
scheme in chiral gauge theories. 

The paper highlights and elucidates the seminal role of the mass term in the axial vector Ward
identity in generating the local ABJ anomaly and the global U(1) axial anomaly. Gauge invariance
demands that the fermion decouple from the divergence of the U(1) axial vector current if it
is very heavy. This identifies the ABJ anomaly with the asymptotic limit of the fermion mass term
with sign reversed. On the other hand, the chiral limit ($m = 0$) of the same mass term does not
vanish and consists of contributions from fermion zero modes when the background gauge field has
a nontrivial topology $\nu \neq 0$. The space-time integral of the chiral limit cancels the integral of the
ABJ anomaly, the (sign-reversed) asymptotic limit of the mass term, thanks to the Atiyah-Singer
index theorem. This suggests, contrary to popular perception, that the Jacobian for the global U(1)
chiral transformation is trivial even in an instanton background. The triviality of the Jacobian is
realised in a representation of the fermion partition function in the Weyl basis (5.11) which has
a null kernel space. The point of interest in all this is that there is no strong CP problem in an
axionless physical world.

Current interest in the lattice formulation of chiral gauge theory centres around Dirac operators
for lattice fermions which obey the Ginsparg-Wilson$^{16}$ relation (4.4). Apart from redefining chiral
symmetry on the lattice, the Ginsparg-Wilson relation introduces nonlocality. It is interesting to
note that in the continuum formulation also it is possible to define a gauge invariant partition function
(4.8), but only at the cost of locality. Fermion averaging of the gauge current implemented with this
partition function yields the covariant current. The consistent current, which obeys integrability,
can be generated with the covariant current as input. The covariant derivative of the consistent
current thus obtained yields the minimal anomaly which obeys the Wess-Zumino consistency condition.
Both the anomalies, covariant and consistent, and the distinction between the two currents
vanish if the fermion belongs to an anomaly free representation (4.26).


Acknowledgement 

It is a pleasure to acknowledge indebtedness to my collaborators Rabin Banerjee, Asit De, and 
Partha Mitra. I should also like to thank Amitabha Lahiri for discussions, and Sugata Mukherjee 
and S.K. Singh for help in preparing the manuscript. 

References 

[1] J. Steinberger, Phys. Rev. 76, 1180 (1949); H. Fukuda and Y. Miyamoto, Prog. Theor. Phys. 
4, 347 (1949). 

[2] R. Jackiw, in Current Algebra and Anomalies, Eds. S.B. Treiman et al (World Scientific,
Singapore, 1985), p.81.

[3] D.G. Sutherland, Nucl. Phys. B2, 433 (1967); M. Veltman, Proc. Roy. Soc. A301, 107 (1967). 

[4] S. Adler, Phys. Rev. 177, 2426 (1969); J.S. Bell and R. Jackiw, Nuovo Cimento 60A, 47 
(1969). 

[5] T. Appelquist and J. Carazzone, Phys. Rev. D11, 2856 (1975).

[6] S. Adler, in Lectures on Elementary Particles and Quantum Field Theory, proceedings of 1970
Brandeis University Summer Institute in Theoretical Physics, Vol. 1, Edited by Stanley Deser
et al (M.I.T. Press, Cambridge, Mass, 1970).

[7] L. Rosenberg, Phys. Rev. 129, 2786 (1963). 

[8] S. Adler and W.A. Bardeen, Phys. Rev. 182, 157 (1969). 

[9] K. Fujikawa, Phys. Rev. Lett. 42, 1195 (1979); Phys. Rev. D21, 2848 (1980). 

[10] K. Osterwalder and R. Schrader, Helv. Phys. Acta. 46, 277 (1973). 

[11] M. Mehta, Phys. Rev. Lett. 65, 1983 (1990); P. van Nieuwenhuizen and A. Waldron, Phys.
Lett. B389, 29 (1996).

[12] H. Banerjee, P. Mitra and D. Chatterjee, Z. Phys. C62, 511 (1994); H. Banerjee, Ind. J. of 
Phys. (Spl.) 80, 333 (1997). 

[13] M. Atiyah, R. Bott and V. Patodi, Invent. Math. 19, 279 (1973). 





[14] H. B. Nielsen and M. Ninomiya, Phys. Lett. B105, 219 (1981); Nucl. Phys. B185, 20 (1981).

[15] K.G. Wilson, in New Phenomena in Subnuclear Physics (Erice, 1975), ed. A. Zichichi (Plenum
Press, New York, 1977), p.69.

[16] P.H. Ginsparg and K.G. Wilson, Phys. Rev. D25, 2649 (1982). 

[17] M. Lüscher, Phys. Lett. B428, 342 (1998).

[18] I. Horváth, Phys. Rev. Lett. 81, 4063 (1998).

[19] H. Banerjee and R. Banerjee, Phys. Lett. B174, 313 (1986); H. Banerjee, R. Banerjee and P. 
Mitra, Z. Phys. C32, 445 (1986). 

[20] W.A. Bardeen and B. Zumino, Nucl. Phys. B244, 421 (1984); H. Leutwyler, Phys. Lett. 
B152, 78 (1985). 

[21] J. Wess and B. Zumino, Phys. Lett. B37, 95 (1971). 

[22] R.D. Peccei, in CP Violation, ed. C. Jarlskog (World Scientific, Singapore, 1989), p.503. 

[23] S. Coleman, in Aspects of Symmetry (Cambridge University Press, Cambridge, 1985) p.265. 



11. Coherent States in Field Theory 

Wei-Min Zhang 

Department of Physics, National Cheng-Kung University,

Tainan, Taiwan 701, R.O.C.


Abstract 

Coherent states have three main properties: coherence, overcompleteness and intrinsic
geometrization. These unique properties play fundamental roles in field theory, especially
in the description of classical domains and quantum fluctuations of physical fields, in the
calculation of physical processes involving an infinite number of virtual particles, in the derivation
of functional integrals and various effective field theories, in the determination of long-range
orders and collective excitations, and finally in the exploration of the origins of topologically
nontrivial gauge fields and associated gauge degrees of freedom.


1 Introduction 

In the past thirty-six years, the development and application of coherent states have made
tremendous progress. Yet the idea of creating a coherent state for a quantum system was conceived
well before that. In fact, back in 1926, Schrödinger first proposed the idea of what are now called
“coherent states” [1] in connection with the quantum states of classical motion for a harmonic
oscillator. In other words, coherent states were invented immediately after the birth of quantum
mechanics. However, between 1926 and 1962, activity in this field remained almost dormant,
except for a few works in condensed matter physics [2, 3, 4] and particle physics [5, 6] in the 1950s. It
was not until some thirty-five years after Schrödinger’s pioneering paper that the first modern and
systematic application to field theory was made by Glauber and Sudarshan [7, 8], launching
this fruitful and important field of study in theoretical as well as experimental physics.

I became interested in the subject of coherent states about fifteen years ago. On the occasion
of Prof. Sudarshan’s visit to Suzhou, China (1984), I heard for the first time a lecture on
coherent states, presented by Prof. Sudarshan. As a second-year graduate student at that time, I
was looking for a research problem on collective excitations in strongly interacting many-body
systems (particularly in nuclear physics). Prof. Sudarshan’s lecture inspired me to ask whether,
under the constraint(s) of dynamical symmetries, collective excitations can be described in terms of
coherent states, as a result of multi-particle correlations (coherence). Later on I realized that this is
indeed a very active subject covering problems from condensed matter physics to nuclear and particle
physics. Of course, these coherent states no longer have the simple but beautiful form Glauber and
Sudarshan proposed for light beams. Rather, these states are generated by complicated collective
composite operators of particle-particle pairs or particle-hole pairs. Their mathematical structure
was already developed in the early 1970s by Perelomov and Gilmore [9, 10] based on the theory of Lie
groups. Nowadays, the concept of coherent states has been extensively investigated, and many methods
based on coherent states have been developed for various theoretical problems. Nevertheless,
the original development of coherent states for the quantum electromagnetic field (or more precisely, in
the study of quantum optical coherence) has had a tremendous influence on physics.

A large body of literature on coherent states has appeared. This vast literature
was exhaustively collected, catalogued and classified by Klauder and Skagerstam [11]. The
mathematical usefulness of coherent states as a new tool to study the unitary representations of
Lie groups has been described in a fine expository book by Perelomov [12]. A review article on the
theory of coherent states and its applications, covering quantum mechanics, statistical
mechanics, nonlinear dynamics and many-body physics, has also been presented by the author and his
collaborators [13]. In this article, I will concentrate only on the topic of coherent states in field
theory. It is not my intention to give a complete review of coherent states in field
theory. An extensive review of coherent states in field theory and particle physics may be found
in [14]. I would rather present here a discussion of whether one can formulate field theory in
terms of coherent states such that the new formulation may bring some new insights to the next
development of field theory in the new millennium. Coherent states can become a useful and important
subject in physics because of their three unique properties: coherence, overcompleteness
and intrinsic geometrization. These unique properties are, in certain contexts, fundamental
to field theory. I will select some typical topics in field theory that can be efficiently described
by coherent states based on these properties. These topics include the production of coherent
states in field theory, the basic formulation of quantum field theory in terms of coherent state
functional integrals, spontaneous symmetry breaking described by coherent states, and the
effective field theories derived from coherent states. Also, I will “sprinkle” in discussions of the
geometrical phases of coherent states and their interpretation as gauge degrees of freedom in field
theory, a subject which has received increasing attention in attempts to understand the
fundamentals of nature.

E-mail: wzhang@mail.ncku.edu.tw


2 Photon coherent states 

I begin with the simplest coherent state of photons or, more generally speaking, bosons. Such
a set of coherent states is described in most quantum mechanics textbooks and is
familiar to most physicists. It is indeed the most popular coherent state and has been used
widely in various fields. The coherent state of photons can describe not only the coherence of the
electromagnetic field, but also many other properties of bosonic fields. It is the basis of modern
quantum optics [15], and it also provides a fundamental framework for quantum field theory, as one
will see later.

To characterize optical coherence, one may consider the n-th order correlation function of the
electromagnetic field:

$$ G^{(n)}(x_1, \cdots, x_n, x_{n+1}, \cdots, x_{2n}) = \mathrm{tr}\{\rho E^-(x_1) \cdots E^-(x_n) E^+(x_{n+1}) \cdots E^+(x_{2n})\}, \tag{1} $$

where $x_i$ denotes the space-time coordinates, $\rho$ denotes the density operator, and $E^{\pm}(x_i)$ represent the
electric field operators with positive and negative frequency. For simplicity, the polarization of the
electric field is fixed. According to Glauber [7], complete coherence of a radiation field means that
all of the correlation functions satisfy the following factorization condition:

$$ G^{(n)}(x_1, \cdots, x_n, x_{n+1}, \cdots, x_{2n}) = \mathcal{E}^*(x_1) \cdots \mathcal{E}^*(x_n) \mathcal{E}(x_{n+1}) \cdots \mathcal{E}(x_{2n}). \tag{2} $$

This condition implies that the electric field operators must behave like classical field variables. It also
indicates that the electric field operator should have its own eigenstates, with the corresponding classical
field variables as eigenvalues:

$$ E^+(x_i)|\phi\rangle = \mathcal{E}(x_i)|\phi\rangle, \quad \langle\phi|E^-(x_i) = \mathcal{E}^*(x_i)\langle\phi|. \tag{3} $$

Moreover, the density operator must also be expressed in terms of the eigenstates $|\phi\rangle$. Obviously,
the conventional Fock states of quantum theory do not satisfy the above condition.

This is actually a nontrivial problem, because it requires a complete description of classical
motion in terms of quantum states. Moreover, the operator $E^{\pm}(x_i)$ itself is not a Hermitian
operator, and the eigenstate problem of a non-Hermitian operator is unusual in quantum mechanics.
Fortunately, such quantum states had already been constructed by Schrödinger soon after his
invention of quantum mechanics in 1926. In order to answer the question of how microscopic dynamics
passes over to the macroscopic world, Schrödinger looked for quantum states which follow precisely the
corresponding classical trajectories at all times and which, at the same time, must also be the exact
solution of the quantum dynamical equation (i.e., the Schrödinger equation). But only for the harmonic
oscillator were such states constructed [1]:

$$ \phi_z(x) \sim \exp\Big\{ -\frac{1}{2}(x + z)^2 \Big\}, \tag{4} $$


where $z$ is a complex variable. These states are actually Gaussian wave packets centered on the
classical trajectory $z = (x + ip)/\sqrt{2}$, where $x$ and $p$ are the position and momentum of the harmonic
oscillator in phase space, which satisfy the classical equations of motion. One can show that Eq. (4) is also an exact solution of the
Schrödinger equation. The classicality of the Gaussian wave packets is manifested by the minimum
uncertainty relationship:




$$ (\Delta p)^2 (\Delta x)^2 = \frac{1}{4} \quad \text{and} \quad \Delta p = \Delta x. \tag{5} $$


In other words, wave packets governed by the Hamiltonian of the harmonic oscillator follow classical
trajectories and do not spread in time.

Glauber and Sudarshan discovered [7, 8] that such a wave packet is a superposition of Fock
states. It is also an eigenstate of $E^+(x_i)$. In quantum field theory, the electromagnetic field
consists of an infinite number of harmonic oscillator modes (photons). Explicitly, the Hamiltonian of the quantum
electromagnetic field (in Coulomb gauge) is given by


$$ H = \frac{1}{2}\int d^3x\,(\mathbf{E}^2 + \mathbf{B}^2), \tag{6} $$

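where $\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic fields. The electromagnetic field can be expressed in terms of
the vector potential $\mathbf{A}$: $\mathbf{E} = -\partial\mathbf{A}/\partial t$, $\mathbf{B} = \nabla \times \mathbf{A}$. It is convenient to expand the vector potential
in terms of plane waves (Fourier series)

$$ \mathbf{A}(\mathbf{x}, t) = \sum_{k\lambda} \frac{1}{\sqrt{2\omega_k V}}\,\vec{\epsilon}_\lambda(\mathbf{k}) \left[ a_{k\lambda} e^{-ikx} + a_{k\lambda}^\dagger e^{ikx} \right], \tag{7} $$

where $\vec{\epsilon}_\lambda(\mathbf{k})$ is the polarization vector of the electromagnetic field, $V$ is the quantization volume, and
$a_{k\lambda}^\dagger$ and $a_{k\lambda}$ are the creation and annihilation operators,

$$ [a_{k\lambda}, a_{k'\lambda'}^\dagger] = \delta_{\lambda\lambda'}\delta_{kk'}, \quad [a_{k\lambda}, a_{k'\lambda'}] = 0. \tag{8} $$

Then the Hamiltonian of the electromagnetic field reduces to

$$ H = \sum_{k\lambda} \omega_k \left( a_{k\lambda}^\dagger a_{k\lambda} + 1/2 \right), \tag{9} $$

which means that the electromagnetic field consists of an infinite number of individual electromagnetic modes,
i.e., photons. Each photon mode corresponds to a harmonic oscillator.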

In the particle number representation, the Gaussian wave packet can be written as

$$ |z\rangle = \exp\left(-\tfrac{1}{2}|z|^2\right) \exp(z a^\dagger)|0\rangle, \tag{10} $$

where $|0\rangle$ is the vacuum state: $a|0\rangle = 0$. From the above expression, it is easy to show that the
wave packet is also an eigenstate of the annihilation operator $a$:

$$ a|z\rangle = z|z\rangle. \tag{11} $$
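Equations (10) and (11), together with the minimum-uncertainty property, can be checked numerically in a truncated Fock space. The following is a minimal sketch, assuming a cutoff `N`, an arbitrary label `z`, and units with $\hbar = 1$; the quadratures `x` and `p` are defined so that $z = (x + ip)/\sqrt{2}$.

```python
import math
import numpy as np

N = 60                      # Fock-space truncation (an illustrative choice)
z = 0.8 + 0.6j              # arbitrary complex label, |z| = 1

# Annihilation operator in the truncated number basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)

# Components <n|z> = exp(-|z|^2/2) z^n / sqrt(n!), obtained by expanding Eq. (10).
n = np.arange(N)
fact = np.array([math.factorial(k) for k in range(N)], dtype=float)
ket = np.exp(-abs(z)**2 / 2) * z**n / np.sqrt(fact)

norm = np.vdot(ket, ket).real
residual = np.linalg.norm(a @ ket - z * ket)      # tests a|z> = z|z>

# Quadratures x = (a + a^dagger)/sqrt(2), p = (a - a^dagger)/(i sqrt(2)).
x = (a + a.conj().T) / np.sqrt(2)
p = (a - a.conj().T) / (1j * np.sqrt(2))

def variance(op):
    """Variance of a Hermitian operator in the state `ket`."""
    mean = np.vdot(ket, op @ ket).real
    mean_sq = np.vdot(ket, op @ (op @ ket)).real
    return mean_sq - mean**2

dx2, dp2 = variance(x), variance(p)
print(norm, residual, dx2, dp2)
```

Both variances come out equal to 1/2 up to rounding, independent of the label `z`, which is the statement that the packet keeps its minimum-uncertainty shape.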

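Thus, the quantum state describing the optical coherence of the electromagnetic field can be expressed as

$$ |\{z_{k\lambda}\}\rangle = \exp\Big\{ -\frac{1}{2}\int d^3k \sum_\lambda |z_{k\lambda}|^2 \Big\} \exp\Big\{ \int d^3k \sum_\lambda z_{k\lambda} a_{k\lambda}^\dagger \Big\}|0\rangle, \tag{12} $$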





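which is an eigenstate of the positive frequency part of the electric field operator,

$$ \mathbf{E}^+(x)|\{z_{k\lambda}\}\rangle = \vec{\mathcal{E}}(x)|\{z_{k\lambda}\}\rangle, \quad \vec{\mathcal{E}}(x) = i\int d^3k \sum_\lambda \sqrt{\frac{\omega_k}{2(2\pi)^3}}\, z_{k\lambda}\,\vec{\epsilon}_\lambda(\mathbf{k})\, e^{-ikx}. \tag{13} $$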

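Besides, the above state has another very important property: it supports the following resolution
of identity:

$$ \int \prod_{k\lambda} \frac{dz_{k\lambda}\,dz_{k\lambda}^*}{2\pi i}\, |\{z_{k\lambda}\}\rangle\langle\{z_{k\lambda}\}| = I. \tag{14} $$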

In other words, these states in the complex space (in terms of the variable $z$) form a complete set of
states (more precisely, an overcomplete set, because the states are labeled by a continuous variable). This
complete set is certainly very different from the set of Fock states. Because of the overcompleteness
and the analyticity of these states, one can expand the density operator in terms of the states (12) in a diagonal form
(the so-called P-representation [7]):

$$ \rho = \int P(\{z_{k\lambda}\})\, |\{z_{k\lambda}\}\rangle\langle\{z_{k\lambda}\}| \prod_{k\lambda} dz_{k\lambda}\,dz_{k\lambda}^*, \qquad \mathrm{tr}\rho = \int P(\{z_{k\lambda}\}) \prod_{k\lambda} dz_{k\lambda}\,dz_{k\lambda}^* = 1, \tag{15} $$

where $P(z)$ is a weight function. In terms of these states (12), the factorization criterion of coherent
light beams is automatically satisfied. Glauber named such states coherent states. To be more
specific, one may call them “photon coherent states”. Physically, the photon coherent states
have a well-defined phase for each mode. Therefore, coherent light beams can be completely
described in quantum mechanics in terms of photon coherent states. Those who wish to have a
more detailed discussion of the physical consequences of the photon coherent states in quantum optics
may refer to the excellent book by Klauder and Sudarshan [15].
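The resolution of identity for a single mode can be verified numerically on the lowest Fock states. The sketch below assumes the real form of the measure, $d^2z/\pi$ with $z = \sqrt{u}\,e^{i\theta}$, which may differ from the convention of the text by a choice of orientation; the quadrature sizes are arbitrary illustrative values.

```python
import math
import numpy as np
from numpy.polynomial.laguerre import laggauss

nmax = 6                          # check the block spanned by |0>, ..., |5>
u, w = laggauss(20)               # nodes/weights for  int_0^inf e^{-u} f(u) du
M = 32                            # number of angles on a uniform grid
theta = 2 * np.pi * np.arange(M) / M

identity = np.zeros((nmax, nmax), dtype=complex)
for n in range(nmax):
    for m in range(nmax):
        # <n|z><z|m> = e^{-|z|^2} z^n (z*)^m / sqrt(n! m!), with |z|^2 = u.
        # The factor e^{-u} is absorbed in the Gauss-Laguerre weights.
        radial = np.sum(w * u**((n + m) / 2))
        # Average over the phase of z: (1/2pi) int dtheta e^{i(n-m)theta}.
        angular = np.mean(np.exp(1j * (n - m) * theta))
        identity[n, m] = radial * angular / math.sqrt(math.factorial(n) * math.factorial(m))

print(np.round(identity.real, 10))
```

The phase average removes every off-diagonal element and the radial integral gives $n!$ on the diagonal, so the continuum of states $|z\rangle$ reproduces the unit operator even though distinct coherent states are not orthogonal.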


3 Coherent states and S-Matrix

As we have seen, the photon coherent state was introduced by the requirement of optical coherence.
Here, I may ask a more general question, namely, how are photon coherent states generated in field
theory? In field theory, all physical quantities are derivable from the vacuum-to-vacuum transition
amplitude in the presence of external sources. One can show that the final state in such processes is
a coherent state if there are no interactions other than a linear interaction with the external
field.

To be specific, one may consider the electromagnetic field interacting with a classical source:

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - j_\mu A^\mu, \tag{16} $$

where the classical source is a conserved current: $\partial_\mu j^\mu = 0$. In the Feynman gauge, the equation
of motion for the electromagnetic field is given by

$$ \partial_\mu F^{\mu\nu} = \partial^2 A^\nu = j^\nu. \tag{17} $$


A general solution of the above equation is

$$ A^\mu(x) = A_0^\mu(x) + \int d^4y\, \Delta(x - y)\, j^\mu(y), \tag{18} $$

where $A_0^\mu(x)$ is the free-field solution, and $\Delta(x - y)$ is the Green function determined by
$\partial_x^2 \Delta(x - y) = \delta^4(x - y)$. If one assumes that the interaction is switched on adiabatically in a
finite time interval, then

$$ A^\mu(x) = A_{in}^\mu(x) + \int d^4y\, \Delta_{ret}(x - y)\, j^\mu(y) = A_{out}^\mu(x) + \int d^4y\, \Delta_{adv}(x - y)\, j^\mu(y), \tag{19} $$

where the retarded and advanced Green functions are given by

$$ \Delta_{ret, adv}(x) = -\frac{1}{(2\pi)^4}\int d^4p\, \frac{e^{-ipx}}{(p_0 \pm i\epsilon)^2 - \mathbf{p}^2}, \tag{20} $$

and $A_{in}^\mu$ and $A_{out}^\mu$ are free fields describing the photon field before and after its interaction with the
classical source $j^\mu$. The corresponding photon states are the in- and out-states (denoted by $|\,\rangle_{in}$ and
$|\,\rangle_{out}$, respectively). The in- and out-states form two complete sets of free states constructed
as Fock spaces by the free field operators $A_{in}^\mu$ and $A_{out}^\mu$. Therefore, there must exist a unitary
transformation $S$ (namely the S-matrix) connecting these two complete sets:

$$ A_{out}^\mu = S^\dagger A_{in}^\mu S, \quad |\,\rangle_{out} = S^\dagger |\,\rangle_{in}. \tag{21} $$

From (19), one can see that

$$ A_{out}^\mu(x) = A_{in}^\mu(x) + \int d^4y\, [\Delta_{ret}(x - y) - \Delta_{adv}(x - y)]\, j^\mu(y) = A_{in}^\mu(x) + A_{cl}^\mu(x), \tag{22} $$

and $A_{cl}^\mu(x)$ is a c-number (classical) field generated by the classical current $j^\mu$. Notice that
$\Delta_{adv}(x) - \Delta_{ret}(x) = \Delta(x)$, which is related to the commutator of free fields: $[A_{in}^\mu(x), A_{in}^\nu(y)] =
[A_{out}^\mu(x), A_{out}^\nu(y)] = -ig^{\mu\nu}\Delta(x - y)$. One may check that the S-matrix can be written as

$$ S = \exp\Big\{ -i\int d^4x\, A_{in} \cdot j(x) \Big\} = \exp\Big\{ -i\int d^4x\, A_{out} \cdot j(x) \Big\}. \tag{23} $$

If we start at time $t = -\infty$ from the vacuum state $|0\rangle_{in}$, the final state after the free field $A^\mu(x)$
has interacted with the classical current $j^\mu(x)$ becomes a coherent state:

$$ |0\rangle_{out} = \exp\Big\{ i\int d^4x\, A_{in}(x) \cdot j(x) \Big\}|0\rangle_{in}. \tag{24} $$

In terms of the Fourier expansion. 


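The final state can be expressed as

$$ |0\rangle_{out} = \exp\Big\{ -\frac{1}{2}\int d^3k \sum_\lambda |z_{k\lambda}|^2 \Big\} \exp\Big\{ \int d^3k \sum_\lambda z_{k\lambda} a_{k\lambda}^\dagger \Big\}|0\rangle_{in} = |\{z_{k\lambda}\}\rangle, \tag{26} $$

where $z_{k\lambda} = \epsilon_\lambda(k) \cdot j(k)$, and $j^\mu(k)$ is the Fourier transform of the classical current $j^\mu(x)$. This is
the same photon coherent state introduced by Glauber in the study of quantum optical coherence.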

Indeed, one can similarly derive the photon coherent state for laser beams (discussed in
the previous section) from a more microscopic picture. The Hamiltonian in quantum optics that
describes the interaction between N atoms and the electromagnetic field can be written as


H = Y^Uka\ak 

k i 




J 


(27) 





where $\gamma_{ki}$ are the coupling coefficients between the atoms and the electromagnetic field. One of the crucial
assumptions made in the construction of the above Hamiltonian for laser beams is that each of the
N atoms, labeled by the index $i$, is a two-level system, and therefore its dynamical variables are the
usual Pauli operators $\sigma^i$. Furthermore, the atomic variables can be treated as a classical
source (i.e. the spin operators can be regarded as c-numbers), and the coupling strengths are
identical for all the atoms (i.e. $\gamma_{ki} = \gamma_k$). Then, Eq. (27) is reduced to





(28) 


This corresponds to the electromagnetic field interacting with an external time-dependent source.
The general solution of the Schrödinger equation for this Hamiltonian is a photon coherent
state. This provides a microscopic picture of how the photon coherent state is generated and why
it is fundamental to quantum optics.

Soon after the development of coherent states in optical coherence, it was found that the
photon coherent state also plays an important role in solving the infrared divergence problem in quantum
electrodynamics for electron scattering [16, 17, 18, 19, 20] (see also the review by Papanicolaou
[21]). As is well known, the matrix element in quantum electrodynamics for the scattering of
an initial state containing a finite number of electrons and photons into a similar final state has
a logarithmic infrared divergence at small momentum $k$ [22]. This is because in an actual
scattering experiment, electromagnetic fields interact with the source particles so that an infinite
number of soft photons are emitted. These emitted soft photons form a coherent state in the final
state, as we have discussed. To be more specific, consider the scattering of a single electron. The source
particle can be represented by a classical current. The Fourier transform of the classical current is
given by


$$ j^\mu(k) = \frac{ie}{\sqrt{2(2\pi)^3|\mathbf{k}|}} \left( \frac{p_f^\mu}{p_f \cdot k} - \frac{p_i^\mu}{p_i \cdot k} \right), \tag{29} $$


where $p_{i,f}$ are the electron’s momenta in the initial and final states. If one sums the cross
sections over all possible final states containing any number of soft photons with momenta below
the threshold of observability [by using the photon coherent state (26) with the above classical
source], the infrared divergence is canceled. This gives a beautiful solution to the infrared problem
in quantum electrodynamics.
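The origin of the logarithm can be made explicit with a short power-counting estimate. This is only a sketch: the positive constant $C$, which collects the charge and the angular integral, and the cutoffs $k_{\min}$, $k_{\max}$ are introduced here for illustration and do not appear in the text.

```latex
% Mean number of photons in the coherent state (26), using the eigenvalue
% property a_{k\lambda}|\{z\}\rangle = z_{k\lambda}|\{z\}\rangle:
\bar{n} = \int d^3k \sum_\lambda |z_{k\lambda}|^2 ,
\qquad z_{k\lambda} = \epsilon_\lambda(k)\cdot j(k) .
% The current (29) scales as |k|^{-1/2} \times |k|^{-1}, so
% |z_{k\lambda}|^2 \propto |k|^{-3}, while d^3k = |k|^2\, d|k|\, d\Omega :
\bar{n} = C \int_{k_{\min}}^{k_{\max}} \frac{d|\mathbf{k}|}{|\mathbf{k}|}
        = C \ln\frac{k_{\max}}{k_{\min}}
\;\longrightarrow\; \infty \quad (k_{\min} \to 0) .
% Overlap with the photon vacuum, from the normalization factor in (26):
|\langle 0|\{z_{k\lambda}\}\rangle|^2 = e^{-\bar{n}} \;\longrightarrow\; 0 .
```

In this estimate the probability of emitting no soft photon at all vanishes, which is why amplitudes between states with a finite number of photons are ill defined while sums over soft-photon final states remain finite.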

Moreover, one can also show that if the matrix elements for scattering are calculated with
initial and final states containing an infinite number of soft photons, described by the photon coherent state, the
infrared divergences are canceled order by order at the matrix element level (not only in cross sections)
[16]. The photon coherent state may also be used to remove the similar infrared problem in quantum
gravity, as noticed by Weinberg [23]. This is perhaps the second important application of the
photon coherent state in field theory. In addition, attempts have been made to use coherent states
to treat infrared divergences in non-abelian gauge theory [24]. However, in non-abelian gauge
theory, the infrared divergence is much more complicated [25, 26]. It contains two types of infrared
divergence: the soft (massless) infrared divergence and the collinear divergence. It is not clear whether
one can construct non-abelian coherent states to handle both the soft and collinear infrared
divergences.


4 Functional integrals in field theory 

When quantum fields interact with quantum fields rather than classical external fields, the
S-matrix (or the time-evolution operator) does not generate coherent states from the incoming
vacuum. In such cases, coherent states are useful in the derivation of functional integrals in field
theory. Quantum field theory can be reformulated in terms of coherent states not only because
of their classicality and their being eigenstates of the annihilation operator, but also because, as we have
mentioned, the coherent states are overcomplete:

$$ \int \frac{dz\,dz^*}{2\pi i}\, |z\rangle\langle z| = I. \tag{30} $$

All three of these properties (classicality, being eigenstates of the positive frequency part of the field
operator, and overcompleteness) together allow one to reformulate quantum field theory in
terms of a functional integral. The content of this section can be found in many
textbooks, but for completeness I will repeat the discussion here.

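The ordinary path integral of quantum mechanics developed by Feynman [27] can be obtained
from the evolution operator by writing the evolution operator as

$$ U(t_f, t_0) = \exp\{ -iH(t_f - t_0) \} = \lim_{N \to \infty} \Big[ \exp\Big\{ -\frac{i}{N}H(t_f - t_0) \Big\} \Big]^N \tag{31} $$

and then inserting a resolution of the identity in terms of the position states

$$ \int dx\, |x\rangle\langle x| = I \tag{32} $$

between the factors of the above product. This results in the familiar path integral of quantum mechanics,

$$ \langle x'(t_f)|x(t_0)\rangle = \langle x'|U(t_f, t_0)|x\rangle = \int [dx(t)] \exp\Big\{ i\int_{t_0}^{t_f} dt\, \mathcal{L}(x(t), \dot{x}(t)) \Big\}, \tag{33} $$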

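where $\mathcal{L}$ is a classical Lagrangian which generally has the form

$$ \mathcal{L} = \frac{1}{2}\Big( \frac{dx}{dt} \Big)^2 - V(x), \tag{34} $$

and $[dx(t)] = \prod_{t_0 < t < t_f} dx(t)$ is the functional measure of the path integration [28].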
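The time-slicing construction can be tried out numerically for the harmonic oscillator. The sketch below makes two assumptions that are not in the text: it works in imaginary time (so that the short-time kernel is a real Gaussian rather than an oscillating phase), and it replaces the integrals over intermediate positions by sums on a finite grid. The names `eps`, `K` and `E0` are illustrative.

```python
import numpy as np

eps = 0.05                               # imaginary-time step (illustrative)
x = np.linspace(-6.0, 6.0, 241)          # position grid; spacing dx = 0.05
dx = x[1] - x[0]
V = 0.5 * x**2                           # harmonic potential with m = omega = hbar = 1

# Short-time kernel <x_i| exp(-eps H) |x_j> in the symmetric split form,
# multiplied by dx so that matrix products approximate the integrals over
# the intermediate positions inserted between neighbouring time slices.
diff = x[:, None] - x[None, :]
K = (dx / np.sqrt(2 * np.pi * eps)) * np.exp(
    -diff**2 / (2 * eps) - 0.5 * eps * (V[:, None] + V[None, :])
)

# Chaining many slices projects onto the ground state, so the largest
# eigenvalue of a single slice behaves like exp(-eps * E0).
lam_max = np.linalg.eigvalsh(K)[-1]
E0 = -np.log(lam_max) / eps
print(E0)
```

The estimate approaches the exact ground-state energy 1/2 as `eps` decreases, which is the imaginary-time counterpart of taking the limit $N \to \infty$ in the sliced product.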

Instead of using the basis of position eigenstates (32), we may use the coherent state basis
and insert the resolution of identity (30) between the factors of the product (31). Then a phase space
formulation of path integrals can be obtained, as was first proposed by Klauder [6, 30] (a more
detailed derivation will be given later in the application to field theory),

$$ \langle z'(t_f)|z(t_0)\rangle = \langle z'|U(t_f, t_0)|z\rangle = \int [dx(t)] \Big[ \frac{dp(t)}{2\pi} \Big] \exp\Big\{ i\int_{t_0}^{t_f} dt\, \mathcal{L}(x(t), p(t)) \Big\}, \tag{35} $$

with

$$ \mathcal{L}(x, p) = \langle z|i\frac{d}{dt}|z\rangle - \langle z|H|z\rangle = \frac{1}{2}(p\dot{x} - x\dot{p}) - H(x, p), \tag{36} $$

where $z = (x + ip)/\sqrt{2}$ and $z^* = (x - ip)/\sqrt{2}$, with the initial and final positions $x(t_0)$ and $x(t_f)$
fixed. This derivation of Feynman’s path integral is particularly useful in obtaining a functional
integral of quantum field theory.

To derive a functional integral of quantum field theory, we may start with a neutral scalar field
$\phi(x)$, for simplicity. The Lagrangian density of a neutral field is given by

$$ \mathcal{L} = \frac{1}{2}[(\partial\phi)^2 - m^2\phi^2] - V(\phi), \tag{37} $$





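where $V(\phi)$ represents a self-interaction potential, such as a $\phi^4$ term. The canonical momentum density
conjugate to $\phi(\mathbf{x}, t)$ is determined by $\pi(\mathbf{x}, t) = \partial\mathcal{L}/\partial\dot{\phi}(\mathbf{x}, t)$. Then the canonical quantization
leads to

$$ [\phi_{op}(\mathbf{x}, t), \pi_{op}(\mathbf{x}', t)] = i\delta^3(\mathbf{x} - \mathbf{x}'). \tag{38} $$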

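In the plane-wave expansion, one has

$$ \phi_{op}(\mathbf{x}, t) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega_k}} \left[ a_k e^{-ikx} + a_k^\dagger e^{ikx} \right], $$

$$ \pi_{op}(\mathbf{x}, t) = -i\int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{\omega_k}{2}} \left[ a_k e^{-ikx} - a_k^\dagger e^{ikx} \right], \tag{39} $$

and the quantum Hamiltonian can be written as

$$ H(t) = \int d^3x : \Big\{ \frac{1}{2}[\pi_{op}^2 + (\nabla\phi_{op})^2 + m^2\phi_{op}^2] + V(\phi_{op}) \Big\} :, \tag{40} $$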

where $: \; :$ denotes normal ordering with respect to the creation and annihilation operators $a_k^\dagger$
and $a_k$. Now, one can define the scalar field coherent state as

$$ |\phi\pi\rangle = \exp\Big\{ i\int d^3x\, [\pi(x)\phi_{op}(x) - \phi(x)\pi_{op}(x)] \Big\}|0\rangle = \exp\Big\{ -\frac{1}{2}\int d^3k\, |z_k|^2 \Big\} \exp\Big\{ \int d^3k\, z_k a_k^\dagger \Big\}|0\rangle, \tag{41} $$

from which a functional integral of field theory can be derived explicitly. Note that the coherent
state in field theory is defined at a given instant of time $t$ over the whole space $\{\mathbf{x}\}$.

Since the Hamiltonian formalism of field theory is the same as in quantum mechanics, one can
directly calculate the time Green’s function defined as the matrix element of the evolution operator
in the coherent state basis,

$$ G(t_f, t_0) = \langle\phi'\pi'|U(t_f, t_0)|\phi\pi\rangle = \langle\phi'\pi'|T\exp\Big\{ -i\int_{t_0}^{t_f} dt\, H(t) \Big\}|\phi\pi\rangle, \tag{42} $$

where $T$ is the time-ordering operator. One may slice the time interval $t_f - t_0$ into $N$ equal
segments, $\epsilon = (t_f - t_0)/N$, so that in the limit $N \to \infty$ the evolution operator can be written
as a successive product of evolution operators over intervals of length $\epsilon$:

$$ U(t_f, t_0) = \exp\{ -i\epsilon H(t_N) \} \exp\{ -i\epsilon H(t_{N-1}) \} \cdots \exp\{ -i\epsilon H(t_2) \} \exp\{ -i\epsilon H(t_1) \}. \tag{43} $$

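Using the same procedure as in the derivation of Feynman’s path integral in quantum mechanics,
one should insert the resolution of identity,

$$ \int [d\phi(x)] \Big[ \frac{d\pi(x)}{2\pi} \Big] |\phi\pi\rangle\langle\phi\pi| = I, \tag{44} $$

at each interval point, where $[d\phi(x)] = \prod_{-\infty < x < \infty} d\phi(x)$ is defined over the whole space.
Then

$$ G(t_f, t_0) = \lim_{N \to \infty} \int \prod_i [d\phi_i(x)] \Big[ \frac{d\pi_i(x)}{2\pi} \Big] \prod_i \langle\phi_i\pi_i|\exp\{ -i\epsilon H(t_i) \}|\phi_{i-1}\pi_{i-1}\rangle. \tag{45} $$

Up to first order in $\epsilon$,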


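$$ \langle\phi_i\pi_i|\exp\{ -i\epsilon H(t_i) \}|\phi_{i-1}\pi_{i-1}\rangle \approx \langle\phi_i\pi_i|\phi_{i-1}\pi_{i-1}\rangle \exp\Big\{ -i\epsilon\, \frac{\langle\phi_i\pi_i|H(t_i)|\phi_{i-1}\pi_{i-1}\rangle}{\langle\phi_i\pi_i|\phi_{i-1}\pi_{i-1}\rangle} \Big\}. \tag{46} $$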





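Note that the coherent state $|\phi\pi\rangle$ is normalized. In the limit $\epsilon \to 0$ (i.e. $N \to \infty$),

$$ \langle\phi_i\pi_i|\phi_{i-1}\pi_{i-1}\rangle = 1 - \langle\phi_i\pi_i|\big( |\phi_i\pi_i\rangle - |\phi_{i-1}\pi_{i-1}\rangle \big) \simeq \exp\{ -\langle\phi_i\pi_i|\Delta|\phi_i\pi_i\rangle \}, \tag{47} $$

where $\Delta|\phi_i\pi_i\rangle = |\phi_i\pi_i\rangle - |\phi_{i-1}\pi_{i-1}\rangle$. Then, the Green’s function becomes

$$ G(t_f, t_0) = \lim_{N \to \infty} \int \prod_i [d\phi_i(x)] \Big[ \frac{d\pi_i(x)}{2\pi} \Big] \exp\Big\{ -\sum_i \big[ \langle\phi_i\pi_i|\Delta|\phi_i\pi_i\rangle + i\epsilon\langle\phi_i\pi_i|H(t_i)|\phi_i\pi_i\rangle \big] \Big\} $$

$$ = \int [d\phi(x)] \Big[ \frac{d\pi(x)}{2\pi} \Big] \exp\Big\{ i\int_{t_0}^{t_f} dt \Big[ \langle\phi\pi|i\frac{d}{dt}|\phi\pi\rangle - \langle\phi\pi|H|\phi\pi\rangle \Big] \Big\} $$

$$ = \int [d\phi(x)] \Big[ \frac{d\pi(x)}{2\pi} \Big] \exp\Big\{ i\int_{t_0}^{t_f} dt \int d^3x \Big[ \frac{1}{2}(\pi\dot{\phi} - \phi\dot{\pi}) - \mathcal{H}(x) \Big] \Big\}, \tag{48} $$

with

$$ \mathcal{H}(x) = \frac{1}{2}[\pi^2 + (\nabla\phi)^2 + m^2\phi^2] + V(\phi). \tag{49} $$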


As we have seen, the coherent state gives a natural derivation of path integrals in field theory.

In field theory, the correlations between $n$ fields are defined by the n-point Green functions,

$$ G^{(n)}(x_1, \cdots, x_n) = \langle 0|T(\phi(x_1) \cdots \phi(x_n))|0\rangle, \tag{50} $$


which can be determined from the generating functional $W(J)$, defined as the vacuum-to-vacuum
amplitude in the presence of an external current $J(x)$:

$$ W(J) = \langle 0|U(-\infty, \infty)|0\rangle_J. \tag{51} $$

This generating functional can then be expressed in terms of the time Green’s function $G(t_f, t_0)$
by adding a term $\int d^4x\, J(x)\phi(x)$ in the exponent and then taking $t_0 \to -\infty$ and $t_f \to \infty$:

$$ W(J) = \int [d\phi(x)] \Big[ \frac{d\pi(x)}{2\pi} \Big] \exp\Big\{ i\int d^4x \Big[ \frac{1}{2}(\pi\dot{\phi} - \phi\dot{\pi}) - \mathcal{H}(x) + J(x)\phi(x) \Big] \Big\} $$

$$ = \int [d\phi(x)] \exp\Big\{ i\int d^4x\, [\mathcal{L}(x) + J(x)\phi(x)] \Big\} = \exp iZ(J), \tag{52} $$

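and $Z(J)$ is a functional partition function in quantum field theory. The n-point Green’s functions
containing only the connected graphs are given by

$$ G^{(n)}(x_1, \cdots, x_n) = \frac{(-i)^n}{W(J)} \frac{\delta^n W(J)}{\delta J(x_1) \cdots \delta J(x_n)}\Big|_{J=0} = (-i)^{n-1} \frac{\delta^n Z(J)}{\delta J(x_1) \cdots \delta J(x_n)}\Big|_{J=0}. \tag{53} $$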
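The way a generating functional encodes connected correlations can be seen in a zero-dimensional toy model, where the field is a single real variable and the functional integral is an ordinary integral. The sketch below assumes a Euclidean weight (so the factors of $i$ in the formulas above are absent) and arbitrary illustrative values for the mass term `m2` and the quartic coupling `lam`.

```python
import numpy as np

phi = np.linspace(-8.0, 8.0, 4001)       # integration grid for the single variable phi
dphi = phi[1] - phi[0]
m2, lam = 1.0, 0.3                       # assumed mass term and quartic coupling

def log_W(J):
    """log of W(J) = int dphi exp(-m2 phi^2/2 - lam phi^4 + J phi) (Euclidean toy)."""
    weight = np.exp(-0.5 * m2 * phi**2 - lam * phi**4 + J * phi)
    return np.log(np.sum(weight) * dphi)

# Connected two-point function from the second derivative of log W at J = 0,
# evaluated with a central finite difference.
h = 1e-3
G2_connected = (log_W(h) - 2 * log_W(0.0) + log_W(-h)) / h**2

# Direct evaluation of <phi^2> - <phi>^2 with the same weight, for comparison.
w0 = np.exp(-0.5 * m2 * phi**2 - lam * phi**4)
mean = np.sum(phi * w0) / np.sum(w0)
second = np.sum(phi**2 * w0) / np.sum(w0)
print(G2_connected, second - mean**2)
```

The two numbers agree to the accuracy of the finite difference: differentiating the logarithm of the generating function automatically subtracts the disconnected piece $\langle\phi\rangle^2$.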


Taking the stationary phase approximation of $W(J)$ naturally results in the classical equations
of motion [29, 30]. On the other hand, after integrating out the $\pi(x)$ field, the functional integrals
$W(J)$ and $Z(J)$ become covariant. All physical quantities in field theory are then, in principle,
derivable from $W(J)$ or $Z(J)$ in a covariant form, as is standard in textbooks. I will not
repeat that discussion here.

The above formulation is only for bosonic fields. Field theory that describes the real world 
must also involve fermion (matter) fields. To formulate a similar functional integral for fermionic 
fields, one needs to introduce fermion coherent states. Similarly, one may try to construct such a 
coherent state as an eigenstate of the fermion annihilation operator: 

$$c_i|\xi\rangle = \xi_i|\xi\rangle, \qquad \{c_i, c_j^\dagger\} = \delta_{ij}. \qquad (54)$$

However, since the fermion creation and annihilation operators satisfy anticommutation relations, the eigenvalue of the annihilation operator, which is its classical analogue, cannot be an ordinary number. The quantum-classical correspondence principle simply requires that the $\xi_i$ must anticommute:

$$\{\xi_i,\xi_j\} = \{\xi_i^*,\xi_j\} = \{\xi_i^*,\xi_j^*\} = 0, \qquad \xi_i^2 = 0. \qquad (55)$$

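Numbers that satisfy the above relations are called Grassmann numbers. Functions of Grassmann numbers are given by

$$f(\xi) = f_0 + \sum_i f_i\,\xi_i + \sum_{i<j} f_{ij}\,\xi_i\xi_j + \cdots, \qquad (56)$$

and the Grassmann integrals are defined as

$$\int d\xi_i = 0, \qquad \int d\xi_i\,\xi_i = 1. \qquad (57)$$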
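As a short worked illustration of these rules (a sketch, not taken from the original text), consider a single pair of Grassmann variables $\xi$, $\xi^*$. The ordering convention, in which the innermost integral is performed first, and the constant $a$ are assumptions made here for the example.

```latex
% Assumption: a single generator \xi with \xi^2 = 0, so every function is linear in \xi.
\begin{align*}
  f(\xi) &= f_0 + f_1\,\xi, \\
  \int d\xi\, f(\xi) &= f_0 \int d\xi + f_1 \int d\xi\,\xi = f_1 .
\end{align*}
% Assumption: iterated integrals are done from the inside out, and a is an ordinary number.
\begin{align*}
  \int d\xi^*\, d\xi\; e^{-a\,\xi^*\xi}
    &= \int d\xi^*\, d\xi\,\bigl(1 - a\,\xi^*\xi\bigr)
     = a \int d\xi^*\, d\xi\;\xi\,\xi^* \\
    &= a \int d\xi^*\,\Bigl(\int d\xi\,\xi\Bigr)\,\xi^*
     = a \int d\xi^*\,\xi^* = a .
\end{align*}
```

The Gaussian integral is proportional to $a$ rather than to $1/a$ as in the bosonic case, which is the basic reason fermionic functional integrals produce determinants rather than inverse determinants.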

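Using these properties, the fermionic coherent state can be written explicitly as

$$|\xi\rangle = \exp\Big\{-\tfrac12\sum_i \xi_i^*\xi_i\Big\}\exp\Big\{-\sum_i \xi_i c_i^\dagger\Big\}|0\rangle, \qquad (58)$$

where $|0\rangle$ is the fermion vacuum state: $c_i|0\rangle = 0$. For the fermionic coherent states, the resolution of identity can be written similarly as

$$\int\prod_i d\xi_i^*\,d\xi_i\,|\xi\rangle\langle\xi| = I. \qquad (59)$$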

Based on these properties of fermionic coherent states, the functional integral of fermion fields is rather easy to derive [31]. Consider a fermion field $\psi$ coupled to a scalar boson field $\phi$, with the Lagrangian

$$\mathcal L(\bar\psi,\psi,\phi) = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi + \tfrac12\big[(\partial\phi)^2 - m^2\phi^2\big] - g\,\phi\,\bar\psi\psi. \qquad (60)$$

Following the same procedure, one obtains the functional integral

$$W(\zeta^*,\zeta,J) = \int[d\bar\psi][d\psi][d\phi]\exp\Big\{i\int d^4x\,\big[\mathcal L(\bar\psi,\psi,\phi) + \zeta^*\psi + \bar\psi\zeta + J\phi\big]\Big\}, \qquad (61)$$

where the fermionic sources $\zeta^*$, $\zeta$ are also Grassmann numbers. In fact, Schwinger first introduced such a generating functional for fermion fields in order to derive the fermion field Green's functions [5].

These results can be extended to quantum electrodynamics and quantum chromodynamics, although the latter is more complicated because of the non-abelian gauge fields [32, 33]. In most textbooks, the discussion of coherent states in field theory is restricted to the content of the above formulation. From such a formulation one may derive almost everything about perturbative field theory, such as the Feynman rules, the perturbation expansion, and renormalization analysis. At this point, the functional integral of quantum field theory in terms of coherent states is actually nothing special. It is the standard formulation, which one can also obtain by other methods. If the field theory can be treated perturbatively, one can always solve the theory in one way or another, based on the developments of field theory in the last fifty years. The challenge we face in field theory today (or over the past three decades, since the theory of the strong interaction, namely quantum chromodynamics, was proposed) lies in the nonperturbative sector. There is no systematic approach in field theory that one can use to completely solve a nonperturbative problem, such as the vacuum structure of non-abelian gauge theory or the bound state problem in strongly coupled systems. In the next few sections, I will try to illustrate some specific problems to see whether the generalized coherent states developed later can play a useful role in the nonperturbative field theory of strongly interacting systems.





5 Squeezed Coherent States and Quantum Fluctuations 

The first example I shall discuss is how one may use squeezed states to study low-energy quantum fluctuations in strong interaction theory. Unlike the previous section, whose content may be found in textbooks, the formulation presented in this section has not yet been completed, and more work remains for further investigation.

The squeezed states are a generalization of photon coherent states. Again, squeezed states first attracted attention in quantum optics [35, 36]. In the early development, the principal potential applications of squeezed states were in the field of optical communications and in “quantum nondemolition experiments” designed for the detection of gravitational waves [37]. Later on, because of their capacity for treating quantum fluctuations, squeezed states have been used in various subjects, such as quantum measurement theory, quantum nonlinear dynamics, molecular dynamics, and dissipative quantum mechanics, as well as in quantum gravity and condensed matter physics.

In quantum optics, the uncertainty principle places a damper on the enthusiasm with which 
quantum engineers approach the problem of coding and transmitting information by optical means. 
Specifically, the quantum noise inherent in light beams places a limit on the information capacity
of an optical beam. Since the uncertainty principle is a statement about areas in phase space, 
noise levels in different quadratures are statements about intersections of uncertainty ellipses with 
these axes. Any procedure which can deform or squeeze the uncertainty circle to an ellipse can in 
principle be used for noise reduction in one of the quadratures. Such squeezing does not violate 
the uncertainty principle; rather, it places the larger uncertainty in a quadrature not involved in 
the information transmission process. A typical procedure for squeezing the error ellipse involves 
applying a classical source to drive two photon emission and absorption processes in much the same 
way that single photon processes can be used to generate a coherent state of the electromagnetic 
field. 

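For simplicity, one may consider a basic Hamiltonian describing two-photon processes in a single mode,

$$H = \omega\big(a^\dagger a + \tfrac12\big) + f_2(t)\,a^{\dagger 2} + f_2^*(t)\,a^2. \qquad (62)$$

Then, the squeezed state can be obtained by directly solving the time-dependent Schrödinger equation. If the initial state is the photon vacuum, a general solution is

$$|\beta\rangle = \exp\Big\{\tfrac12\big(\beta a^{\dagger 2} - \beta^* a^2\big)\Big\}|0\rangle\,e^{-i\varphi}. \qquad (63)$$

These are the squeezed states generated by Eq. (62). If one defines the photon's position and momentum coordinates $(x,p)$ in terms of the creation and annihilation operators as in the previous section, one then finds that

$$\Delta x^2 = \tfrac12\Big|\cosh|\beta| + \frac{\beta}{|\beta|}\sinh|\beta|\Big|^2, \qquad \Delta p^2 = \tfrac12\Big|\cosh|\beta| - \frac{\beta}{|\beta|}\sinh|\beta|\Big|^2, \qquad (64)$$

so that $\Delta x \neq \Delta p$ but $\Delta x\,\Delta p \geq \tfrac12$ (here we set $\hbar = 1$), whereas the vacuum has a circular uncertainty region ($\Delta x = \Delta p$) in phase space. This shows that the operator $D_{\rm sq}(\beta) = \exp\{\tfrac12(\beta a^{\dagger 2} - \beta^* a^2)\}$ squeezes the uncertainty circle of a wave packet into an ellipse, so that quantum fluctuation (noise) can be reduced in one of the quadratures.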
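The squeezing of the uncertainty circle can be checked numerically. The following is a minimal sketch, assuming the conventions $x = (a + a^\dagger)/\sqrt2$, $p = (a - a^\dagger)/(i\sqrt2)$ and a squeezing operator of the form $\exp\{\tfrac12(\beta a^{\dagger2} - \beta^* a^2)\}$; the function name `squeezed_vacuum_variances` and the Fock-space cutoff `dim` are hypothetical choices made for illustration.

```python
import numpy as np

def squeezed_vacuum_variances(beta, dim=80):
    """Return (var_x, var_p) of exp{(beta a^dag^2 - beta^* a^2)/2}|0> in a truncated Fock space.

    Assumptions (not from the text): x = (a + a^dag)/sqrt(2), p = (a - a^dag)/(i sqrt(2)),
    hbar = 1, and a cutoff `dim` large enough that the truncation error is negligible.
    """
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)  # a|n> = sqrt(n)|n-1>
    ad = a.conj().T
    K = 0.5 * (beta * ad @ ad - np.conj(beta) * a @ a)            # anti-Hermitian generator
    # exp(K) via the eigen-decomposition of the Hermitian matrix i*K
    lam, V = np.linalg.eigh(1j * K)
    S = V @ np.diag(np.exp(-1j * lam)) @ V.conj().T
    vac = np.zeros(dim, dtype=complex)
    vac[0] = 1.0
    psi = S @ vac

    x = (a + ad) / np.sqrt(2)
    p = (a - ad) / (1j * np.sqrt(2))

    def variance(op):
        mean = np.vdot(psi, op @ psi).real
        return np.vdot(psi, op @ op @ psi).real - mean**2

    return variance(x), variance(p)

if __name__ == "__main__":
    var_x, var_p = squeezed_vacuum_variances(0.5)
    print(var_x, var_p, np.sqrt(var_x * var_p))
```

For real positive `beta` the two variances are stretched and compressed by reciprocal factors, so their product stays at the minimum value; for complex `beta` the product exceeds it, which corresponds to a rotated ellipse.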

In general, it is desirable to squeeze a field coherent state, which can be generated by

$$H = \omega\big(a^\dagger a + \tfrac12\big) + f_2(t)\,a^{\dagger 2} + f_2^*(t)\,a^2 + f_1(t)\,a^\dagger + f_1^*(t)\,a. \qquad (65)$$

The sequence in which the processes of coherent state formation and squeezing occur is governed by the time dependence of the functions $f_2(t)$ and $f_1(t)$. The general form of the state at time $t$ can be expressed (apart from a phase factor) by

$$|z\beta\rangle = \exp\big(z a^\dagger - z^* a\big)\exp\Big\{\tfrac12\big(\beta a^{\dagger 2} - \beta^* a^2\big)\Big\}|0\rangle, \qquad (66)$$

where the complex variables $z$ and $\beta$ are in general functions of the time $t$. The state (66) is usually called a squeezed coherent state. The physical process behind the squeezed coherent states can be understood as follows: one first squeezes the vacuum (the wave packet) by two-photon excitations, and then displaces it as a photon coherent state by the external source.

Using group theory, one can show that the squeezed coherent states must also form an overcomplete set of states,

$$\int d\mu(x,p,f,g)\,|z\beta\rangle\langle z\beta| = I, \qquad (67)$$

where the variables $f$ and $g$ were introduced by Jackiw and Kerman [38] to characterize the quantum fluctuations (noise) of the position and momentum:

$$\Delta x^2 = f, \qquad \Delta p^2 = \frac{1}{4f} + 4fg^2. \qquad (68)$$

These two variables are related to the squeezing parameter $\beta$ by (64). By completeness, one can also derive a path integral of quantum mechanics in the squeezed coherent state representation:

$$G(t_f,t_0) = \int[d\mu(x,p,f,g)]\,\exp\Big\{i\int_{t_0}^{t_f}dt\,\Big[\tfrac12(p\dot x - x\dot p) - f\dot g - H_{\rm eff}\Big]\Big\}, \qquad (69)$$

where the effective Hamiltonian is the matrix element of the Hamiltonian operator [$H = p^2/2 + V(x)$] in the squeezed coherent state:

$$H_{\rm eff}(x,p,f,g) = \tfrac12 p^2 + \frac{1}{8f} + 2fg^2 + \exp\Big(\frac{f}{2}\frac{\partial^2}{\partial x^2}\Big)V(x). \qquad (70)$$

The path integral in squeezed coherent states shows that $f$ and $g$, which characterize quantum fluctuations, become a pair of conjugate variables. Extremizing the exponent in the path integral leads to the following generalized equations of motion:

$$\dot x = \frac{\partial H_{\rm eff}}{\partial p}, \qquad \dot p = -\frac{\partial H_{\rm eff}}{\partial x}, \qquad \dot f = \frac{\partial H_{\rm eff}}{\partial g}, \qquad \dot g = -\frac{\partial H_{\rm eff}}{\partial f}. \qquad (71)$$

Physically, the equations of motion for $(x,p)$ determine the time evolution of the center of the wave packet, and those for $(f,g)$ characterize the time evolution of the quantum fluctuations (quadratures). Therefore, the variables $(f,g)$ describe the squeezing and spreading of the quadratures in time, which provides a classical-like dynamical theory for the control of quantum noise and signal.
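To make the generalized equations of motion concrete, here is a small numerical sketch for a harmonic potential. It assumes $V(x) = \tfrac12\omega^2x^2$ and an effective Hamiltonian of the Gaussian (Jackiw-Kerman) form $H_{\rm eff} = \tfrac12p^2 + 1/(8f) + 2fg^2 + \tfrac12\omega^2(x^2 + f)$; the integrator, the function names and the initial data are illustrative choices, not part of the original text.

```python
import numpy as np

def rhs(state, omega):
    """Right-hand side of the generalized Hamilton equations for (x, p, f, g).

    Assumed effective Hamiltonian (harmonic potential, hbar = 1):
        H_eff = p^2/2 + 1/(8 f) + 2 f g^2 + omega^2 (x^2 + f)/2
    """
    x, p, f, g = state
    return np.array([
        p,                                                   # dx/dt =  dH/dp
        -omega**2 * x,                                       # dp/dt = -dH/dx
        4.0 * f * g,                                         # df/dt =  dH/dg
        1.0 / (8.0 * f**2) - 2.0 * g**2 - 0.5 * omega**2,    # dg/dt = -dH/df
    ])

def h_eff(state, omega):
    x, p, f, g = state
    return 0.5 * p**2 + 1.0 / (8.0 * f) + 2.0 * f * g**2 + 0.5 * omega**2 * (x**2 + f)

def evolve(state, omega, t_final, steps):
    """Integrate the equations of motion with a fixed-step fourth-order Runge-Kutta scheme."""
    dt = t_final / steps
    s = np.array(state, dtype=float)
    for _ in range(steps):
        k1 = rhs(s, omega)
        k2 = rhs(s + 0.5 * dt * k1, omega)
        k3 = rhs(s + 0.5 * dt * k2, omega)
        k4 = rhs(s + dt * k3, omega)
        s = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return s

if __name__ == "__main__":
    start = [1.0, 0.0, 1.0, 0.0]          # wave packet broader than the ground state
    end = evolve(start, omega=1.0, t_final=np.pi / 2, steps=2000)
    print(end, h_eff(start, 1.0), h_eff(end, 1.0))
```

In this sketch the centre $(x,p)$ follows the classical orbit at frequency $\omega$, while the width $f$ breathes at $2\omega$ around the ground-state value $1/(2\omega)$, at which $f$ and $g$ are stationary.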

It is worth pointing out that although the concept of squeezed states first attracted attention in quantum optics, the squeezed state itself was introduced much earlier by Valatin and Butler in the study of superfluidity [4, 39]. Similar to the BCS state of superconductivity (which I will discuss later), Valatin's superfluid ground state is defined as

$$|\{z_k\beta_k\}\rangle = \exp\Big\{\sum_k\big(z_k a_k^\dagger - z_k^* a_k\big)\Big\}\exp\Big\{\sum_k\tfrac12\big(\beta_k a_k^\dagger a_{-k}^\dagger - \beta_k^* a_k a_{-k}\big)\Big\}|0\rangle$$

$$= \exp\Big\{\sum_k\big(z_k a_k^\dagger - z_k^* a_k\big)\Big\}|\{\beta_k\}\rangle, \qquad (72)$$





which is the standard form of squeezed coherent states currently in use. In the many-body picture, such states have two consequences. First, the squeezing operator acting on the trivial vacuum generates a canonical transformation to quasiparticles:

$$\alpha_k = \cosh|\beta_k|\,a_k - \frac{\beta_k}{|\beta_k|}\sinh|\beta_k|\,a_{-k}^\dagger, \qquad \alpha_k|\{\beta_k\}\rangle = 0. \qquad (73)$$

Second, the quasiparticle vacuum is then used to generate a bosonic coherent state. With such a state, one may develop a microscopic theory of superfluid helium, in which the normal and superfluid states become a direct analogy of noise and signal in partially coherent radiation fields.

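We now use the squeezed coherent state to formulate a possible theory that may be useful in addressing the low-energy quantum fluctuations in field theory. Let us consider again the neutral scalar field theory as an example. One can define the squeezed coherent state of the field $\phi$ as [40]

$$|\Psi\rangle = N\exp\Big\{i\int d^3x\,\big[\pi(x)\phi_{\rm op}(x) - \phi(x)\pi_{\rm op}(x)\big]\Big\}\times\exp\Big\{\int d^3x\,d^3y\,\big[\phi_{\rm op}(x)D(x,y)\phi_{\rm op}(y)\big]\Big\}|0\rangle, \qquad (74)$$

where $N$ is a normalization constant. This squeezed coherent state is also defined at a given instant of time, so that $t = t_x = t_y$. One can show that

$$\langle\Psi|\phi_{\rm op}(x)|\Psi\rangle = \phi(x), \qquad \langle\Psi|\pi_{\rm op}(x)|\Psi\rangle = \pi(x),$$

$$\langle\Psi|\phi_{\rm op}(x)\phi_{\rm op}(y)|\Psi\rangle = \phi(x)\phi(y) + \Phi(x,y),$$

$$\langle\Psi|\pi_{\rm op}(x)\pi_{\rm op}(y)|\Psi\rangle = \pi(x)\pi(y) + \tfrac14\Phi^{-1}(x,y) + 4\int d^3x'\,d^3y'\,\Pi(x,x')\Phi(x',y')\Pi(y',y), \qquad (75)$$

where

$$D(x,y) = \tfrac14\big[\Phi_0^{-1}(x,y) - \Phi^{-1}(x,y)\big] + i\Pi(x,y), \qquad \Phi_0(x,y) = \langle 0|\phi_{\rm op}(x)\phi_{\rm op}(y)|0\rangle. \qquad (76)$$

The squeezed coherent states of the bosonic field are also overcomplete, namely

$$\int[d\phi(x)]\Big[\frac{d\pi(x)}{2\pi}\Big][d\Phi(x,y)]\Big[\frac{d\Pi(x,y)}{2\pi}\Big]\,|\Psi\rangle\langle\Psi| = I. \qquad (77)$$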

Following a procedure similar to that of the previous section, one can derive the functional integral $W(J)$ in the squeezed coherent state representation:

$$W(J) = \int[d\phi(x)]\Big[\frac{d\pi(x)}{2\pi}\Big][d\Phi(x,y)]\Big[\frac{d\Pi(x,y)}{2\pi}\Big]\times\exp\Big\{i\int d^4x\,\Big[\tfrac12(\pi\dot\phi - \phi\dot\pi) - A(x,t) - \mathcal H_{\rm eff}(x) + J(x)\phi(x)\Big]\Big\}, \qquad (78)$$

where $\mathcal H_{\rm eff}(x)$ is the matrix element of the Hamiltonian density in the squeezed coherent state, and $A(x,t) = \cdots\,[\Phi(x,y) - \Phi_0(x,y)]$. Here I have not integrated out the conjugate momenta $\{\pi(x), \Pi(x,y)\}$ to obtain a covariant functional integral. The physical picture of this new formulation is that, besides the original field variables $\phi(x)$, $\pi(x)$ of the Hamiltonian formulation, the quantum fluctuations, characterized by $\Phi(x,y)$ and $\Pi(x,y)$, are introduced as new dynamical field variables. These new dynamical field variables describe the low-energy excitations (i.e. the composite particles) of strongly interacting systems. Similarly, taking the extremum of the exponent in (78), one finds the equation of motion that determines the classical-like solution $\phi_0$:

$$(\nabla^2 - m^2)\phi_0 + \exp\Big(\tfrac12\Phi\,\frac{\delta^2}{\delta\phi_0^2}\Big)V'(\phi_0) = 0, \qquad (80)$$

which is coupled to the composite field $\Phi$. The equation of motion for the quantum fluctuations described by $\Phi$ is much more complicated.

The usefulness of the squeezed state functional integral is that one can derive an effective theory for the low-energy composite particle fields coupled to the original fields. Here I propose a procedure for developing such an effective theory. First, one determines the “classical” ground state by minimizing the effective Hamiltonian with respect to the variables $\phi$, $\pi$ as well as $\Phi$, $\Pi$, which results in $\phi_0$ ($\pi_0 = 0$) and $\Phi_0$ ($\Pi_0 = 0$). Then, one expands the effective Lagrangian near $(\phi_0, \pi_0, \Phi_0, \Pi_0)$ up to second order, namely keeping only the quadratic terms in $(\delta\phi, \delta\pi, \delta\Phi, \delta\Pi)$. Quantum effects become the time-dependent fluctuations about the classical ground state. The corresponding linearized equations of motion determine the dispersions of the quasiparticles and the composite particles, denoted by $\omega_k$ and $\gamma_k$, respectively. For strongly interacting fields, usually $\omega_k > \gamma_k$ (due to spontaneous symmetry breaking). Thus, at the low energy scale ($\omega_k > \mu > \gamma_k$), the composite particles and the quasiparticles are decoupled. Only the composite particles are kept, with the higher-order corrections treated as a perturbation, to form the low-energy effective Lagrangian. At the intermediate energy scale ($\mu \sim \omega_k$), both the composite particles and the quasiparticles become active and are coupled to each other. The effective theory is then determined by the Lagrangian of the composite particles coupled to the quasiparticle degrees of freedom. At a rather high energy scale ($\mu \gg \omega_k$), $\Phi$ and $\Pi$ should spread evenly over the entire space-time, such that only the original field variables remain. Thus, the theory returns to the original one in the high-energy region.

The above procedure is different from the conventional procedure for constructing a low-energy effective theory. In the conventional approach, the low-energy effective theory is constructed by separating the field variables into low-energy and high-energy parts, and then using the functional integral discussed in the previous section to integrate out the high-energy part. The resulting Lagrangian is an effective Lagrangian for the low-energy physics. The advantage of the conventional approach is that one can use the powerful renormalization group analysis of Wilson [41] to extract the universal scaling properties contained in the theory, without explicitly solving the theory itself. However, in reality, the physical degrees of freedom can be very different at different energy scales. A typical example is the strong interaction, in which the degrees of freedom are quarks and gluons in the high-energy region, whereas in the low-energy region the degrees of freedom become hadrons, which are composite particles of quarks and gluons. Thus, in the conventional approach, the effective theory cannot capture the right physical degrees of freedom. Therefore, beyond critical phenomena, the conventional approach may be limited in its applications. The squeezed coherent state formulation of the functional integral may provide a new method for the development of effective field theory, although at this point a lot of work remains for further investigation.

A potential application of the squeezed coherent states in field theory is Yang-Mills gauge theory, especially the color SU(3) gauge theory of quantum chromodynamics. Of course, the situation in quantum chromodynamics is much more complicated. Because of the nonlinear properties of non-abelian gauge theory, the conventional functional integral is already quite complicated. However, the conventional functional integral in non-abelian gauge theory is only useful for the derivation of covariant Feynman rules and the analysis of renormalizability. In other words, it is only useful for perturbative calculations. As is well known, the difficulty of QCD lies in its nonperturbative domain, where quantum fluctuations must be strong. Furthermore, the field strength of a non-abelian gauge field contains single as well as double gauge boson emissions and absorptions:

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g f^{abc}A^b_\mu A^c_\nu. \qquad (81)$$

The squeezed coherent states may be the natural quantum states for describing non-abelian gauge fields. If one can complete the above formulation and extend it to non-abelian gauge theory, then it may be possible to develop a low-energy effective theory for non-abelian gauge fields. I believe that such a low-energy effective theory should be capable of dealing with gluon condensation and glueball states as well as low-energy gluon dynamics.


6 Spin Coherent States and Non-Linear Sigma Model 

So far, I have only discussed bosonic-type coherent states in field theory. However, coherent states, in the language of group theory, are embedded in a topologically nontrivial geometrical space, which has deep implications in physics [13]. The simplest coherent state carrying a topologically nontrivial geometrical space is the spin coherent state. The most attractive property of spin coherent states is that their topological structure naturally induces Dirac's magnetic monopole [34]. Meanwhile, the spin coherent state representation of the path integral for a multi-spin system gives a realistic realization of the Non-Linear Sigma Model, which is an important field theory model in condensed matter physics and particle physics. Thus, before I discuss the general physical implications contained in the geometrical structure of coherent states, it may be useful to first illustrate the spin coherent state in detail.

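Let us start with a simple example: a spin-1/2 particle in a varying magnetic field $\mathbf B(t) = (B_x(t), B_y(t), B_z(t))$, described by the Hamiltonian

$$H(t) = -\mu\,\mathbf S\cdot\mathbf B(t). \qquad (82)$$

Here $\mu$ is the particle's magnetic moment and $\mathbf S = (S_x, S_y, S_z)$ is the spin operator, which satisfies the usual angular momentum commutation relations. The evolution of the system governed by (82) is determined by the Schrödinger equation, whose general solution can be written as

$$|\psi(t)\rangle = \alpha(t)|\!\uparrow\rangle + \beta(t)|\!\downarrow\rangle. \qquad (83)$$

Substituting (83) into the Schrödinger equation, it is easy to determine the time dependence of the parameters $\alpha(t)$ and $\beta(t)$. However, in order to derive the spin coherent state, I concentrate here only on the structure of the state (83). The normalization of (83) results in a constraint on the parameters $\alpha(t)$ and $\beta(t)$:

$$|\alpha(t)|^2 + |\beta(t)|^2 = 1. \qquad (84)$$

If I parameterize $\alpha = \sin\frac{\theta}{2}\,e^{-i\varphi}e^{i\chi}$ and $\beta = \cos\frac{\theta}{2}\,e^{i\chi}$, Eq. (83) can be expressed as

$$|\psi(t)\rangle = \Big(\cos\frac{\theta(t)}{2} + \sin\frac{\theta(t)}{2}\,e^{-i\varphi(t)}S_+\Big)|\!\downarrow\rangle\,e^{i\chi(t)}, \qquad (85)$$

where the raising and lowering spin operators are defined by $S_\pm = S_x \pm iS_y$, with $S_+|\!\downarrow\rangle = |\!\uparrow\rangle$ and $S_-|\!\uparrow\rangle = |\!\downarrow\rangle$. Furthermore, one can easily show that

$$\Big(\cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,e^{-i\varphi}S_+\Big)|\!\downarrow\rangle = \exp\Big\{\frac{\theta}{2}\big(e^{-i\varphi}S_+ - e^{i\varphi}S_-\big)\Big\}|\!\downarrow\rangle \equiv |\theta\varphi\rangle. \qquad (86)$$

Then, (85) can be simply expressed as

$$|\psi(t)\rangle = |\theta(t)\varphi(t)\rangle\,e^{i\chi(t)}. \qquad (87)$$

The state $|\theta\varphi\rangle$ is the standard expression for the spin coherent states.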
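The equivalence between the exponential form and the closed form of the spin-1/2 coherent state can be checked with a few lines of NumPy. This is a minimal sketch, assuming the basis ordering $(|\!\uparrow\rangle, |\!\downarrow\rangle)$ and a rotation operator of the form $\exp\{\frac{\theta}{2}(e^{-i\varphi}S_+ - e^{i\varphi}S_-)\}$ acting on $|\!\downarrow\rangle$; the function names are hypothetical.

```python
import numpy as np

# Basis ordering (an assumption of this sketch): index 0 = |up>, index 1 = |down>.
S_PLUS = np.array([[0, 1], [0, 0]], dtype=complex)    # S+|down> = |up>
S_MINUS = S_PLUS.conj().T                             # S-|up> = |down>
SX = 0.5 * (S_PLUS + S_MINUS)
SY = (S_PLUS - S_MINUS) / 2j
SZ = np.diag([0.5, -0.5]).astype(complex)

def spin_coherent_state(theta, phi):
    """Apply exp{(theta/2)(e^{-i phi} S+ - e^{i phi} S-)} to |down>."""
    K = 0.5 * theta * (np.exp(-1j * phi) * S_PLUS - np.exp(1j * phi) * S_MINUS)
    lam, V = np.linalg.eigh(1j * K)                    # i*K is Hermitian
    U = V @ np.diag(np.exp(-1j * lam)) @ V.conj().T    # U = exp(K), a unitary rotation
    down = np.array([0.0, 1.0], dtype=complex)
    return U @ down

def closed_form(theta, phi):
    """(cos(theta/2) + sin(theta/2) e^{-i phi} S+)|down> written out in components."""
    return np.array([np.sin(theta / 2) * np.exp(-1j * phi), np.cos(theta / 2)])

def spin_expectation(psi):
    return np.array([np.vdot(psi, op @ psi).real for op in (SX, SY, SZ)])

if __name__ == "__main__":
    psi = spin_coherent_state(0.7, 1.3)
    print(np.allclose(psi, closed_form(0.7, 1.3)))
    print(spin_expectation(psi))   # a vector of length 1/2
```

The expectation value of the spin operator in such a state is a vector of fixed length $1/2$ whose direction is set by $(\theta,\varphi)$, which is the sense in which each coherent state corresponds to a point on a sphere.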





As I mentioned, a very important property of the spin coherent state is that $|\theta\varphi\rangle$ is embedded in a topologically nontrivial geometrical space, i.e., a two-dimensional sphere $S^2$. This can be verified directly from (85). Since $S_i = \sigma_i/2$, where the $\sigma_i$ are the Pauli matrices,

$$D_s(\eta) = \exp\{\eta S_+ - \eta^* S_-\} = \exp\begin{pmatrix} 0 & \eta \\ -\eta^* & 0 \end{pmatrix} = \begin{pmatrix} \cos|\eta| & \frac{\eta}{|\eta|}\sin|\eta| \\ -\frac{\eta^*}{|\eta|}\sin|\eta| & \cos|\eta| \end{pmatrix} \equiv \begin{pmatrix} x_0 & x \\ -x^* & x_0 \end{pmatrix}, \qquad (88)$$

and $x_0$ is real while $x = x_1 + ix_2$. Also, $D_s(\eta)$ is a unitary operator, which leads to

$$x_0^2 + x_1^2 + x_2^2 = 1. \qquad (89)$$

In other words, the parameter space of $D_s(\eta)$ is a two-sphere $S^2$ with $\eta = \frac{\theta}{2}e^{-i\varphi}$. Therefore, the spin coherent states are in one-to-one correspondence with the points on $S^2$, except for the north pole, where the correspondence is ambiguous since all values of $\varphi$ correspond to the same point.

However, in defining the above spin coherent states, there is an ambiguity. For example, one can also define the spin coherent states as

$$|\theta\varphi\rangle' = \Big(\cos\frac{\theta}{2}\,e^{i\varphi} + \sin\frac{\theta}{2}\,S_+\Big)|\!\downarrow\rangle, \qquad (90)$$

which are also in one-to-one correspondence with the points on $S^2$, except for the south pole. These two spin coherent states are related simply by a phase factor,

$$|\theta\varphi\rangle' = e^{i\varphi}|\theta\varphi\rangle. \qquad (91)$$

Geometrically, these two coherent states define the two “patches” of $S^2$. Since these two states differ only by a phase factor, they must be quantum mechanically equivalent. This implies that there is a gauge degree of freedom in the spin coherent states.

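To see clearly the physical implication of the topological structure of spin coherent states, one can construct the path integral of a quantum spin system. The spin coherent states also obey the overcompleteness relation

$$\int d\mu(\theta\varphi)\,|\theta\varphi\rangle\langle\theta\varphi| = I, \qquad (92)$$

where $d\mu(\theta\varphi) = \sin\theta\,d\theta\,d\varphi/2\pi$ is an invariant measure on $S^2$. Then, it is easy to show that the path integral of quantum mechanics for $H = H(\mathbf S)$ is given by

$$G(t_f,t_0) = \int[d\mu(\theta\varphi)]\exp\Big\{i\int_{t_0}^{t_f}dt\,\Big[\langle\theta\varphi|i\frac{d}{dt}|\theta\varphi\rangle - \langle\theta\varphi|H(\mathbf S)|\theta\varphi\rangle\Big]\Big\}. \qquad (93)$$

In this path integral, the first term in the exponent,

$$\omega[\theta\varphi] = \int_{t_0}^{t_f}dt\,\langle\theta\varphi|i\frac{d}{dt}|\theta\varphi\rangle = \int_{\varphi_0}^{\varphi_f}\tfrac12(1-\cos\theta)\,d\varphi, \qquad (94)$$

is purely geometric: it depends only on the trajectory over the sphere, not on its explicit time dependence. For a closed path, $\omega[\theta\varphi]$ is actually a Berry phase of the spin history [43]. Therefore, $\omega[\theta\varphi]$ is a gauge invariant one-form defined on the sphere $S^2$:

$$\omega[\theta\varphi] = \int_{\varphi_0}^{\varphi_f}\mathbf A(\mathbf n)\cdot d\mathbf n, \qquad (95)$$

where $\mathbf n = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$ is a unit vector, and $\mathbf A(\mathbf n)$ is a unit vector potential. Comparing (94) with (95), one can find that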



$$\mathbf A = \frac{1-\cos\theta}{2\sin\theta}\,\hat{\boldsymbol\varphi}. \qquad (96)$$

This vector potential has one singularity, at the south pole. It is at this singularity that the Dirac string, which carries the magnetic monopole flux, enters the sphere. Hence, $\mathbf A$ is nothing but a gauge potential of Dirac's magnetic monopole. Similarly, for the spin coherent state $|\theta\varphi\rangle'$, the corresponding gauge potential is

$$\mathbf A' = -\frac{1+\cos\theta}{2\sin\theta}\,\hat{\boldsymbol\varphi}. \qquad (97)$$

$\mathbf A$ and $\mathbf A'$ define the two non-singular patches of the monopole section. Their difference is a pure U(1) gauge in the overlapping equatorial region,

$$\mathbf A\cdot d\mathbf n = \mathbf A'\cdot d\mathbf n + d\varphi = \mathbf A'\cdot d\mathbf n - i\,g^{-1}dg, \qquad (98)$$

where $g = e^{i\varphi} \in U(1)$.
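The geometric term can also be obtained directly from overlaps of neighbouring coherent states, in the same way the time-sliced path integral is built. The sketch below assumes the spin-1/2 state $\sin\frac{\theta}{2}e^{-i\varphi}|\!\uparrow\rangle + \cos\frac{\theta}{2}|\!\downarrow\rangle$ and a closed loop at fixed $\theta$ with $\varphi$ running from $0$ to $2\pi$; it compares the accumulated overlap phase with $\oint\frac12(1-\cos\theta)\,d\varphi = \pi(1-\cos\theta)$. The function names and the number of slices are illustrative.

```python
import numpy as np

def spin_half_states(theta, phis):
    """Columns are the states sin(theta/2) e^{-i phi}|up> + cos(theta/2)|down> for each phi."""
    up = np.sin(theta / 2) * np.exp(-1j * phis)
    down = np.cos(theta / 2) * np.ones_like(phis, dtype=complex)
    return np.vstack([up, down])

def geometric_phase(theta, steps=20000):
    """Sum the phases of the overlaps <psi_{k+1}|psi_k> around the loop phi: 0 -> 2 pi.

    Each overlap is approximately exp(i * dt * <psi| i d/dt |psi>), so the summed phase
    approximates the time integral of <psi| i d/dt |psi> along the path.
    """
    phis = np.linspace(0.0, 2.0 * np.pi, steps + 1)
    states = spin_half_states(theta, phis)
    overlaps = np.sum(np.conj(states[:, 1:]) * states[:, :-1], axis=0)
    return float(np.sum(np.angle(overlaps)))

if __name__ == "__main__":
    theta = 1.0
    print(geometric_phase(theta), np.pi * (1.0 - np.cos(theta)))
```

The result depends only on the loop traced on the sphere (here through $\theta$), not on how fast the loop is traversed, and equals half the solid angle enclosed by the loop as measured from the pole $\theta = 0$.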

The existence of the above gauge degrees of freedom can be understood clearly by looking at the general definition of coherent states based on group theory [9, 10]. In group theory, the quantum states of a spin system form a unitary representation $V^s$ of the SU(2) group; here $s$ is an arbitrary spin. Choosing a fixed state, such as the lowest-weight state $|\phi_0\rangle = |s,-s\rangle \in V^s$, one can define spin coherent states as

$$|g\rangle_s = g|s,-s\rangle, \qquad g \in SU(2). \qquad (99)$$

In general, $g = \exp(i\alpha S_x)\exp(i\beta S_y)\exp(i\gamma S_z) = \exp\big(\frac{\theta}{2}e^{-i\varphi}S_+ - \frac{\theta}{2}e^{i\varphi}S_-\big)\exp(i\chi S_z)$. Note that this decomposition, called the Baker-Campbell-Hausdorff formula, is unique. As a result, one can rewrite the above spin coherent states as

$$|g\rangle_s = \exp\Big(\frac{\theta}{2}e^{-i\varphi}S_+ - \frac{\theta}{2}e^{i\varphi}S_-\Big)\exp(i\chi S_z)|s,-s\rangle = |\theta\varphi\rangle\,e^{-is\chi}, \qquad (100)$$

where

$$|\theta\varphi\rangle = \exp\Big(\frac{\theta}{2}e^{-i\varphi}S_+ - \frac{\theta}{2}e^{i\varphi}S_-\Big)|s,-s\rangle \qquad (101)$$

is the standard definition of the spin coherent state for an arbitrary spin $s$ [42]. The case $s = 1/2$ discussed above is a special case. As one can see, apart from a phase factor, the spin coherent states can be generated by a unitary spin rotation operator acting on the fixed state $|s,-s\rangle$. The unitary operator $\exp\big(\frac{\theta}{2}e^{-i\varphi}S_+ - \frac{\theta}{2}e^{i\varphi}S_-\big)$ is the coset representation of the space $SU(2)/U(1) \simeq S^2$.

Therefore the sphere $S^2$ determines the topological structure of the spin coherent states. The magnetic monopole potential $\mathbf A$ defines a U(1) fibre bundle over this sphere $S^2$. Meanwhile, the spin coherent states contain an arbitrary phase $\chi$. In quantum mechanics, a quantum state is specified up to a phase factor; namely, physics is invariant under different choices of such a phase factor, so that this phase factor is usually ignored. However, when quantum states are embedded in a topologically nontrivial space, this phase freedom is precisely the gauge degree of freedom associated with the fibre bundle over the space. In spin coherent states, the phase $\chi$ is just the gauge degree of freedom that connects different choices of magnetic monopole potential. Ignoring (or fixing) this phase factor corresponds to a gauge fixing.

Furthermore, the topological properties of the spin coherent states also play an important role in the study of spin dynamics. A typical example is the Heisenberg model in condensed matter physics. The Heisenberg model is used to understand the quantum magnetism of strongly correlated electron systems. The model Hamiltonian considered here is very simple:

$$H = J\sum_{\langle i,j\rangle}\mathbf S_i\cdot\mathbf S_j, \qquad (102)$$

which describes a many-spin system (each of spin $s$) with nearest-neighbor exchange interaction. “Classically”, the ground state of the above Hamiltonian is easily determined. When $J > 0$, the minimum energy is given by the state in which nearest-neighbor spins are always anti-aligned. These states are called Néel states in the literature. Correspondingly, the system is an antiferromagnet. If $J < 0$, the ground state is simply given by the state with all spins aligned in the same direction, which is a ferromagnetic state. These consequences can be obtained explicitly by taking the spin coherent state

$$|\{\theta_i\varphi_i\}\rangle = \prod_i\exp\Big(\frac{\theta_i}{2}e^{-i\varphi_i}S_i^+ - \frac{\theta_i}{2}e^{i\varphi_i}S_i^-\Big)|s,-s\rangle \qquad (103)$$

as a trial wave function and minimizing the model Hamiltonian:

$$\delta\langle\{\theta_i\varphi_i\}|H|\{\theta_i\varphi_i\}\rangle = \delta\Big(Js^2\sum_{\langle ij\rangle}\big[\cos\theta_i\cos\theta_j + \sin\theta_i\sin\theta_j\cos(\varphi_i - \varphi_j)\big]\Big) = 0. \qquad (104)$$

The resulting ground state is given by

$$\begin{cases} J > 0 \;\to\; \theta_{i+1} = \pi - \theta_i,\ \varphi_{i+1} = \varphi_i + \pi & \text{antiferromagnet} \\ J < 0 \;\to\; \theta_{i+1} = \theta_i,\ \varphi_{i+1} = \varphi_i & \text{ferromagnet} \end{cases} \qquad (105)$$

An important concept one can obtain from the above result is that the ground state spontaneously breaks the global spin rotational symmetry. As one can check, the model Hamiltonian is invariant under global spin rotations $T = \exp(i\boldsymbol\alpha\cdot\mathbf S)$, where $\mathbf S = \sum_i\mathbf S_i$, and the ground state energy does not change when all the spins in (105) are globally rotated. This leads to an SO(3) degeneracy of the ground states; namely, these ground states have a lower symmetry than the Hamiltonian. Such a situation is called spontaneous symmetry breaking. Quantum mechanically, it leads to gapless spin-wave excitations in the Heisenberg model, and to Goldstone bosons in general.
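The classical energy functional and its two kinds of minima can be illustrated numerically. The following sketch assumes a closed chain (periodic boundary conditions) with an even number of sites, so that the Néel pattern fits on the lattice; the function name `classical_energy` and the chosen parameters are hypothetical.

```python
import numpy as np

def classical_energy(theta, phi, J, s=0.5):
    """J s^2 sum over nearest-neighbour bonds of n_i . n_j on a closed chain.

    theta, phi: arrays of polar and azimuthal angles, one entry per site.
    The bond term cos(t_i)cos(t_j) + sin(t_i)sin(t_j)cos(p_i - p_j) is the dot
    product of the two unit vectors. Periodic boundaries are an assumption here.
    """
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    theta_next, phi_next = np.roll(theta, -1), np.roll(phi, -1)
    bonds = (np.cos(theta) * np.cos(theta_next)
             + np.sin(theta) * np.sin(theta_next) * np.cos(phi - phi_next))
    return J * s**2 * bonds.sum()

def neel_configuration(n_sites, theta0, phi0):
    """Alternate (theta0, phi0) with (pi - theta0, phi0 + pi) along the chain."""
    theta = np.where(np.arange(n_sites) % 2 == 0, theta0, np.pi - theta0)
    phi = np.where(np.arange(n_sites) % 2 == 0, phi0, phi0 + np.pi)
    return theta, phi

if __name__ == "__main__":
    n = 8
    t_neel, p_neel = neel_configuration(n, 0.9, 0.4)
    t_ferro, p_ferro = np.full(n, 0.9), np.full(n, 0.4)
    print(classical_energy(t_neel, p_neel, J=1.0))     # antiferromagnetic coupling
    print(classical_energy(t_ferro, p_ferro, J=-1.0))  # ferromagnetic coupling
```

The energies do not depend on the common direction `(theta0, phi0)` chosen for the pattern, which is the classical counterpart of the ground-state degeneracy under global rotations.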

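The quantum dynamics of interacting spins can be studied from the time evolution of the system at zero temperature. The time evolution is determined by the Green's function, which is defined by the matrix element of the evolution operator between two spin coherent states:

$$G(t_f,t_0) = \int\prod_i[d\mu(\theta_i\varphi_i)]\exp\big\{iS[\theta_i(t),\varphi_i(t)]\big\}, \qquad (106)$$

where the effective action is given by

$$S[\theta_i(t),\varphi_i(t)] = \int_{t_0}^{t_f}dt\,\Big\{\sum_i s(1-\cos\theta_i)\dot\varphi_i - Js^2\sum_{\langle ij\rangle}\big[\cos\theta_i\cos\theta_j + \sin\theta_i\sin\theta_j\cos(\varphi_i - \varphi_j)\big]\Big\}. \qquad (107)$$

Note that the thermodynamics can be obtained in a similar form. The partition function can be expressed in terms of a spin coherent state path integral as well:

$$Z(\beta) = \int\prod_i d\mu(\theta_i\varphi_i)\,\langle\{\theta_i\varphi_i\}|\exp\{-\beta H\}|\{\theta_i\varphi_i\}\rangle = \int\prod_i[d\mu(\theta_i(\tau)\varphi_i(\tau))]\exp\big\{-S[\theta_i(\tau),\varphi_i(\tau)]\big\}, \qquad (108)$$

with

$$S[\theta_i(\tau),\varphi_i(\tau)] = \int_0^\beta d\tau\,\Big\{-is\sum_i(1-\cos\theta_i)\dot\varphi_i + Js^2\sum_{\langle ij\rangle}\big[\cos\theta_i\cos\theta_j + \sin\theta_i\sin\theta_j\cos(\varphi_i - \varphi_j)\big]\Big\}, \qquad (109)$$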



where $\tau$ is an imaginary time running from 0 to $\beta = 1/kT$.

Eq. (107) shows that the spin dynamics is induced by the geometrical phase $\omega = \int dt\sum_i s(1-\cos\theta_i)\dot\varphi_i$, which contains the time derivative and therefore leads to the equations of motion for $\{\theta_i,\varphi_i\}$. It also shows that the magnetic monopole potential is actually the conjugate momentum of $\varphi_i$ in spin dynamics. If one defines the generalized position and momentum coordinates by

$$q_i = \varphi_i, \qquad p_i = s(1-\cos\theta_i), \qquad (110)$$

and expands the effective action $S[\theta_i(t),\varphi_i(t)]$ around the ground state $\{\bar q_i,\bar p_i\}$ [given by Eq. (105)]:

$$q_i = \bar q_i + \delta q_i, \qquad p_i = \bar p_i + \delta p_i, \qquad q_{i+1} = \bar q_{i+1} + \delta q_{i+1}, \qquad p_{i+1} = \bar p_{i+1} + \delta p_{i+1}, \qquad (111)$$

up to the quadratic terms, one can determine explicitly the dispersion of the spin-wave excitations [44, 45].
However, since $p_i$ is related to the magnetic monopole potential, the conjugate coordinates $\{q_i, p_i\}$ used above are gauge dependent quantities. To explore the possible topological effects in spin dynamics, it is better to use a gauge invariant formulation. This may be done by using a global notation $\mathbf n$ (a unit vector) to represent the spin direction, without specifying the parameterization of the sphere by $\theta, \varphi$. Then, the Green's function can be expressed as

$$G(t_f,t_0) = \int\prod_i[d\mu(\mathbf n_i)]\exp\Big\{i\int_{t_0}^{t_f}dt\,\Big[\sum_i\mathbf A_i\cdot\dot{\mathbf n}_i - Js^2\sum_{\langle ij\rangle}\mathbf n_i\cdot\mathbf n_j\Big]\Big\}. \qquad (112)$$

Taking the continuum limit

$$\mathbf n_i \to \zeta_i\,\mathbf n(x_i), \qquad (113)$$

where $\zeta_i = 1$ ($e^{i\boldsymbol\pi\cdot x_i}$) for the ferromagnet (antiferromagnet), $|\mathbf n(x_i)| = 1$, and $\boldsymbol\pi = (\pi,\cdots,\pi)$, $x_i \in \mathbb Z^d$, the Green's function can be expressed as

$$G(t_f,t_0) = \int[d\mu(\mathbf n(x))]\exp\Big\{i\int d^{d+1}x\,\mathcal L(\mathbf n)\Big\}, \qquad (114)$$

where $\mathcal L(\mathbf n)$ is an effective Lagrangian. In the low-energy (long-wavelength) limit, it reduces to the Lagrangian of the Non-Linear Sigma Model in $(d+1)$-dimensional space [46],

$$\mathcal L(\mathbf n) = \frac{1}{2g}\,\partial_\mu\mathbf n\cdot\partial^\mu\mathbf n + \cdots, \qquad (115)$$

where “$\cdots$” denotes the higher-order derivatives. This Lagrangian ensures the existence of gapless spin-wave excitations. Such a Non-Linear Sigma Model has been widely studied in condensed matter physics, including problems in quantum magnetism, the quantum Hall effect and disorder dynamics.

7 Generalized Coherent States and Nonabelian Gauge Fields 

In this last section, I will discuss the generalized coherent states and their potential applications in field theory. The generalization of coherent states is based on the group theory developed by Perelomov and by Gilmore [9, 10]. The spin coherent state discussed in the previous section is an example of such a generalization. Actually, Glauber had pointed out in his seminal paper [7] that the photon coherent states can be constructed starting from any one of three mathematical definitions.

• Definition 1. The coherent states $|z\rangle$ are eigenstates of the annihilation operator $a$:

$$a|z\rangle = z|z\rangle. \qquad (116)$$

• Definition 2. The coherent states $|z\rangle$ are quantum states with a minimum uncertainty relationship:

$$\Delta x\,\Delta p = \tfrac12. \qquad (117)$$

• Definition 3. The coherent states $|z\rangle$ can be obtained by applying a displacement operator $D(z)$ on the ground state of the harmonic oscillator:

$$|z\rangle = D(z)|0\rangle, \qquad D(z) = \exp(z a^\dagger - z^* a). \qquad (118)$$

We have analyzed these definitions and pointed out [Zha90] that the generalization of eigenstates of the lowering operator is not always possible. Indeed, the adoption of this definition to generalize the coherent state concept has two major drawbacks: a) coherent states cannot be defined in Hilbert spaces of finite dimensionality in this way, as we have seen for the spin systems; b) the states so defined do not correspond to physically realizable states, except under the special circumstance that the commutator of the annihilation operator (or lowering step operator) and its hermitian adjoint is a multiple of the identity operator. Under this condition, one restricts oneself to the bosonic field. As a result, the generalization based on Definition 1 to other dynamical systems is not always applicable.

On the other hand, the generalization based on Definition 2 is by no means unique. The bosonic (photon) coherent states are minimum uncertainty states essentially because they are non-spreading wave packets. Although the minimum uncertainty states are physically very interesting, the generalization along this direction has several limitations: a) these coherent states can only be constructed for classically integrable systems in which there exists a set of canonical coordinates and momenta such that the respective Hamiltonians can be reduced to quadratures; this requires a flatness condition on the operator algebra, which reduces the commutation relations to those of the standard bosonic creation and annihilation operators; b) the wave packets with minimum uncertainty are not unique, and different ones may have different properties. Also, such states may be incomplete, or, even if they are complete, it is not certain that the standard form of a resolution of unity exists. Thus, minimum uncertainty states appear to have few, if any, useful properties.

In literatures, the realization of generalized coherent states are indeed achieved based on dis- 
placement operators. The basic theme of this development was to intimately connect coherent 
states with dynamical symmetry groups of a physical problem. Since all physical problems formu- 
lated in quantum theory have a dynamical group (although sometimes the group may be too large 
to be useful), an important outcome of this recognition is that coherent states can be generalized 
to all the quantum problems. 

I outline here a general procedure by which an arbitrary coherent state can be generated 
by displacement operators. Consider a set of operators $\{T_i\}$ closed under commutation: 

$$[T_i, T_j] = T_i T_j - T_j T_i = \sum_k C_{ij}^{k} T_k . \tag{119}$$

That is, $\{T_i\}$ span an algebra g, and the $C_{ij}^{k}$ in (119) are the structure constants of g. If g is a 
semisimple Lie algebra, it is more convenient to write $\{T_i\}$ in terms of the standard Cartan basis 
$\{H_i, E_\alpha, E_\alpha^\dagger = E_{-\alpha}\}$: 

$$[H_i, H_j] = 0, \qquad [H_i, E_\alpha] = \alpha_i E_\alpha, \qquad [E_\alpha, E_{-\alpha}] = \alpha^i H_i, \qquad [E_\alpha, E_\beta] = N_{\alpha\beta} E_{\alpha+\beta}. \tag{120}$$

In quantum theory, for such a given set of closed operators $\{T_i\}$, the quantum states are described 
by a Hilbert space which is a representation of g. Let G be the covering group of g. The Hilbert 
space carries a unitary irreducible representation of G. One may choose a normalized state 
$|\phi_0\rangle$ in the Hilbert space as a fixed state. Then the generalized coherent state is generated by 
an element $g \in G$ acting on the fixed state $|\phi_0\rangle$: 

$$|\phi\rangle_g = g|\phi_0\rangle . \tag{121}$$



Coherent States in Field Theory 




In group theory, every element $g \in G$ can be uniquely decomposed into a product of two group 
elements, $g = kh$, where one requires $h \in H$ such that 

$$h|\phi_0\rangle = |\phi_0\rangle e^{i\chi} , \tag{122}$$

and H is the maximal subgroup of G that leaves the fixed state invariant up to a phase factor, 
while k is an element of the coset space G/H. If G is a semisimple Lie group and $|\phi_0\rangle$ is the 
lowest weight state, k can generally be written as 

$$k = D_G(\eta) = \exp\Big\{ \sum_{\alpha>0} \big( \eta_\alpha E_\alpha - \mathrm{h.c.} \big) \Big\} \in G/H . \tag{123}$$

This operator $D_G(\eta)$ is usually called a displacement operator of G, which gives a coset 
representation of G/H. As a result, 

$$|\phi\rangle_g = D_G(\eta)|\phi_0\rangle e^{i\chi} . \tag{124}$$

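Perelomov and Gilmore [9, 10] define the states $|\Phi(Z)\rangle$ as the generalized coherent states of G: 

$$|\Phi(Z)\rangle = D_G(\eta)|\phi_0\rangle = \mathcal{N}(Z) \exp\Big\{ \sum_{\alpha>0} Z_\alpha E_\alpha \Big\} |\phi_0\rangle , \tag{125}$$

where $\mathcal{N}(Z)$ is a normalization constant. The generalized coherent states defined in such a way have 
two important properties: 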

• The set of the generalized coherent states satisfies the resolution of identity 

$$\int d\mu(Z)\, |\Phi(Z)\rangle\langle\Phi(Z)| = I , \tag{126}$$

where $d\mu(Z)$ is the G-invariant Haar measure on G/H. 

• The generalized coherent states are in one-to-one correspondence with the points of the coset 
space G/H, except for some singular points, such as the north pole or south pole of the 
two-sphere in spin coherent states. Therefore, the generalized coherent states are embedded 
into a topologically nontrivial geometrical space. 
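
As a concrete check of the resolution of identity (126), the following minimal sketch integrates the projectors onto spin-1/2 coherent states over the two-sphere SU(2)/U(1). It assumes the standard parametrisation $|\theta,\varphi\rangle = (\cos\frac{\theta}{2},\ e^{i\varphi}\sin\frac{\theta}{2})$ and the invariant measure $d\mu = \frac{2}{4\pi}\sin\theta\, d\theta\, d\varphi$. The function names and the midpoint quadrature are illustrative choices and are not taken from the text above.

```python
import cmath
import math


def spin_half_coherent_state(theta, phi):
    """Spin-1/2 coherent state labelled by a point (theta, phi) on the two-sphere."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))


def resolution_of_unity(n_theta=200, n_phi=64):
    """Midpoint-rule estimate of (2/4pi) * integral of |theta,phi><theta,phi| over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = [[0j, 0j], [0j, 0j]]
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        # Invariant measure on SU(2)/U(1) for spin 1/2 (assumed normalisation 2/(4 pi)).
        weight = (2 / (4 * math.pi)) * math.sin(theta) * d_theta * d_phi
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            ket = spin_half_coherent_state(theta, phi)
            for r in range(2):
                for c in range(2):
                    total[r][c] += weight * ket[r] * ket[c].conjugate()
    return total
```

Calling `resolution_of_unity()` returns a 2×2 matrix that agrees with the identity up to the quadrature error: the off-diagonal elements cancel under the azimuthal sum, and each diagonal element integrates to one.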

Systems discussed in the previous sections are only some simple examples of the generalized 
coherent states. The harmonic oscillator admits a dynamical group $H_4$, called the Heisenberg-Weyl 
group. The photon coherent states are obtained via a one-to-one correspondence with the 
geometrical coset space H(4)/U(1)×U(1) (a complex plane) by the displacement operator $D(z) \in$ 
H(4)/U(1)×U(1). Two-photon processes have an SU(1,1) dynamical group. The squeezed states 
are obtained by the displacement (squeeze) operator $D_{sq}(\beta) \in$ SU(1,1)/U(1) (a two-dimensional 
hyperboloid). And the spin coherent states discussed in the previous section are generated 
by the displacement operator $D_s(\theta, \varphi) \in$ SU(2)/U(1) (a two-dimensional sphere). 

I should emphasize here that the phase $\chi$ in the group-theoretical coherent state (124) is the 
H-gauge degree of freedom over the coset space G/H. All three sets of coherent states discussed 
in the previous sections contain a U(1) gauge, but only the sphere (spin) carries a nontrivial fibre 
bundle so that the gauge degrees of freedom become important. To obtain a non-abelian gauge, 
one must consider the generalized coherent states of a group G whose rank is larger than one such 
that H can be a non-abelian group. 

To examine non-abelian gauge degrees of freedom in the generalized coherent states, one may 
extend the path integral to the generalized coherent state representation. The Green’s function is 
now defined as the matrix element of the evolution operator in the generalized coherent states: 

$$G(t_f, t_0) = \langle\Phi'(Z)|\, T\exp\Big\{ -i\int_{t_0}^{t_f} H(t)\, dt \Big\} |\Phi(Z)\rangle . \tag{127}$$



318 


Wei-Min Zhang 


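Following the same procedure as in the previous sections, one divides the time interval $t_f - t_0$ 
into N intervals, each of length $\epsilon = (t_f - t_0)/N$, inserts the resolution of identity (126) at each 
interval point, and finally lets N go to infinity. The Green’s function can then be expressed 
as a generalized coherent state path integral, 

$$G(t_f, t_0) = \lim_{N\to\infty} \int \Big( \prod_{i=1}^{N-1} d\mu(Z_i) \Big) \prod_{i=1}^{N} \langle\Phi(Z_i)|\, e^{-i\epsilon H(t_i)}\, |\Phi(Z_{i-1})\rangle = \int [d\mu(Z)] \exp\{ iS[Z(t)] \} , \tag{128}$$

where 

$$S[Z(t)] = \int_{t_0}^{t_f} dt\, \Big\{ \langle\Phi(Z(t))|\, i\frac{\partial}{\partial t}\, |\Phi(Z(t))\rangle - \langle\Phi(Z(t))| H(t) |\Phi(Z(t))\rangle \Big\} \tag{129}$$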

is an effective action in the generalized coherent state representation. This path integral is defined 
over the coset space G/H. The effective action contains two terms. The second term is the matrix 
element of Hamiltonian operator in the coherent states, which determines the static properties of 
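the classical Hamiltonian. The first term is purely geometric; it is indeed a Berry phase [47, 48] 
that describes quantum fluctuations and also determines the time evolution of the system, 

$$\omega[G/H] = \int_{\Gamma \in G/H} i\langle\Phi(Z)| d |\Phi(Z)\rangle = \int_{\Gamma \in G/H} A \cdot d\Omega , \tag{130}$$

where A is a gauge vector potential defined over the coset space G/H, and $\Omega$ is a unit vector in 
G/H. One can then define the gauge connection, 

$$F = \langle d\Phi(Z) | d\Phi(Z) \rangle = \sum_{\alpha\alpha'} \omega_{\alpha\alpha'}\, dZ_\alpha \wedge dZ_{\alpha'} , \tag{131}$$

where the $\omega_{\alpha\alpha'}$ are the Berry curvatures: 

$$\omega_{\alpha\alpha'} = \Big\langle \frac{\partial\Phi(Z)}{\partial Z_\alpha} \Big| \frac{\partial\Phi(Z)}{\partial Z_{\alpha'}} \Big\rangle - \Big\langle \frac{\partial\Phi(Z)}{\partial Z_{\alpha'}} \Big| \frac{\partial\Phi(Z)}{\partial Z_\alpha} \Big\rangle . \tag{132}$$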


When the rank of G is larger than one, the associated gauge potential A is non-abelian with gauge 
group $\subseteq H$. From the above generalized coherent state path integral, one can study the so-called 
geometric quantization [50, 51] and classical gauge equations of motion in quantum mechanics 
[52, 53]. 

This path integral formulation and the associated gauge potentials have potential applications 
in condensed matter physics and particle physics. This is because classical semisimple Lie groups 
can be generated by bilinear operators of bosonic and fermionic creation and annihilation operators. 
The bilinear operators describe the basic collective excitations in strongly correlated or strongly 
coupled systems. Therefore, the above formalism can be applied directly to various realistic physical 
problems. 

Specifically, the SU(n) group can be generated by the particle-hole pairs $\{a_i^\dagger a_j,\ 1 \le i, j \le n\}$, 
and the corresponding generalized coherent state is given by 

$$|\{Z_{ij}\}\rangle = \mathcal{N}(Z) \exp\Big\{ \sum_{ij} Z_{ij}\, a_i^\dagger a_j \Big\} |m\rangle , \tag{133}$$

where $a_i^\dagger, a_i$ can be either bosonic or fermionic creation and annihilation operators, and $|m\rangle$ contains 
m particles in the lowest states, m < n. For a bosonic system, the coherent states are defined 
on the coset space $\{Z_{ij}\} \in$ SU(n)/SU(n-1)×U(1). For a fermionic system, the coset space $\{Z_{ij}\}$ is 
SU(n)/SU(n-m)×U(m). The spin coherent state of the Heisenberg model discussed in the last 
section is a special case where the spin operators take the form: 



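$$\mathbf{S}_i = \frac{1}{2} \sum_{\alpha\beta} a^\dagger_{i\alpha}\, \boldsymbol{\sigma}_{\alpha\beta}\, a_{i\beta} , \tag{134}$$

where $\boldsymbol{\sigma}$ denotes the Pauli matrices, and $\alpha, \beta$ denote the spin indices of the electrons. Letting 
$Z_{ii} = \tan(\theta_i/2)\, e^{-i\varphi_i}$, the spin coherent state (103) can be reduced to the form of (133) with 
$Z_{ij} = 0$ for $i \neq j$ and $\mathcal{N}(Z) = 1/(1 + |Z_{ii}|^2)^{1/2}$. Correspondingly, the geometrical space 
SU(n)/SU(n-m)×U(m) is reduced to $\{Z_{ii}\} \in \prod_i \otimes\, \mathrm{SU}_i(2)/\mathrm{U}_i(1)$. 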

The Sp(2n+1) group can be realized by bosonic particle-particle and particle-hole pairs, 
$\{a_i^\dagger,\ a_i,\ a_i^\dagger a_j^\dagger,\ a_i a_j,\ a_i^\dagger a_j - \frac{1}{2}\delta_{ij}\}$. The generalized coherent state of Sp(2n+1) is given by 

$$|\{z_i, Z_{ij}\}\rangle = \mathcal{N}(Z) \exp\Big\{ \sum_i \big( z_i a_i^\dagger - z_i^* a_i \big) \Big\} \exp\Big\{ \sum_{ij} Z_{ij}\, a_i^\dagger a_j^\dagger \Big\} |0\rangle . \tag{135}$$

The corresponding coset space $\{z_i, Z_{ij}\}$ is Sp(2n+1)/U(n). The squeezed coherent states discussed 
in Sec. V are only special cases of the above coherent state. 

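The SO(2n) group can be generated by fermionic particle-particle and particle-hole pairs 
$\{c_i^\dagger c_j^\dagger,\ c_i c_j,\ c_i^\dagger c_j - \frac{1}{2}\delta_{ij}\}$. Similarly, one can write down the most general coherent state for SO(2n): 

$$|\{Z_{ij}\}\rangle = \mathcal{N}(Z) \exp\Big\{ \sum_{ij} Z_{ij}\, c_i^\dagger c_j^\dagger \Big\} |0\rangle , \tag{136}$$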

and its geometrical space is the coset space $\{Z_{ij}\} \in$ SO(2n)/U(n). A typical example of the above 
coherent states is the BCS superconducting state, in which only special fermionic pairs, i.e., Cooper 
pairs, are considered [54]: 

$$|\mathrm{BCS}\rangle = \prod_k \frac{1}{\sqrt{1 + |h_k|^2}}\, \exp\Big\{ \sum_k h_k\, c^\dagger_{k\uparrow} c^\dagger_{-k\downarrow} \Big\} |0\rangle . \tag{137}$$

Since $\{c^\dagger_{k\uparrow} c^\dagger_{-k\downarrow},\ c_{-k\downarrow} c_{k\uparrow},\ c^\dagger_{k\uparrow} c_{k\uparrow} + c^\dagger_{-k\downarrow} c_{-k\downarrow} - 1\}$ span an su(2) algebra, the geometrical space of the 
above BCS states is indeed the same as that of the spin coherent state in the Heisenberg model, i.e., 
$\{h_k\} \in \prod_k \otimes\, \mathrm{SU}_k(2)/\mathrm{U}_k(1)$. Therefore, the BCS state carries a U(1) gauge degree of freedom. But 
physically, superconductivity is very different from ferro- and antiferromagnetism. This is 
because the Heisenberg model has a global spin rotational symmetry, while the BCS Hamiltonian 
only has a global U(1) symmetry. In the Heisenberg model, the spontaneous breaking of spin 
rotational symmetry leads to the spin-wave excitations which can be described by the Non-Linear 
Sigma Model derived from the spin coherent state path integral, as we have discussed in the 
previous section. In the BCS theory, the spontaneous breaking of the U(1) symmetry for the 
pairing coherence gives pair excitations which can be described by Ginzburg-Landau theory. The 
Ginzburg-Landau theory should also be derivable from the path integral of BCS coherent states. 

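The above general fermionic pairing coherent states can also be applied to systems other than 
conventional BCS superconductivity. For example, if I take the triplet pairs 

$$\mathbf{T}^\dagger(k) = \frac{1}{2} \sum_{\alpha\beta} c^\dagger_{k\alpha} (\boldsymbol{\sigma}\, i\sigma_2)_{\alpha\beta}\, c^\dagger_{-k\beta} , \qquad \mathbf{T}(k) = \frac{1}{2} \sum_{\alpha\beta} c_{-k\alpha} (-i\sigma_2 \boldsymbol{\sigma})_{\alpha\beta}\, c_{k\beta} \tag{138}$$

together with the charge and spin operators 

$$Q(k) = \frac{1}{2} \sum_\alpha \big( c^\dagger_{k\alpha} c_{k\alpha} + c^\dagger_{-k\alpha} c_{-k\alpha} \big) - 1 , \qquad \mathbf{S}(k) = \frac{1}{2} \sum_{\alpha\beta} \big( c^\dagger_{k\alpha} \boldsymbol{\sigma}_{\alpha\beta} c_{k\beta} + c^\dagger_{-k\alpha} \boldsymbol{\sigma}_{\alpha\beta} c_{-k\beta} \big) , \tag{139}$$

which generate an SO(5) group, I can construct a generalized coherent state for the triplet pairing 
in superfluid $^3$He: 

$$|\mathrm{SF}\rangle = \mathcal{N}(Z) \exp\Big\{ \sum_k \mathbf{Z}_k \cdot \mathbf{T}^\dagger(k) \Big\} |0\rangle . \tag{140}$$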

Its coset space is $\{\mathbf{Z}_k\} \in \prod_k \otimes\, \mathrm{SO}_k(5)/\mathrm{U}_k(2)$. This SO(5) coherent state carries a non-abelian 
SU(2) gauge. One can use this SO(5) generalized coherent state to study non-abelian gauge fields 
and the low energy effective theory for superfluid $^3$He atoms [55]. Recently, I also constructed a 
generalized pairing state to include the singlet and triplet pairs, 

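$$|Z(\mathbf{k})\rangle = \mathcal{N}(Z) \prod_{\mathbf{k}} \exp\Big\{ Z_1(\mathbf{k})\, c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}\downarrow} + Z_2(\mathbf{k})\, c^\dagger_{\mathbf{k}+\mathbf{Q}\uparrow} c^\dagger_{-\mathbf{k}+\mathbf{Q}\downarrow} + Z_3(\mathbf{k})\, c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}+\mathbf{Q}\downarrow} + Z_4(\mathbf{k})\, c^\dagger_{\mathbf{k}\downarrow} c^\dagger_{-\mathbf{k}+\mathbf{Q}\uparrow} + Z_5(\mathbf{k})\, c^\dagger_{\mathbf{k}\uparrow} c^\dagger_{-\mathbf{k}+\mathbf{Q}\uparrow} + Z_6(\mathbf{k})\, c^\dagger_{\mathbf{k}\downarrow} c^\dagger_{-\mathbf{k}+\mathbf{Q}\downarrow} \Big\} |0\rangle , \tag{141}$$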

for the study of high-$T_c$ superconductivity and the close proximity between the Mott insulating 
antiferromagnetic order and d-wave superconducting order in cuprates [64]. Here the coset space 
is $\prod_k \otimes\, \mathrm{SO}_k(8)/\mathrm{U}_k(4)$. Under the constraint of no doubly occupied sites, the possible gauge group 
contained in the above coherent pairing states may be SU(2)×U(1) or a larger one up to U(4). This 
may open a new window for the study of the dynamical mechanism of high-$T_c$ superconductivity. 

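If one takes the continuum limit in the coordinate space and lets $t_0 \to -\infty$, $t_f \to \infty$, then the 
path integral based on the generalized coherent states can be expressed as 

$$G = \int [d\mu(\Omega(x))] \exp\Big\{ i \int d^{d+1}x\, \big( A(x) \cdot \dot{\Omega}(x) - \mathcal{H}[\Omega(x)] \big) \Big\} , \tag{142}$$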

where x is a coordinate in the (d+1)-dimensional Minkowski space. If the Hamiltonian has a 
symmetry $S \subset G$, and the static classical ground states (which can be obtained by minimizing $\mathcal{H}$ 
with respect to the coherent state parameters) spontaneously break this symmetry, then one can 
use the saddle-point expansion to derive a general Non-Linear Sigma Model defined on G/H, 

$$G = \int [d\mu(\Omega(x))] \exp\Big\{ i \int d^{d+1}x\, \big( \partial_\mu \Omega(x) \cdot \partial^\mu \Omega(x) + \cdots \big) + i\Theta_{Top} \Big\} , \tag{143}$$

to describe the low energy physics in the long-wavelength limit, where $\{\cdots\}$ denotes the higher 
order derivatives in the Non-Linear Sigma Model, and $\Theta_{Top}$ is a topological phase, corresponding 
to a Wess-Zumino-Witten topological term [61, 59, 60] that is induced by the gauge degrees of 
freedom contained in the generalized coherent states and/or a Chern-Simons term of topological 
gauge fields over the coset space G/H [68]. There may exist many potential applications of such 
a Non-Linear Sigma Model in real physical problems, such as the quantum Hall effect [60, 62], 
high-$T_c$ superconductivity [63, 64], and disordered systems [65] in condensed matter physics. It is 
also possible to apply such a theory to quantum chromodynamics in particle physics, when quantum 
chromodynamics is formulated on a lattice [66, 67], and to quantum gravity [68], etc. 

8 Summation 

In summation, as I have emphasized throughout the article, coherent states possess three unique 
properties that are fundamental to field theory. The property of coherent behavior uniquely 
describes the processes involving an infinite number of virtual particles. The coherent excitations 
obtained from coherent states also give the essential physical picture of long-range orders induced by 
strong correlations. The property of overcompleteness provides a reformulation and generalization 
of the functional integral in field theory, in which quantum fluctuations of composite operators 
are included as new low energy dynamical field variables. One may thereby be able to determine 
the dynamical degrees of freedom at different energy scales and to derive the corresponding 
effective theory. The property of topologically nontrivial geometrical structure in generalized coherent 
states allows one to explore the origin of gauge fields and associated gauge degrees of freedom. In 
this article, I have not touched on the recent development of coherent states in terms of 
superalgebras and quantum groups. These topics may also be very important in the modern development 
of field theory, such as in supersymmetry, superstring and conformal field theory. Nevertheless, in 
my personal opinion, understanding the origin as well as the nature of gauge degrees of freedom 
in physics is perhaps the most fundamental problem in field theory. 





References 

[1] E. Schrödinger, Naturwissenschaften, 14, 644 (1926). 

[2] T. D. Lee, F. E. Low and D. Pines, Phys. Rev. 90, 297 (1953). 

[3] P. W. Anderson, Phys. Rev. 110, 827 (1958). 

[4] J. G. Valatin and D. Butler, Nuovo Cimento, 10, 37 (1958). 

[5] J. Schwinger, Proc. Nat. Acad. Sci. 37, 452, 455 (1951); Phys. Rev. 92, 1283 (1953). 

[6] J. R. Klauder, Ann. Phys. 11, 123 (1960). 

[7] R. J. Glauber, Phys. Rev. 130, 2529 (1963); 131, 2766 (1963). 

[8] E. C. G. Sudarshan, Phys. Rev. Lett. 10, 277 (1963). 

[9] A. M. Perelomov, Commun. Math. Phys. 26, 222 (1972). 

[10] R. Gilmore, Rev. Mex. de Fisica, 23, 143 (1972). 

[11] J. R. Klauder and B-S. Skagerstam, “Coherent States, applications in physics and mathemat- 
ical physics” (World Scientific, Singapore, 1985). 

[12] A. M. Perelomov, “Generalized Coherent States and Their Applications” (Springer-Verlag, Berlin, 
1986). 

[13] W. M. Zhang, D. H. Feng and R. Gilmore, Rev. Mod. Phys. 62, 867 (1990). 

[14] B-S. Skagerstam, in the Proc. “Coherent States: Past, Present and Future”, Ed. by D. H. 
Feng, J. R. Klauder and M. R. Strayer (World Scientific, Singapore, 1994). 

[15] J. R. Klauder and E. C. G. Sudarshan, “Fundamentals of Quantum Optics” (W. A. Benjamin, 
New York, 1968). 

[16] V. Chung, Phys. Rev. 140, B1110 (1965). 

[17] M. Greco and G. Rossi, Nuovo Cimento, 50, 167 (1967). 

[18] T. W. B. Kibble, J. Math. Phys. 9, 315 (1968); Phys. Rev. 173, 1527; ibid. 174, 1882; ibid. 
175, 1624 (1968). 

[19] P. P. Kulish and L. D. Faddeev, Theor. Math. Phys. 4, 745 (1970). 

[20] D. Zwanziger, Phys. Rev. D11, 3481; 3504 (1975). 

[21] N. Papanicolaou, Phys. Rep. 24, 229 (1976). 

[22] D. R. Yennie, S. C. Frautschi and H. Suura, Ann. Phys. (N. Y.) 13, 379 (1961). 

[23] S. Weinberg, Phys. Rev. 140, B516 (1965). 

[24] K. E. Eriksson, N. Mukunda and B. S. Skagerstam, Phys. Rev. D24, 2615 (1982). 

[25] T. D. Lee and M. Nauenberg, Phys. Rev. 133, B1549 (1964). 

[26] G. Sterman, “Introduction to Quantum Field Theory” (Cambridge, 1993). 

[27] R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948). 

[28] R. P. Feynman and A. R. Hibbs, “Quantum Mechanics and Path Integrals” (McGraw-Hill, 
New York, 1965). 





[29] L. D. Faddeev, in “Methods in Field Theory”, Eds. R. Balian and J. Zinn-Justin 
(North-Holland/World Scientific, Amsterdam, 1976). 

[30] J. R. Klauder, Phys. Rev. D19, 2349 (1979). 

[31] Y. Ohnuki and T. Kashiwa, Prog. Theor. Phys. 60, 548 (1978). 

[32] E. S. Abers and B. W. Lee, Phys. Rep. 9C, 1 (1973). 

[33] L. D. Faddeev and A. A. Slavnov, “Gauge Fields - Introduction to Quantum Theory” (W. A. 
Benjamin, New York, 1980). 

[34] T. T. Wu and C. N. Yang, Phys. Rev. D12, 3845 (1975). 

[35] Y. P. Yao, Phys. Rev. Lett. 36, 653 (1976). 

[36] C. Fabre, Phys. Rep. 219, 215 (1992). 

[37] D. F. Walls, Nature, 306, 141 (1983). 

[38] R. Jackiw and A. Kerman, Phys. Lett. A71, 158 (1979). 

[39] F. W. Cummings and J. R. Johnston, Phys. Rev. 151, 105 (1966). 

[40] Y. Tsue and Y. Fujiwara, Prog. Theor. Phys. 86, 469 (1991). 

[41] K. G. Wilson and J. Kogut, Phys. Rep. 12C, 75 (1974). 

[42] F. T. Arecchi, E. Courtens, R. Gilmore, and H. Thomas, Phys. Rev. A6, 2211 (1972). 

[43] M. V. Berry, Proc. R. Soc. London, Ser. A392, 45 (1984). 

[44] D. C. Mattis, “Theory of Magnetism” (Springer-Verlag, 1988). 

[45] R. M. White, “Quantum Theory of Magnetism” (Springer-Verlag, 1987). 

[46] F. D. M. Haldane, Phys. Lett. A93, 464 (1983); Phys. Rev. Lett. 50, 1153 (1983). 

[47] B. Simon, Phys. Rev. Lett. 51, 2167 (1983). 

[48] F. Wilczek and A. Zee, Phys. Rev. Lett. 52, 2111 (1984). 

[49] A. Shapere and F. Wilczek, “Geometric Phases in Physics” (World Scientific, Singapore, 
1989). 

[50] N. M. J. Woodhouse, “Geometric Quantization” 2nd Ed. (Oxford, 1992). 

[51] A. Alekseev, L. Faddeev and S. Shatashvili, J. Geom. Phys. 3, 391 (1989). 

[52] S. K. Wong, Nuovo Cimento, A65, 689 (1970). 

[53] A. P. Balachandran, G. Marmo, B.-S. Skagerstam and A. Stern, “Classical Topology and 
Quantum States” (World Scientific, Singapore, 1991). 

[54] J. Bardeen, L. N. Cooper and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957). 

[55] A. J. Leggett, Rev. Mod. Phys. 47, 331 (1975). 

[56] G. E. Volovik, “Exotic Properties of Superfluid $^3$He” (World Scientific, Singapore, 1992). 

[57] W. M. Zhang, cond-mat/9907287. 

[58] E. Witten, Commun. Math. Phys. 92, 455 (1984); ibid. 137, 29 (1991). 

[59] P. B. Wiegmann, Phys. Rev. Lett. 60, 821 (1988). 





[60] M. Stone, Phys. Rev. Lett. 63, 731 (1989); Nucl. Phys. B227, 399 (1989). 

[61] E. Witten, Commun. Math. Phys. 121, 351 (1989). 

[62] M. R. Zirnbauer, hep-th/9905054, and references therein. 

[63] S. C. Zhang, Science, 275, 1196 (1997). 

[64] E. Demler and S. C. Zhang, Ann. Phys. 271, 83 (1999). 

[65] K. Efetov, “Supersymmetry in Disorder and Chaos” (Cambridge, 1997). 

[66] K. G. Wilson, Phys. Rev. D10, 2445 (1974); and in “New Phenomena in Subnuclear Physics”, 
Ed. A. Zichichi (Plenum Press, New York, 1977). 

[67] S. R. Sharpe, hep-lat/9811006. 

[68] E. Witten, Nucl. Phys. B323, 113 (1989); Phys. Rev. D44, 314 (1991). 



12. Pancharatnam, Bargmann and Berry Phases - 

A Retrospective 

N. Mukunda ^ 

Centre for Theoretical Studies and Department of Physics, 

Indian Institute of Science, Bangalore 560 012, India 


1 Introduction 

Berry’s discovery in 1983-84[1] of “quantum adiabatic anholonomy” has been the starting point and 
inspiration for an enormous amount of work exploring its properties, generalisations, reformulations 
and applications [2]. Now known as the geometric phase, Berry himself has characterised it as 
something that helped clear up “a corner of quantum mechanics that was for a long time dusty 
and obscure” [3]. It seems fair to say that this concept has achieved much more, and has shown 
astonishing conceptual depth and unifying power. 

While various generalisations of the original formulation came soon after Berry’s initial work, 
it was also discovered in due course that there were several precursors to his ideas. In retrospect, 
two of these seem specially significant - work by S.Pancharatnam in 1956 in the arena of classical 
polarization optics[4], and by V.Bargmann in 1964 in the context of Wigner’s theorem on the 
representation of symmetry operations in quantum mechanics [5]. In both cases, phases of complex 
quantities played important roles, and today it seems evident that these and the later geometric 
phase all belong to the same circle of ideas. 

The purpose of this paper is to describe these developments in such a way that one sees 
in proper perspective the close connections between the ideas of Pancharatnam, Bargmann and 
Berry and how they dovetail into one another. The emphasis, admittedly selective, will be on the 
theoretical aspects. In section 2 we briefly review the original derivation by Berry of the geometric 
phase, spelling out the assumptions made on the way. This is then followed by an account of the 
generalisations due to Aharonov and Anandan on the one hand[6], and by Samuel and Bhandari 
on the other [7]. These relax to a great extent the conditions under which geometric phases can be 
defined. Section 3 carries these processes further and exhibits these phases as a part of quantum 
kinematics, namely as reflecting the presence of a complex linear vector space structure as state 
space[8]. One aspect of the close connection between group actions and geometric phases comes 
through in this manner, namely the phase is seen to be the simplest invariant expression under 
certain well defined groups of transformations acting on curves in Hilbert space. Sections 4 and 5 
are devoted to explanations of the links to the two precursors mentioned above, namely to the work 
of Bargmann and of Pancharatnam respectively. These sets of ideas intersect in diverse ways and 
this is brought out in some detail. Section 6 returns to the theme of the connection between groups 
and geometric phases but in a new manner. The focus now is on the structure of geometric phases 
which arise from unitary Lie group representations[9]. While this could be viewed as a particular 
case of the general theory, it merits separate discussion as there is a rich interplay between Lie 
algebraic and differential geometric structures and these are reflected in the final geometric phase 
formula. Section 7 contains concluding remarks. 


* Email: nmukunda@cts.iisc.ernet.in 

^Honorary Professor, Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur, Bangalore 560064 





2 The adiabatic geometric phase and its generalisations 

Consider a quantum mechanical system with state vector $\psi(t)$ at time t and hermitian time 
dependent Hamiltonian operator H(t), evolving according to the Schrödinger equation 

$$i\, \frac{d\psi(t)}{dt} = H(t)\psi(t) . \tag{1}$$

(Here for simplicity we have set Planck’s constant $\hbar = 1$.) Assume that the time dependence of 
H(t) is mild and that we are able to apply the adiabatic theorem of quantum mechanics. To this 
end, let u(t) be a normalised nondegenerate instantaneous eigenvector of H(t) with (possibly time 
dependent) eigenvalue E(t): 

$$H(t)u(t) = E(t)u(t), \qquad (u(t), u(t)) = \|u(t)\|^2 = 1 . \tag{2}$$

Then, provided the overall phase of u(t) at each time is adjusted so that 

$$\Big( u(t), \frac{du(t)}{dt} \Big) = 0 , \tag{3}$$

we can construct an approximate solution $\psi_{ad}(t)$ of eqn.(2.1) by setting 

$$\psi_{ad}(t) = \exp\Big( -i \int_0^t dt'\, E(t') \Big)\, u(t) . \tag{4}$$

The explicit time dependence in H(t) can be thought of as due to the quantum system being 
placed in a slowly changing (classical) environment. Let the environment now be supposed to be 
cyclic, in the sense that at time t = T it returns to its condition at, say, t = 0. This is reflected in 
the operator H(t) similarly being cyclic: 

$$H(T) = H(0) . \tag{5}$$


One can now ask for the relation between $\psi_{ad}(T)$ and $\psi_{ad}(0)$: is it also similarly cyclic? That 
the physical state is cyclic (provided the nondegeneracy condition is obeyed throughout t = 0 to 
t = T) is well known. This means that $\psi_{ad}(T)$ can differ from $\psi_{ad}(0)$ at most by a phase: 

$$\psi_{ad}(T) = e^{i\varphi_{tot}}\, \psi_{ad}(0), \qquad \varphi_{tot} = \text{total phase} . \tag{6}$$

One contribution to this total phase is evident from the adiabatic theorem and the definition (2.4) 
of $\psi_{ad}(t)$: this is the dynamical phase 

$$\varphi_{dyn} = -\int_0^T dt\, E(t) . \tag{7}$$

Berry’s surprising discovery was that there is another, geometric contribution $\varphi_{geom}$ to $\varphi_{tot}$, 
namely that 

$$\varphi_{tot} = \varphi_{dyn} + \varphi_{geom} , \tag{8}$$

where $\varphi_{geom}$ is determined in a global way by, or belongs to, the entire path traced out in state 
space by $\psi(t)$ from t = 0 to t = T. The important point is that this piece $\varphi_{geom}$ is robust in the 
sense that it is unaffected by the various phase choices or freedoms available in the problem. This 
$\varphi_{geom}$ is the geometric phase. One can trace it to the fact that if u(t) obeys (2.3) throughout, 
then u(T) and u(0) in general differ by a phase. 
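
The decomposition of the total phase into a dynamical and a geometric part can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not made in the text: a spin-1/2 system with $H(t) = \frac{\omega_0}{2}\, n(t)\cdot\sigma$, where the unit vector n(t) is carried once around a cone of polar angle θ during the period T, integrated with a fourth-order Runge-Kutta scheme. The function names and default parameters are hypothetical. The state starts in the instantaneous eigenvector with eigenvalue $E = +\omega_0/2$, and the dynamical phase $-\int_0^T dt\, E(t)$ is subtracted from the total phase.

```python
import cmath
import math


def rhs(t, psi, theta, omega0, period):
    """dpsi/dt = -i H(t) psi for H(t) = (omega0/2) n(t).sigma, with hbar = 1."""
    phi = 2 * math.pi * t / period                     # azimuth of n(t)
    nz = math.cos(theta)
    n_minus = math.sin(theta) * cmath.exp(-1j * phi)   # n_x - i n_y
    h = omega0 / 2
    up = h * (nz * psi[0] + n_minus * psi[1])
    down = h * (n_minus.conjugate() * psi[0] - nz * psi[1])
    return (-1j * up, -1j * down)


def shifted(psi, k, step):
    """Return psi + step * k componentwise."""
    return tuple(p + step * q for p, q in zip(psi, k))


def adiabatic_geometric_phase(theta, omega0=10.0, period=200.0, steps=20000):
    """Total phase minus dynamical phase, wrapped to (-pi, pi]."""
    # Instantaneous eigenvector of H(0) with eigenvalue +omega0/2.
    psi0 = (complex(math.cos(theta / 2)), complex(math.sin(theta / 2)))
    psi, dt = psi0, period / steps
    for n in range(steps):
        t = n * dt
        k1 = rhs(t, psi, theta, omega0, period)
        k2 = rhs(t + dt / 2, shifted(psi, k1, dt / 2), theta, omega0, period)
        k3 = rhs(t + dt / 2, shifted(psi, k2, dt / 2), theta, omega0, period)
        k4 = rhs(t + dt, shifted(psi, k3, dt), theta, omega0, period)
        psi = tuple(p + dt / 6 * (a + 2 * b + 2 * c + d)
                    for p, a, b, c, d in zip(psi, k1, k2, k3, k4))
    overlap = psi0[0].conjugate() * psi[0] + psi0[1].conjugate() * psi[1]
    phi_tot = cmath.phase(overlap)
    phi_dyn = -(omega0 / 2) * period                   # minus the time integral of E(t)
    return cmath.phase(cmath.exp(1j * (phi_tot - phi_dyn)))
```

In the adiabatic regime (here the field direction rotates much more slowly than $\omega_0$) the result approaches $-\pi(1 - \cos\theta)$ modulo $2\pi$, that is, minus half the solid angle enclosed by n(t), with a small correction that vanishes as $\omega_0 T$ grows.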

Thus it is clear that the original discovery of the geometric phase was in the context of unitary, 
adiabatic, cyclic Schrödinger evolution in quantum mechanics. It may also be mentioned that 
Berry made essential use of the classical parameter space which accounts for the explicit time 
dependence of H(t); and the explicit expressions that were developed for $\varphi_{geom}$ were also written 
in terms of line and surface integrals over this space. 

In gradual degrees the original restrictions or conditions under which the geometric phase was 
defined were relaxed or removed. In the work of Aharonov and Anandan[6] it was shown that one 
can give up the adiabatic condition and still define the geometric phase. As long as one has a 
solution of the time dependent Schrodinger equation (2.1), and the solution is cyclic in the sense 
that 


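$$\psi(T) = e^{i\varphi_{tot}}\, \psi(0) , \tag{9}$$

whether or not the Hamiltonian obeys the cyclic condition (2.5), one can define the geometric 
phase by 

$$\varphi_{geom} = \varphi_{tot} - \varphi_{dyn}, \qquad \varphi_{dyn} = -\int_0^T dt\, \big( \psi(t), H(t)\psi(t) \big) . \tag{10}$$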

The main point is that the definition (2.7) of the dynamical phase has been generalised beyond 
the confines of the adiabatic theorem; the latter is of course recovered as a special case. 
Moreover, $\varphi_{geom}$ clearly belongs to the particular cyclic solution $\psi(t)$, $0 \le t \le T$, to the Schrödinger 
equation that one has in hand. Aharonov and Anandan also showed that the geometric phase is 
intrinsically something defined for cyclic evolutions in the quantum mechanical state space or ray 
space associated with the Hilbert space, rather than in the classical parameter space. But here 
again one can easily recover the original framework of Berry as a particular case, and that is quite 
often useful. 

The next important extension, due to Samuel and Bhandari[7], showed that the geometric 
phase could be defined for noncyclic and even for nonunitary evolutions. As for the former, the 
key idea was to introduce and exploit the properties of geodesics in quantum mechanical ray and 
Hilbert spaces. Thus a case of noncyclic evolution was converted to cyclic evolution by joining the 
end points via a geodesic, and then using the earlier definition for $\varphi_{geom}$ one has a definition valid 
for noncyclic cases. The case of nonunitary evolution corresponds to working with a Schrödinger 
equation in which the Hamiltonian may be nonhermitian. 

All these extensions demonstrate the robustness of the geometric phase concept in that it 
seems to require very limited formal machinery to be able to identify and define this phase. This 
feature will be reinforced when we describe the kinematic approach in the next section. On the 
experimental side, many examples of the geometric phase have been demonstrated. Purely for 
illustration we may mention that the early experiment of Chiao et al[10] showed the presence of 
this phase in a classical optical situation; the experiment of Bhandari et al[11] was in the original 
Pancharatnam framework of polarization optics; while that of Simon et al[13] showed how a time- 
dependent geometric phase could be converted into a frequency shift capable of fine-tuning laser 
beams. 

3 The geometric phase seen via quantum kinematics 

Now we show that one can go one step further in relaxing the conditions necessary for the definition 
of the geometric phase - even the Hamiltonian operator and the Schrödinger equation can be given 
up, thus showing that this concept depends solely on the existence of a complex linear vector space 
structure. This may aptly be called the quantum kinematic approach to the geometric phase[8]. 

Let $\mathcal{H}$ be the Hilbert space describing the pure states of some quantum system - it may be 
of finite or infinite dimension. For the most part all the constructions to follow involve the set 
of unit vectors in $\mathcal{H}$, namely the unit sphere in $\mathcal{H}$. The basic mathematical objects we deal with 
are continuous, almost everywhere differentiable, generally open curves $\mathcal{C} \subset \mathcal{H}$ parametrised by a 
monotonically increasing parameter s: 

$$\mathcal{C} = \{ \psi(s) \in \mathcal{H} \;|\; \|\psi(s)\| = 1,\ s_1 \le s \le s_2 \} . \tag{1}$$

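The ray space $\mathcal{R}$ associated to $\mathcal{H}$ consists of the set of all projection operators $\rho$ onto unit vectors 
in $\mathcal{H}$: 

$$\mathcal{R} = \{ \rho = \text{operator on } \mathcal{H} \;|\; \rho^\dagger = \rho,\ \rho \ge 0,\ \mathrm{Tr}\,\rho = 1,\ \rho^2 = \rho \} . \tag{2}$$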

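Every curve $\mathcal{C} \subset \mathcal{H}$ described as in eqn.(3.1) possesses a ray space image $C \subset \mathcal{R}$, namely a 
continuous almost everywhere differentiable parametrised curve of normalised pure state density 
matrices. Denoting by $\pi$ the natural projection $\mathcal{H} \to \mathcal{R}$ (as in $\mathcal{H}$, so with $\mathcal{R}$ we are mainly 
concerned with normalised $\rho$), we have 

$$C = \pi[\mathcal{C}] = \{ \rho(s) \;|\; \rho(s) = \pi(\psi(s)) = \psi(s)\psi(s)^\dagger,\ s_1 \le s \le s_2 \} \subset \mathcal{R} . \tag{3}$$

For given $C \subset \mathcal{R}$, any $\mathcal{C} \subset \mathcal{H}$ such that $\pi[\mathcal{C}] = C$ is a lift of C. 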

We now define two groups of transformations on such curves $\mathcal{C}$, mapping them onto similar curves 
$\mathcal{C}'$: the group of local phase changes, and the group of continuous monotonic reparametrisations. 
They act as follows: 

(a) Local phase changes: 

$$\mathcal{C} = \{ \psi(s) \,|\, s_1 \le s \le s_2 \} \;\to\; \mathcal{C}' = \{ \psi'(s) = e^{i\alpha(s)}\psi(s) \,|\, \alpha(s) = \text{real},\ s_1 \le s \le s_2 \} ; \tag{4}$$

(b) Reparametrisations: 

$$\mathcal{C} = \{ \psi(s) \,|\, s_1 \le s \le s_2 \} \;\to\; \mathcal{C}' = \{ \psi'(s') = \psi(s) \,|\, s' = f(s) = \text{real monotonic},\ s_1' = f(s_1) \le s' \le s_2' = f(s_2) \} . \tag{5}$$

We see that under transformations (a), the ray space image C (and its parametrisation) is left 
intact; while in transformations (b) the points comprising $\mathcal{C}$ (and similarly C) are left intact but 
only their parametrisation or labelling gets changed. 

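Now we ask for the simplest (real valued) expression or functional of $\mathcal{C}$ that is invariant 
under both groups of transformations defined above. A little reflection shows that the following 
expression, tentatively denoted by $\varphi_g$, has these properties: 

$$\varphi_g = \varphi_P - \varphi_{dyn}, \qquad \varphi_P = \arg\big( \psi(s_1), \psi(s_2) \big), \qquad \varphi_{dyn} = \mathrm{Im} \int_{s_1}^{s_2} ds\, \Big( \psi(s), \frac{d\psi(s)}{ds} \Big) . \tag{6}$$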

Several remarks are in order at this stage. Since $\psi(s)$ is normalised, the inner product $(\psi(s), d\psi(s)/ds)$ 
is in any case pure imaginary. While the quantity $\varphi_{dyn}$ is well defined, the (nonlocal) quantity 
$\varphi_P$, and so $\varphi_g$, are both defined modulo $2\pi$; this is unavoidable. The two pieces $\varphi_P$, $\varphi_{dyn}$ are 
individually invariant under the reparametrisation transformations (3.5); but only their difference 
is invariant under the phase change transformations (3.4), so they are tied together in this way. 
One could have written expressions exactly like the above but referring to only a (connected) subset 
of $\mathcal{C}$, say running from $s_3$ to $s_4$ where $s_1 < s_3 < s_4 < s_2$; but then in a sense the whole of $\mathcal{C}$ would 
not have been used in the construction. 

It now turns out that the real-valued functional $\varphi_g$ defined in eqn.(3.6) is the geometric phase 
associated with the curve $\mathcal{C}$. One thus has reduced the prerequisites needed for the definition 
of this phase really to the bare minimum. Needless to say, all the earlier definitions are easily 
recovered as special cases; moreover it is clear that the definition of $\varphi_g$ for an open $\mathcal{C}$ (“noncyclic 
evolution”) is actually very simple and direct and does not require special considerations. Indeed 
in this kinematic view, the quantity immediately defined is the noncyclic geometric phase, while 
the cyclic case is a simple specialisation. 

The two invariances leading to $\varphi_g$ can now be interpreted. Invariance under local phase changes
means that $\varphi_g$ is actually a functional of the ray space image $C = \pi[\mathcal{C}]$ of
$\mathcal{C}$, not of $\mathcal{C}$ itself. Invariance under reparametrisations entitles the use of the
term ‘geometric’. With these insights, we can rewrite eqn.(3.5) more explicitly, indicating the
arguments of each object:


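$$
\begin{aligned}
\varphi_g[C] &= \varphi_P[\mathcal{C}] - \varphi_{\mathrm{dyn}}[\mathcal{C}], \\
\varphi_P[\mathcal{C}] &= \arg\,(\psi(s_1), \psi(s_2)), \\
\varphi_{\mathrm{dyn}}[\mathcal{C}] &= \mathrm{Im}\int_{s_1}^{s_2} ds\, \left(\psi(s), \frac{d\psi(s)}{ds}\right).
\end{aligned}
$$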


We can now appreciate: given a ray space curve $C$, the calculation of its geometric phase $\varphi_g[C]$
is facilitated by going to any lift $\mathcal{C}$ of $C$, calculating more easily the two quantities
$\varphi_P[\mathcal{C}]$, $\varphi_{\mathrm{dyn}}[\mathcal{C}]$, and then taking their difference; this
difference is lift independent and always gives us $\varphi_g[C]$.
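
As a numerical illustration of this lift independence, the following minimal sketch samples a lift finely and evaluates the two pieces of the geometric phase. It is not taken from the original text: the example curve (a two-component state carried part of the way round a circle of latitude), the helper names, and the approximation of the integral by a sum of small relative phases between neighbouring samples are all assumptions made for this example.

```python
import cmath
import math

def inner(a, b):
    """Inner product (a, b), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def geometric_phase(lift):
    """Return (phi_g, phi_P, phi_dyn) for a finely sampled lift [psi_0, ..., psi_N]."""
    phi_p = cmath.phase(inner(lift[0], lift[-1]))
    # For neighbouring unit vectors, arg(psi_k, psi_{k+1}) approximates
    # Im (psi, dpsi/ds) ds, so the sum approximates phi_dyn.
    phi_dyn = sum(cmath.phase(inner(a, b)) for a, b in zip(lift, lift[1:]))
    return phi_p - phi_dyn, phi_p, phi_dyn

def wrap(angle):
    """Reduce an angle to the interval (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))

# Hypothetical example curve: psi(s) = (cos(theta/2), e^{is} sin(theta/2)), 0 <= s <= s_max.
theta, s_max, n_steps = math.pi / 3, 2.0, 2000
s_values = [s_max * k / n_steps for k in range(n_steps + 1)]
lift = [[math.cos(theta / 2), cmath.exp(1j * s) * math.sin(theta / 2)] for s in s_values]

# A second lift of the same ray space curve: local phase change exp(i * 0.5 * s^2).
other_lift = [[cmath.exp(0.5j * s * s) * c for c in psi] for psi, s in zip(lift, s_values)]

phi_g, phi_p, phi_dyn = geometric_phase(lift)
phi_g_other, phi_p_other, phi_dyn_other = geometric_phase(other_lift)

print("phi_dyn:", phi_dyn, "analytic value:", s_max * math.sin(theta / 2) ** 2)
print("phi_g from the two lifts:", phi_g, phi_g_other)
print("difference modulo 2 pi:", wrap(phi_g_other - phi_g))
```

In this sketch the two lifts have different values of $\varphi_P$ and $\varphi_{\mathrm{dyn}}$ separately, while their difference agrees modulo $2\pi$, which is the lift independence stated above.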

The subscript $P$ in $\varphi_P$ stands for ‘Pancharatnam’; the explanation for this will be given in the
sequel. Since $\varphi_g[C]$ is independent of the choice of the lift $\mathcal{C}$, it is clear that
certain lifts may enjoy special properties. Two of these may be mentioned. ‘Pancharatnam lifts’ are lifts
$\mathcal{C}$ such that the term $\varphi_P[\mathcal{C}]$ vanishes; then the geometric phase reduces to the
dynamical part alone, but there is still a great deal of local phase freedom in the choice of
$\mathcal{C}$. ‘Horizontal lifts’ are lifts $\mathcal{C}$ such that the other term
$\varphi_{\mathrm{dyn}}[\mathcal{C}]$ vanishes, so the geometric phase reduces to the Pancharatnam
contribution $\varphi_P[\mathcal{C}]$ alone. With these lifts the remaining freedom is much more limited.


4 Geodesics and the Bargmann invariants 


Now we develop the link to the 1964 work of Bargmann [5], and show how that is a precursor to
the geometric phase. For this we begin with the concept of geodesics in ray and Hilbert spaces, of
course within the families of unit rays and unit vectors.

Given a curve $\mathcal{C} \subset \mathcal{H}$ with image $C \subset \mathcal{R}$, we can define a
functional $L[C]$ which can be interpreted as the length of $C$. As with the case of $\varphi_g[C]$, it is
simplest to compute $L[C]$ using some lift $\mathcal{C}$ of $C$, and then recognise that because of the
built-in invariances we have a functional of $C$ alone. So, given $\mathcal{C} = \{\psi(s)\}$, we define
to begin with:


$$
u(s) = \frac{d\psi(s)}{ds}, \qquad
u_\perp(s) = u(s) - \psi(s)\,(\psi(s), u(s)). \tag{8}
$$


The component $u_\perp(s)$ of $u(s)$ is seen to ‘transform covariantly’, ie., in the same way as
$\psi(s)$ itself, under local phase changes. We then set up $L[C]$ as:


$$
L[C] = \int_{s_1}^{s_2} ds\, \|u_\perp(s)\|
     = \int_{s_1}^{s_2} ds\, \left\{\|u(s)\|^2 - |(\psi(s), u(s))|^2\right\}^{1/2}, \tag{9}
$$


and regard this as the length of the ray space curve $C$. It is clear that it too possesses the
invariances of $\varphi_g[C]$, namely local phase change and reparametrisation invariances. It is also
seen to be nondegenerate and nonnegative, thus specifying a Riemannian metric on $\mathcal{R}$. This is
the well known Fubini-Study metric.

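We now define geodesics in $\mathcal{R}$ to be those curves $C$ which, for given end points
$\rho_1, \rho_2$, minimize the length $L[C]$:

$$
\rho_1 = \psi_1\psi_1^\dagger,\ \rho_2 = \psi_2\psi_2^\dagger \ \text{fixed}, \quad
\delta L[C] = 0 \ \Rightarrow\ C \ \text{is a geodesic from } \rho_1 \text{ to } \rho_2. \tag{10}
$$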


The corresponding Euler-Lagrange differential equations can be easily derived, and exploiting
the invariances of $L[C]$ they can be solved in the most general case [8]. The result of this analysis
is that the most general geodesic $C_{\mathrm{geo}}$ in $\mathcal{R}$ can be lifted to a (horizontal)
curve $\mathcal{C}_{\mathrm{geo}}$ in $\mathcal{H}$, and an (affine) parametrisation can be chosen, so
that $\mathcal{C}_{\mathrm{geo}}$ has the following extremely simple description:


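$$
\begin{aligned}
& C_{\mathrm{geo}} \subset \mathcal{R}, \quad \mathcal{C}_{\mathrm{geo}} \subset \mathcal{H}: \quad
  \mathcal{C}_{\mathrm{geo}} = \{\psi(s)\}, \\
& \psi(s) = \phi_1 \cos s + \phi_2 \sin s, \\
& (\phi_1, \phi_1) = (\phi_2, \phi_2) = 1, \quad (\phi_1, \phi_2) = 0.
\end{aligned} \tag{11}
$$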


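Thus $\mathcal{C}_{\mathrm{geo}}$ is just an arc of a plane two-dimensional circle in $\mathcal{H}$! (The
range of $s$ can be adjusted so that $C_{\mathrm{geo}}$ runs from given $\rho_1$ to given $\rho_2$ in
$\mathcal{R}$.) We see that for $\mathcal{C}_{\mathrm{geo}}$, both $\varphi_P[\mathcal{C}_{\mathrm{geo}}]$
and $\varphi_{\mathrm{dyn}}[\mathcal{C}_{\mathrm{geo}}]$ vanish, so the geometric phase for
$C_{\mathrm{geo}}$ also vanishes: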

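$$
\begin{aligned}
& C_{\mathrm{geo}} = \text{geodesic in } \mathcal{R}, \quad
  \mathcal{C}_{\mathrm{geo}} = \text{horizontal affinely parametrised lift}: \\
& \varphi_P[\mathcal{C}_{\mathrm{geo}}] = \varphi_{\mathrm{dyn}}[\mathcal{C}_{\mathrm{geo}}] = 0, \\
& \varphi_g[C_{\mathrm{geo}}] = 0.
\end{aligned} \tag{12}
$$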

This result, the vanishing of the geometric phase for any geodesic in $\mathcal{R}$, is the key result
that will lead to the link to the Bargmann invariants.
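
The statements about the geodesic lift can be checked numerically. The sketch below is an illustration only: the orthonormal pair, the parameter range and the midpoint-rule integration are choices made for this example, not part of the original text. It evaluates $\varphi_P$, $\varphi_{\mathrm{dyn}}$ and the length functional $L[C]$ for $\psi(s) = \phi_1 \cos s + \phi_2 \sin s$.

```python
import cmath
import math

def inner(a, b):
    """Inner product (a, b), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Example orthonormal pair phi_1, phi_2 in a three-dimensional Hilbert space.
phi_1 = [1.0, 0.0, 0.0]
phi_2 = [0.0, 1 / math.sqrt(2), 1j / math.sqrt(2)]

def psi(s):
    """Point on the lifted geodesic: phi_1 cos s + phi_2 sin s."""
    return [math.cos(s) * a + math.sin(s) * b for a, b in zip(phi_1, phi_2)]

def u(s):
    """Tangent vector d psi / ds."""
    return [-math.sin(s) * a + math.cos(s) * b for a, b in zip(phi_1, phi_2)]

# Parameter range shorter than pi/2, so that (psi(s_1), psi(s_2)) = cos(s_2 - s_1) > 0.
s_1, s_2, n_steps = 0.2, 1.3, 1000
ds = (s_2 - s_1) / n_steps
midpoints = [s_1 + (k + 0.5) * ds for k in range(n_steps)]

phi_p = cmath.phase(inner(psi(s_1), psi(s_2)))
phi_dyn = sum(inner(psi(s), u(s)).imag * ds for s in midpoints)
length = sum(
    math.sqrt(inner(u(s), u(s)).real - abs(inner(psi(s), u(s))) ** 2) * ds
    for s in midpoints
)

print("phi_P:", phi_p, "phi_dyn:", phi_dyn, "phi_g:", phi_p - phi_dyn)
print("length:", length, "parameter range:", s_2 - s_1)
```

Both phases vanish up to rounding, and the length equals the parameter range, as expected for an affinely parametrised arc.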

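In the above we worked with specially chosen and convenient lifts $\mathcal{C}_{\mathrm{geo}}$ of a
geodesic $C_{\mathrm{geo}}$ in $\mathcal{R}$. It is convenient to say that any lift $\mathcal{C}$ of a
geodesic in $\mathcal{R}$ is a geodesic in $\mathcal{H}$:

$$
\delta L[C] = 0, \quad \pi[\mathcal{C}] = C \ \Rightarrow\ \mathcal{C} \ \text{is a geodesic in } \mathcal{H}. \tag{13}
$$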

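Then one has a more flexible result than (4.5):

$$
\delta L[C] = 0, \quad \pi[\mathcal{C}] = C \ \Rightarrow\
\varphi_g[C] = 0, \quad \varphi_P[\mathcal{C}] = \varphi_{\mathrm{dyn}}[\mathcal{C}]. \tag{14}
$$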

This result will be used below. 

Now we define the Bargmann invariants [5]. Let $\psi_1, \psi_2, \ldots, \psi_n$ be any sequence of n unit
vectors in $\mathcal{H}$; they do not have to be mutually orthogonal. Their images in $\mathcal{R}$ will
be denoted as $\rho_1, \rho_2, \ldots, \rho_n$. The n-vertex Bargmann invariant is then defined as the
product of a sequence of scalar products

$$
\Delta_n(\psi_1, \psi_2, \ldots, \psi_n) = (\psi_1, \psi_2)(\psi_2, \psi_3) \cdots (\psi_{n-1}, \psi_n)(\psi_n, \psi_1). \tag{15}
$$

For definiteness, to avoid trivialities, we assume all the factors are nonzero. We see that this
expression is invariant under independent phase changes in each of the vectors $\psi_1, \ldots, \psi_n$
involved in its construction. Thus it must really be expressible as a ray space quantity; this is made
explicit by writing it as

$$
\Delta_n(\psi_1, \psi_2, \ldots, \psi_n) = \mathrm{Tr}\,(\rho_1 \rho_2 \cdots \rho_n). \tag{16}
$$





For n = 1 or n = 2 we have trivial situations: $\Delta_1(\psi_1)$ is unity, and
$\Delta_2(\psi_1, \psi_2)$ is real positive. It is from n = 3 onwards that we have something interesting:
in general, for $n \geq 3$, $\Delta_n(\psi_1, \ldots, \psi_n)$ is a complex quantity. It can however be
easily shown that the essentially new object is $\Delta_3(\psi_1, \psi_2, \psi_3)$: for any larger value
of n, $\Delta_n$ can be written as the ratio of a product of $\Delta_3$’s by a product of $\Delta_2$’s.
Such an expression however tends to obscure the symmetry of $\Delta_n(\psi_1, \psi_2, \ldots, \psi_n)$
under cyclic permutations of its arguments.
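
The properties just listed are easy to check numerically. The following sketch is illustrative only: the four sample vectors and all helper names are arbitrary choices made here. It verifies the trace form of the invariant, its invariance under independent phase changes, the trivial cases n = 1, 2, and one way of writing $\Delta_4$ as a ratio, namely $\Delta_4(\psi_1,\psi_2,\psi_3,\psi_4) = \Delta_3(\psi_1,\psi_2,\psi_3)\,\Delta_3(\psi_1,\psi_3,\psi_4)/\Delta_2(\psi_1,\psi_3)$.

```python
import cmath
import math

def inner(a, b):
    """Inner product (a, b), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def normalise(v):
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]

def bargmann(vectors):
    """Delta_n = (psi_1, psi_2)(psi_2, psi_3) ... (psi_n, psi_1)."""
    result = 1.0 + 0.0j
    for a, b in zip(vectors, vectors[1:] + vectors[:1]):
        result *= inner(a, b)
    return result

def projector(psi):
    """Ray space image rho = psi psi^dagger as a nested list."""
    return [[x * y.conjugate() for y in psi] for x in psi]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Arbitrary sample unit vectors in a three-dimensional Hilbert space.
psi_1 = normalise([1, 1j, 0.5])
psi_2 = normalise([0.3, 1, 1 - 1j])
psi_3 = normalise([1j, 0.2, 1])
psi_4 = normalise([1, 1, 1j])

delta_3 = bargmann([psi_1, psi_2, psi_3])
trace_3 = trace(matmul(matmul(projector(psi_1), projector(psi_2)), projector(psi_3)))

# Independent phase changes of each vector leave Delta_3 unchanged.
rephased = [[cmath.exp(1j * alpha) * x for x in psi]
            for psi, alpha in zip([psi_1, psi_2, psi_3], [0.7, -1.9, 2.4])]
delta_3_rephased = bargmann(rephased)

delta_4 = bargmann([psi_1, psi_2, psi_3, psi_4])
ratio = (bargmann([psi_1, psi_2, psi_3]) * bargmann([psi_1, psi_3, psi_4])
         / bargmann([psi_1, psi_3]))

print("Delta_3:", delta_3, "trace form:", trace_3)
print("Delta_4:", delta_4, "ratio form:", ratio)
```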

Now comes the link to the geometric phase. The definition (4.8) of the n-vertex Bargmann
invariant requires only the choice of n ‘vertices’ $\rho_1, \rho_2, \ldots, \rho_n$ in $\mathcal{R}$ (in a
definite sequence), with the vectors $\psi_1, \psi_2, \ldots, \psi_n$ in $\mathcal{H}$ projecting onto the
respective vertices. So far, we have no geometric phases. But to bring them in, we now create a closed
figure in $\mathcal{R}$ by connecting $\rho_1$ to $\rho_2$, $\rho_2$ to $\rho_3$, ..., $\rho_n$ to
$\rho_1$ by successive geodesic arcs; this results in an n-sided polygon in $\mathcal{R}$ with geodesic
sides, and one can then ask for the geometric phase for this ‘cyclic evolution’. Going back to the
basic definition (3.6) and repeatedly exploiting the special property (4.7) of geodesics in
$\mathcal{R}$ and their lifts in $\mathcal{H}$, we find [8]:

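$$
\begin{aligned}
& \varphi_g[\text{n-sided polygon in } \mathcal{R}, \text{ vertices } \rho_1, \rho_2, \ldots, \rho_n, \text{ geodesic sides}] \\
& \quad = \arg\,(\psi_1, \psi_1) - \varphi_{\mathrm{dyn}}[\mathcal{C}_{12} \cup \mathcal{C}_{23} \cup \mathcal{C}_{34} \cup \cdots \cup \mathcal{C}_{n-1,n} \cup \mathcal{C}_{n,1}] \\
& \quad = -\varphi_{\mathrm{dyn}}[\mathcal{C}_{12}] - \varphi_{\mathrm{dyn}}[\mathcal{C}_{23}] - \cdots - \varphi_{\mathrm{dyn}}[\mathcal{C}_{n,1}] \\
& \quad = -\varphi_P[\mathcal{C}_{12}] - \varphi_P[\mathcal{C}_{23}] - \cdots - \varphi_P[\mathcal{C}_{n,1}] \\
& \quad = -\arg\,(\psi_1, \psi_2) - \arg\,(\psi_2, \psi_3) - \cdots - \arg\,(\psi_n, \psi_1) \\
& \quad = -\arg \Delta_n(\psi_1, \psi_2, \ldots, \psi_n).
\end{aligned} \tag{17}
$$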

Here $\mathcal{C}_{12}, \mathcal{C}_{23}, \ldots$ are any lifts of the geodesics from $\rho_1$ to
$\rho_2$, $\rho_2$ to $\rho_3$, ...; they run in $\mathcal{H}$ from $\psi_1$ to $\psi_2$, $\psi_2$ to
$\psi_3$, .... Equation (4.10) is the connection between Bargmann invariants and geometric
phases we are seeking. It is clear that this connection is a direct and simple consequence of the
key property (4.5,7) of geodesics: their geometric phases vanish. It should also be clear that for
this derivation to be meaningful, it was necessary to regard open curve geometric phases as the
primary quantities, and closed curve geometric phases as derived ones.

We may reemphasize the following point. The definition of the Bargmann invariant (4.8)
requires only the choice of a sequence of vertices in $\mathcal{R}$; there is no need to join these
vertices in any way to form a (closed) figure. The latter step has to be taken only to find a connection
to geometric phases; when this is done, we see that phases of Bargmann invariants are (apart from a
sign) geometric phases.

The point just made, however, motivates us to enlarge this connection. One can ask for the most
general kind of curve that may be used to connect the vertices of a Bargmann invariant, such that
the phase of the latter is the geometric phase of the resulting closed figure in $\mathcal{R}$. This
question can be answered and leads to the concept of null phase curves [13]. Geodesics are examples of
null phase curves, but the latter are a much wider family than the former. These curves lead to
the widest possible generalisation of the connection between Bargmann invariants and geometric
phases. A given $C$ in $\mathcal{R}$ with lift $\mathcal{C}$ in $\mathcal{H}$ is a null phase curve if
and only if

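$$
\begin{aligned}
& \varphi_g[\text{any connected portion of } C] = 0, \\
& \text{ie. } \varphi_P[\text{any connected portion of } \mathcal{C}] = \varphi_{\mathrm{dyn}}[\text{that portion of } \mathcal{C}].
\end{aligned} \tag{18}
$$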

Such curves can be nicely characterised at both ray and Hilbert space levels. Then one finds as a
generalisation of eqn.(4.10):

$$
\arg \Delta_n(\psi_1, \psi_2, \ldots, \psi_n) = -\varphi_g[\text{n-sided figure in } \mathcal{R}, \text{ vertices } \rho_1, \rho_2, \ldots, \rho_n, \text{ sides as null phase curves}]. \tag{19}
$$

This generalisation is indeed physically significant, and it leads to important results, for instance
with respect to coherent states [14], the interpretation of the classical optical Gouy phase as a
geometric phase [15], etc.





In concluding this Section we may also point out that the link (4.10) can be exploited to recover 
the original definitions (3.5,6) of the geometric phase by a limiting procedure[8]. All this should 
convince the reader of the intimate genetic relationship between Bargmann and Berry phases. 


5 Pancharatnam phases 

Now we turn to the third concept in this account, the Pancharatnam phase. Pancharatnam’s
original work [4] was in the context of classical polarization optics, and he was concerned with the
phase relations between two pure states of polarization of a plane electromagnetic wave, such that
their interference upon superposition would be maximal. We now explain his ideas in the language
of quantum mechanics [16].

Let $\psi_1$ and $\psi_2$ be two unit vectors in the Hilbert space $\mathcal{H}$ of some quantum system.
We ask the question: is there a way to define and measure the ‘relative phase’ of $\psi_2$ with respect
to $\psi_1$, and as a particular case can we give a meaning to the expression ‘$\psi_1$ and $\psi_2$ are
in phase’? Pancharatnam’s answer, motivated as we said above by the study of superposed pure polarized
plane electromagnetic waves, amounts to the following: the vectors $\psi_1$ and $\psi_2$ are to be
declared as ‘in phase’ if their Hilbert space inner product $(\psi_1, \psi_2)$ is real positive; if
$(\psi_1, \psi_2)$ is complex, then its phase can be defined to be the phase of $\psi_2$ with respect to
$\psi_1$. (Naturally we assume that $\psi_1$ and $\psi_2$ are not orthogonal.)

Accepting this definition, we see immediately that the first term $\varphi_P[\mathcal{C}]$ in the
definition (3.5,6) of the geometric phase is indeed the phase of the end point $\psi(s_2)$ of
$\mathcal{C}$ with respect to the initial point $\psi(s_1)$ of $\mathcal{C}$ in the Pancharatnam sense.
(This indeed explains our notation for this piece.) Calling this a ‘Pancharatnam phase’ for a moment, we
see that it is one of the two ingredients entering into the definition of a general open curve geometric
phase. The other is the kinematic version of the Aharonov-Anandan definition of the dynamical phase. So
this is one point where the Pancharatnam and the Berry ideas meet.

A natural question now arises: is the Pancharatnam ‘in-phase’ definition transitive? In detail,
let $\psi_1, \psi_2, \psi_3$ be any three unit vectors in $\mathcal{H}$ (no two being orthogonal). Suppose
$\psi_1$ and $\psi_2$ are ‘in phase’, and also $\psi_2$ and $\psi_3$ are ‘in phase’, both in the
Pancharatnam sense. Then what can we say about the $\psi_1$ - $\psi_3$ relationship? Are they also ‘in
phase’? It turns out they are in general not in phase: the Pancharatnam criterion or definition is not
transitive, and the mere existence of the Bargmann invariants makes all this immediately obvious! Let us
see how this happens.

Given $\psi_1, \psi_2, \psi_3$, first form the three-vertex Bargmann invariant

$$
\Delta_3(\psi_1, \psi_2, \psi_3) = (\psi_1, \psi_2)(\psi_2, \psi_3)(\psi_3, \psi_1). \tag{20}
$$

We are free here to alter the phases of the $\psi$’s independently without changing the left hand side.
Keeping $\psi_1$ fixed, let us adjust the overall phase of $\psi_2$ so that the first factor
$(\psi_1, \psi_2)$ is real positive. This can be done, and it renders $\psi_1$ and $\psi_2$ ‘in phase’ in
the Pancharatnam sense. Then let us adjust the overall phase of $\psi_3$ so that the second factor
$(\psi_2, \psi_3)$ also becomes real positive, and $\psi_2$ and $\psi_3$ are also ‘in phase’. But now we
have no more freedom left! If $\Delta_3(\psi_1, \psi_2, \psi_3)$ was complex to begin with, and this we
recall is a ray space statement, then its phase must be reproduced by the third factor
$(\psi_3, \psi_1)$ in the product (5.1). But this phase is known to be a geometric phase. Thus the degree
of nontransitivity of the Pancharatnam in-phase concept is expressed via a geometric phase which is just
the phase of a three-vertex Bargmann invariant:

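$$
\begin{aligned}
& \psi_1 \text{ and } \psi_2 \text{ ‘in phase’}, \ \psi_2 \text{ and } \psi_3 \text{ ‘in phase’} \ \Rightarrow \\
& \text{‘phase difference’ between } \psi_3 \text{ and } \psi_1 = \arg \Delta_3(\psi_1, \psi_2, \psi_3).
\end{aligned} \tag{21}
$$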
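
The phase-adjustment argument above can be followed step by step in a short numerical sketch. The three sample vectors and the helper `in_phase_with` are hypothetical choices made for this illustration; they are not taken from the original text.

```python
import cmath
import math

def inner(a, b):
    """Inner product (a, b), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def normalise(v):
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]

def in_phase_with(reference, vector):
    """Multiply `vector` by a phase factor so that (reference, vector) is real positive."""
    alpha = cmath.phase(inner(reference, vector))
    return [cmath.exp(-1j * alpha) * x for x in vector]

# Arbitrary sample unit vectors, no two of them orthogonal.
psi_1 = normalise([1, 1j, 0.5])
psi_2 = normalise([0.3, 1, 1 - 1j])
psi_3 = normalise([1j, 0.2, 1])

# Three-vertex Bargmann invariant, computed before any phase adjustment.
delta_3 = inner(psi_1, psi_2) * inner(psi_2, psi_3) * inner(psi_3, psi_1)

# Step 1: put psi_2 in phase with psi_1.  Step 2: put psi_3 in phase with the new psi_2.
psi_2_adj = in_phase_with(psi_1, psi_2)
psi_3_adj = in_phase_with(psi_2_adj, psi_3)

# No freedom is left: the phase of the third factor is now fixed.
mismatch = cmath.phase(inner(psi_3_adj, psi_1))

print("arg Delta_3:", cmath.phase(delta_3))
print("arg (psi_3, psi_1) after the two adjustments:", mismatch)
```

The two printed phases coincide and are nonzero for these vectors, so $\psi_3$ and $\psi_1$ end up out of phase by exactly the phase of the Bargmann invariant.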

This discussion must make it clear how closely interwoven the three streams of ideas really are. 
For emphasis, at the risk of repetition, let us explicitly bring out these interconnections once again 
in a series of short statements: 





(i) The Pancharatnam relative phase $\arg\,(\psi_1, \psi_2)$ between the end points of a curve
$\mathcal{C} \subset \mathcal{H}$ is one of the ingredients in the calculation of the geometric phase
$\varphi_g[C]$.

(ii) Phases of Bargmann invariants are geometric phases of suitably constructed closed ray
space figures.

(iii) The very existence of the Bargmann invariants tells us why the Pancharatnam ‘in phase’
concept is not transitive.

(iv) This extent of nontransitivity is captured in the phase of a Bargmann invariant, and this
is a geometric phase.

At this point, let us mention briefly the original calculation of Pancharatnam, expressing it
again in the present quantum mechanical language [16]. We have to deal with a two-level quantum
system, for which the (complex) dimension of $\mathcal{H}$ is two. The corresponding ray space is just the
Poincaré sphere $S^2$, which is of real dimension two. If we take three unit vectors
$\psi_1, \psi_2, \psi_3 \in \mathcal{H}$, their projections to ray space give three real unit vectors
$n_1, n_2, n_3 \in S^2$. If we connect these points with geodesics or great circle arcs on $S^2$, we
obtain a spherical triangle on $S^2$. Pancharatnam’s original result was that if $\psi_1$ and $\psi_2$
are ‘in phase’, and similarly $\psi_2$ and $\psi_3$ are ‘in phase’, then $\psi_1$ and $\psi_3$ are ‘out
of phase’, and their ‘phase difference’ (ie, $\arg\,(\psi_3, \psi_1)$) is precisely one half of the
solid angle subtended at the origin of $S^2$ by the spherical triangle having $n_1, n_2, n_3$ for
vertices:

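$$
\begin{aligned}
& \dim \mathcal{H} = 2, \quad \psi_1, \psi_2, \psi_3 \in \mathcal{H} \ \to\ \pi(\psi_{1,2,3}) = n_{1,2,3} \in S^2: \\
& \arg \Delta_3(\psi_1, \psi_2, \psi_3) = \tfrac{1}{2}\,\Omega, \\
& \Omega = \text{solid angle of geodesic triangle with vertices } n_1, n_2, n_3.
\end{aligned} \tag{22}
$$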

We see many important features of this early result; it is an instance of the relation (4.10) between
Bargmann invariants and Berry phases; and as expected the extent to which the amplitudes $\psi_3$
and $\psi_1$ are out of phase is determined by a ray space quantity. Lastly, the solid angle on $S^2$ is
an eminently geometric notion.
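
Pancharatnam's half-solid-angle relation can be checked numerically for a two-level system. In the sketch below, the parametrisation of the state by polar angles, the closed-form expression used for the signed solid angle of a spherical triangle, and the convention that $\Omega$ is positive when $n_1, n_2, n_3$ form a right-handed triple are all assumptions of this example; with the opposite orientation convention the sign of $\Omega$ is reversed.

```python
import cmath
import math

def inner(a, b):
    """Inner product (a, b), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def state(n):
    """Unit vector in C^2 whose ray corresponds to the point n = (x, y, z) on the sphere."""
    x, y, z = n
    theta, phi = math.acos(z), math.atan2(y, x)
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def solid_angle(a, b, c):
    """Signed solid angle of the geodesic triangle with unit-vector vertices a, b, c."""
    return 2 * math.atan2(dot(a, cross(b, c)), 1 + dot(a, b) + dot(b, c) + dot(c, a))

def arg_delta_3(a, b, c):
    """Phase of the three-vertex Bargmann invariant for the states above a, b, c."""
    p, q, r = state(a), state(b), state(c)
    return cmath.phase(inner(p, q) * inner(q, r) * inner(r, p))

# One octant of the sphere: solid angle pi/2, so the Bargmann phase should be pi/4.
octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
# A second, less symmetric triangle.
generic = ((0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.8, 0.6))

for name, tri in (("octant", octant), ("generic", generic)):
    print(name, "arg Delta_3:", arg_delta_3(*tri), "Omega / 2:", solid_angle(*tri) / 2)
```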


6 Geometric phases from Lie group unitary representations 

In many physically interesting situations involving geometric phases, one finds that some unitary
representation of some Lie group plays an important role. Thus even though we have not given
full details we may mention the following examples: in polarization optics and in the quantum
mechanics of a spin 1/2 particle with magnetic moment in a magnetic field, one has to work with
the defining representation of the group SU(2) [1]; geometric phases associated with coherent states
involve the Heisenberg-Weyl group [14]; squeezing transformations and related geometric phases,
as well as the classical optical Gouy phase [15], are concerned with a unitary representation of the
metaplectic group Mp(2); three level system geometric phases bring in the defining representation
of the group SU(3) [17]. All these situations suggest that we study in general terms the structures
of geometric and dynamical phases produced by unitary Lie group representations. In this Section,
we give a brief account of this aspect of geometric phases, just to show the rich interplay between
Lie group - Lie algebra structures and geometric phases [9].

Let $G$ be an n-dimensional connected Lie group, either compact or noncompact; and let it be
unitarily and faithfully represented by operators $U(g)$ on the Hilbert space $\mathcal{H}$ of a quantum
system. We do not require irreducibility of this representation. The hermitian generators $T_r$,
$r = 1, 2, \ldots, n$, of $U(g)$ are operators on $\mathcal{H}$ obeying the commutation relations
corresponding to the Lie algebra $\underline{G}$ of $G$:

$$
[T_r, T_s] = i f_{rs}{}^{t}\, T_t. \tag{23}
$$

Here the coefficients $f_{rs}{}^{t}$ are the real structure constants of $G$. Now choose and keep fixed a
fiducial unit vector $\psi_0 \in \mathcal{H}$. We are interested in the calculation of dynamical phases
for curves $\mathcal{C} \subset \mathcal{H}$ which begin from $\psi_0$ and are built up by continuous
group action. For this purpose some formal definitions are necessary.

The orbit $\vartheta(\psi_0)$ of $\psi_0$ is the set of unit vectors in $\mathcal{H}$ produced by action
of all elements of $G$ on $\psi_0$:

$$
\vartheta(\psi_0) = \{\psi(g) = U(g)\psi_0 \mid \psi_0 \ \text{fixed}, \ g \in G\}. \tag{24}
$$

We may picture this as a hypersurface of real dimension $\leq n$ embedded in the unit sphere in
$\mathcal{H}$; evidently $G$ acts transitively on $\vartheta(\psi_0)$. Next we define two stability
groups $H_0$, $H$ associated with $\psi_0$:

$$
H_0 = \{g \in G \mid U(g)\psi_0 = \psi_0\} \subset G; \tag{25}
$$

$$
H = \{g \in G \mid U(g)\psi_0 = (\text{phase factor})\,\psi_0\} \subset G. \tag{26}
$$

$H_0$ is the stability group of $\psi_0$ in the strict sense, while $H$ is the stability group of
$\psi_0$ up to phases. Both are subgroups of $G$, and moreover $H_0$ is a normal subgroup of $H$. We can
see that the relationship between $H_0$ and $H$ must be one of three possibilities, depending on the
nature of the factor group $H/H_0$:

$$
\begin{aligned}
& \text{(i)} \ H/H_0 = \text{trivial}; \\
& \text{(ii)} \ H/H_0 = \text{nontrivial discrete}; \\
& \text{(iii)} \ H/H_0 = U(1).
\end{aligned} \tag{27}
$$

Turning to the Lie algebras, in cases (6.4(i),(ii)) it is clear that $H_0$ and $H$ share the same
generators. Let us denote them by $T_a$, where $a = 1, 2, \ldots, k = \dim H_0$; so these obey a set
of commutation relations on their own, involving the structure constants of $H_0$, and they all
annihilate $\psi_0$:

$$
[T_a, T_b] = i f_{ab}{}^{c}\, T_c, \quad a, b, c = 1, \ldots, k; \tag{28}
$$

$$
T_a \psi_0 = 0, \quad a = 1, 2, \ldots, k. \tag{29}
$$

At this point we recognize that the (real) dimension of $\vartheta(\psi_0)$ is $(n - k)$. We can now add
further generators $T_\mu$ of $G$, where $\mu$ takes on $(n - k)$ distinct values. One then has the
additional information:

$$
\underline{G} = \mathrm{span}\,\{T_a, T_\mu\}, \qquad
c^\mu T_\mu \psi_0 = 0 \ \Rightarrow\ c^\mu = 0. \tag{30}
$$

At this point we make an important assumption which is indeed valid in all physically important
situations. This is the assumption that $H_0$ is a reductive subgroup in $G$. This has the consequence
that the additional elements $T_\mu$ needed to make up a basis for the Lie algebra $\underline{G}$ of
$G$ can be chosen so that we have the commutation relations

$$
[T_a, T_\mu] = i f_{a\mu}{}^{\nu}\, T_\nu, \tag{31}
$$

the point being that no generators $T_b$ of $H_0$ appear on the right hand side. In finite terms this is
the statement that the $T_\mu$ furnish some collection of irreducible tensor operators with respect to
$H_0$. This conclusion is also obtained in case $H_0$ is compact, as this guarantees the full
reducibility of any representation of $H_0$; this is a special case of being reductive.

In the case (6.4(iii)) the situation is a bit different. The generators of $H$ consist of the generators
$T_a$ of $H_0$, and one extra generator which we denote as $Y$ and which can be assumed to be normalised
in a particular way. The relevant formulae in this case are:

$$
T_a \psi_0 = 0, \quad Y \psi_0 = \psi_0; \qquad [T_a, Y] = 0. \tag{32}
$$

Thus $Y$ may be counted as one of the $T_\mu$, which happens to be an $H_0$-scalar generator.

One more concept is needed at the Lie algebra level before we can turn to dynamical phases.
We denote by $\{T_p\}$ all those generators of $G$ which are scalars with respect to the stability group
$H$, and which moreover lie outside of $H_0$. Thus in terms of the three possibilities listed in
eqn.(6.4) for the relationship between $H_0$ and $H$, we have:

$$
\text{Cases (6.4(i),(ii))}: \quad T_p = T_\mu \ \text{such that} \ [T_a, T_p] = 0; \tag{33}
$$

$$
\text{Case (6.4(iii))}: \quad T_p = T_\mu \ \text{such that} \ [T_a, T_p] = [Y, T_p] = 0, \quad Y = \text{one of the } T_p. \tag{34}
$$

The importance of these generators of $G$ is that, because of the Wigner-Eckart theorem and keeping
in mind eqn.(6.8),

$$
(\psi_0, T_r \psi_0) \neq 0 \ \Rightarrow\ T_r = \text{one of the } T_p. \tag{35}
$$
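
A familiar special case may help to fix ideas. The sketch below is an illustration under stated assumptions, not a computation from the original text: it takes $G = SU(2)$ in its defining representation with generators $T_r = \sigma_r/2$ and the fiducial vector $\psi_0 = (1, 0)$. For this choice $H_0$ is trivial, $H$ is the $U(1)$ subgroup generated by $T_3$ (case (iii)), and $Y = 2T_3$. The code checks the commutation relations with structure constants $\epsilon_{rst}$ and confirms that only the $H$-scalar generator $T_3$ has a nonvanishing expectation value in $\psi_0$.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def epsilon(r, s, t):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (r - s) * (s - t) * (t - r) / 2

# Generators T_r = sigma_r / 2 of the defining representation of SU(2) (example choice).
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# Largest deviation from [T_r, T_s] = i epsilon_{rst} T_t over all r, s and matrix entries.
max_violation = 0.0
for r in range(3):
    for s in range(3):
        lhs = commutator(T[r], T[s])
        for i in range(2):
            for j in range(2):
                rhs = sum(1j * epsilon(r, s, t) * T[t][i][j] for t in range(3))
                max_violation = max(max_violation, abs(lhs[i][j] - rhs))

# Fiducial vector and the expectation values (psi_0, T_r psi_0).
psi_0 = [1, 0]
expectations = [
    sum(psi_0[i].conjugate() * T[r][i][j] * psi_0[j] for i in range(2) for j in range(2))
    for r in range(3)
]

# Y = 2 T_3 is normalised so that Y psi_0 = psi_0.
Y_psi_0 = [sum(2 * T[2][i][j] * psi_0[j] for j in range(2)) for i in range(2)]

print("largest violation of the commutation relations:", max_violation)
print("expectation values of T_1, T_2, T_3:", expectations)
print("Y psi_0:", Y_psi_0)
```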

Now let us specify the Hilbert space curves $\mathcal{C}$ for which we wish to examine the dynamical
phase $\varphi_{\mathrm{dyn}}[\mathcal{C}]$. These are curves lying in the orbit $\vartheta(\psi_0)$,
produced by a parametrised curve $\{g(s)\}$ of group elements acting on $\psi_0$:

$$
\mathcal{C} = \{\psi(s) = \psi(g(s)) = U(g(s))\psi_0 \mid g(s) \in G, \ g(0) \in H_0, \ 0 \leq s \leq s_1\}. \tag{36}
$$

Here we have let the parameter $s$ run from 0 to some $s_1$, and have ensured that the curve starts out
from $\psi_0$ by demanding $g(0)$ be in the subgroup $H_0$. Now by standard and well-known arguments
we know that the orbit $\vartheta(\psi_0)$ can be identified in a natural way with the coset space
$G/H_0$. This gives us the freedom to either think of the curve $\mathcal{C}$ as embedded in Hilbert
space, or as lying in the coset space $G/H_0$. The former view makes available the unitary representation
$U(g)$ of $G$ and its generators; while the latter makes available the rich differential geometric
structures known to exist on $G$ and on $G/H_0$. Correspondingly it now turns out that there is a clean
separation of $\varphi_{\mathrm{dyn}}[\mathcal{C}]$ into algebraic and geometric ingredients. On the
manifold of the group itself we know that there are left and right invariant vector fields, generating
the action of $G$ on itself by right and by left translations respectively. Dual to these sets of vector
fields are the two sets of one forms, the Maurer-Cartan one forms, conventionally written as
$\theta^{(0)r}$ and $\tilde{\theta}^{(0)r}$ and enjoying respectively left and right translation
invariance. These forms can be viewed as paired in a reciprocal manner with the generators $T_r$ of the
representation $U(\cdot)$ of $G$, in fact with a basis for the Lie algebra $\underline{G}$. The Lie
algebra commutation relations among the vector fields are mirrored in the Maurer-Cartan relations obeyed
by $\theta^{(0)r}$ and $\tilde{\theta}^{(0)r}$. Now when we descend from the group $G$ to the coset space
$G/H_0$ (and for definiteness we take $G/H_0$ to be the space of right cosets $gH_0$), the right
invariant vector fields generating left translations do project down properly to globally defined vector
fields $X_r$ on $G/H_0$, generating the transitive action of $G$ on $G/H_0$. The one forms
$\theta^{(0)r}$, $\tilde{\theta}^{(0)r}$ in general give rise to only locally defined one forms over
$G/H_0$, except for those which are paired with (or go with) the $H$-scalar generators $T_p$ lying
outside $H_0$. These lead to a set of one-forms $\theta^p$ on $G/H_0$ which are globally well-defined.

These one-forms $\theta^p$ are crucial for dynamical phase calculations. Indeed one finds, after a
careful calculation, the interesting and compact result

$$
\varphi_{\mathrm{dyn}}[\mathcal{C}] = -(\psi_0, T_p \psi_0) \int_{\mathcal{C} \subset G/H_0} \theta^p. \tag{37}
$$

Here a sum on the repeated index $p$ is implied. As indicated earlier, the standard Wigner-Eckart
theorem of quantum mechanics plays a role in the derivation of this result, and in this context the
property of the generators $T_p$ expressed by eqn.(6.10) becomes relevant. Beyond this practical use
of the Wigner-Eckart theorem, there is also a similarity in spirit between that theorem and the
final result (6.12) for dynamical phases, in the sense that there is a clean separation into algebraic
representation dependent factors, and differential geometric $\mathcal{C}$-dependent factors. Indeed,
even the dependence on the fiducial vector $\psi_0$ occurs only in the former factors.

The result (6.12) is of course one ingredient in the calculation of geometric phases, the other
being the Pancharatnam phase term. In any event, for a given group $G$, if one has a list of all
possible subgroups $H_0$ up to conjugation, the result (6.12) enables one to quickly determine those
situations wherein nontrivial dynamical phases can occur; if due to the selection rules implicit in
(6.12) we have a vanishing dynamical phase, we have a trivial situation wherein the geometric
phase reduces to the nonlocal Pancharatnam contribution.

Applications of the result (6.12) to the groups SU(3), Mp(2) and even to the familiar cases of
SU(2) and SO(3), may be found in the literature [18].

7 Concluding remarks 

We have given a brief account of the theory of the geometric phase emphasizing two aspects - 
the organic links with the much earlier work of Pancharatnam and of Bargmann, and the deep 
connection to group actions and Lie group representations. The perceptive reader would not fail to 
be impressed by the subtlety and richness of the connections between the ideas of Pancharatnam, 
Bargmann and Berry; and it is certainly very satisfying to see how the final form of the geometric 
phase concept has brought all these strands together. We mentioned that Bargmann’s work was 
in the context of the Wigner unitary-antiunitary theorem for the representation of symmetry 
operations in quantum mechanics. It is but fair to say here that recently a new approach to the 
Wigner theorem based on Pancharatnam’s ideas has been presented [19]. 

The geometric phase concept has been exploited in the quantum field theoretic context where 
it links up to the so-called anomalies. A great deal of work has also been done to unravel the 
differential geometric features of this phase[20]. To keep this article within bounds, as well as to 
focus on the two main aspects mentioned above, we have limited ourselves to an account within the
framework of basic quantum mechanics. 


References 

[1] M.V. Berry, Proc. Roy. Soc. A 392, 45 (1984).

[2] A very useful reprint collection on the entire subject is “Geometric Phases in Physics”,
A. Shapere and F. Wilczek (eds.), World Scientific Publishing Co., Singapore (1989).

[3] M.V. Berry, “Quantum adiabatic anholonomy”, Lectures given at the Ferrara School of
Theoretical Physics on “Anomalies, defects, phases....”, June 1989.

[4] S. Pancharatnam, Proc. Indian Acad. Sci. A 44, 247 (1956); attention was called to this
important work by S. Ramaseshan and R. Nityananda, Curr. Sci. 55, 1225 (1986).

[5] V. Bargmann, Jour. Math. Phys. 5, 862 (1964).

[6] Y. Aharonov and J. Anandan, Phys. Rev. Lett. 58, 1593 (1987).

[7] J. Samuel and R. Bhandari, Phys. Rev. Lett. 60, 2339 (1988).

[8] N. Mukunda and R. Simon, Ann. Phys. (NY) 228, 205 (1993).

[9] N. Mukunda and R. Simon, Ann. Phys. (NY) 228, 269 (1993).

[10] R.Y. Chiao and Y.S. Wu, Phys. Rev. Lett. 57, 933 (1986).

[11] R. Bhandari and J. Samuel, Phys. Rev. Lett. 60, 1210 (1988).

[12] R. Simon, H.J. Kimble and E.C.G. Sudarshan, Phys. Rev. Lett. 61, 19 (1988).

[13] E.M. Rabei, Arvind, N. Mukunda and R. Simon, “Bargmann Invariants and Geometric Phases
- A Generalised Connection”, Phys. Rev. A (in press).

[14] S. Chaturvedi, M.S. Sriram and V. Srinivasan, J. Phys. A 20, L1071 (1987).

[15] R. Simon and N. Mukunda, Phys. Rev. Lett. 70, 880 (1993).

[16] For a very lucid account of Pancharatnam’s own ideas, see R. Nityananda, Current Science
67, 238 (1994).

[17] G. Khanna, S. Mukhopadhyay, R. Simon and N. Mukunda, Ann. Phys. (NY) 253, 55 (1997);
Arvind, K.S. Mallesh and N. Mukunda, J. Phys. A 30, 2417 (1997).

[18] See, for instance, N. Mukunda, “Group Theoretical Aspects of the Geometric Phase in Quantum
Mechanics”, invited plenary lecture at the International Conference on Group Theoretical
Methods in Physics, ICGTMP-98, University of Tasmania, Hobart, Tasmania, Australia, July
1998.

[19] J. Samuel, Pramana - Journal of Physics 48, 959 (1997).

[20] The earliest step in this direction, B. Simon, Phys. Rev. Lett. 51, 2167 (1983), actually
appeared in print before reference [1]!



13. The Skyrme Model for Baryons 


J. Schechter and H. Weigel†*

Department of Physics, Syracuse University

Syracuse, NY 13244-1130


Center for Theoretical Physics
Laboratory of Nuclear Science and Department of Physics
Massachusetts Institute of Technology
Cambridge, MA 02139


Abstract 

We review the Skyrme model approach which treats baryons as solitons of an effective 
meson theory. We start out with a historical introduction and a concise discussion of the 
original two flavor Skyrme model and its interpretation. Then we develop the theme, motivated 
by the large Nc approximation of QCD, that the effective Lagrangian of QCD is in fact one 
which contains just mesons of all spins. When this Lagrangian is (at least approximately) 
determined from the meson sector it should then yield a zero parameter description of the 
baryons. We next discuss the concept of chiral symmetry and the technology involved in 
handling the three flavor extension of the model at the collective level. This material is used to 
discuss properties of the light baryons based on three flavor meson Lagrangians containing just 
pseudoscalars and also pseudoscalars plus vectors. The improvements obtained by including 
vectors are exemplified in the treatment of the proton spin puzzle. 

*This work is supported in part by funds provided by the U.S. Department of Energy (D.O.E.) under
cooperative research agreements #DR-FG-02-92ER420231 and #DF-FC02-94ER40818 and the Deutsche
Forschungsgemeinschaft (DFG) under contracts We 1254/3-1 and 1254/4-1.

† Heisenberg-Fellow

1 Historical background and motivation 

The Skyrme model was born around 1960 in a series of increasingly more detailed papers [1]. At 
that time the prevailing dynamical model of nuclear forces was that of Yukawa which had been 
formulated in the 1930’s. Still to come was the concept of fractionally charged quarks and much 
further in the future was the recognition that the correct theory of strong interactions binds these 
quarks together with non-Abelian (color) gauge fields. 

In the Yukawa theory, of course, the nucleons are introduced as fundamental fermion fields while 
the spin zero pion fields are postulated to provide the “glue” which binds protons and neutrons 
into nuclei. This model is acknowledged to work reasonably well as a description of the long range 
interactions of nucleons and the prediction of the existence of pions has been amply confirmed.
Skyrme’s innovation was to provide a model in which the fundamental fields consisted of just 
the pions. The nucleon was then obtained, in the initial approximation, as a certain classical 
configuration of the pion fields. The seeming contradiction of making fermi fields out of bose fields 
was avoided by arranging the classical field configuration to possess a non-zero “winding number”. 
In modern language this “Skyrmion” is an example of a topological soliton. Such objects are
solutions to the classical field equations with localized energy density [2]. They play an important
role nowadays in many areas of physics and the papers of Skyrme are justifiably recognized as 
pioneering milestones in this development. 

The years following this original idea saw the particle physics community actively investigating 
the approaches of quark models, flavor symmetry, current algebra, chiral dynamics, dual resonance 
models and finally color gauge theory to the problem of strong dynamics. Evidently Skyrme’s 
model was lost in the rush. However the novelty of the model did stimulate a few interesting 
papers [3, 4, 5, 6] before the more recent wave of activity in the area. 

At first glance, it might appear that a Lagrangian model built out of only pion fields could 
not be more different as a description of the nucleons from the current picture of three “valence” 
quarks containing a trivalent “color” index and bound together through their interaction with 
SU(3) gauge fields. Remarkably, it has turned out that the Skyrme model is in fact a plausible
approximation to this QCD picture. This may be understood as follows. 

In QCD the gauge coupling constant has an effective strength which decreases for interactions 
at high energy scales (asymptotic freedom) but which increases at the low energy scales which 
are relevant when one considers the binding of quarks into nucleons and other hadrons. Thus the 
application of standard perturbation theory techniques to the problem of low energy interactions 
is not expected to be reliable and in fact has not produced definitive results. A natural alternative 
approach which retains the possibility of using perturbation theory is to imagine that the strong 
underlying gauge couplings bind the quarks into particles which may possibly interact with each 
other relatively weakly. At low energies these particles should evidently comprise the pseudoscalar 
meson fields (pions when restricted to two “flavors”). Then it is necessary to formulate some 
effective Lagrangian for the pions. Certainly the Lagrangian should be restricted by the correct 
symmetries of the underlying gauge theory. These must include an (approximate) $SU(N_f)$ flavor
symmetry [7], where $N_f$ is the number of light flavors.

But there is another symmetry which plays a crucial role. At about the same time that Skyrme 
was contemplating the model under discussion the correct formulation [8] of the structure of the 
effective weak (beta-decay etc.) interaction was discovered. It was noted that this interaction 
treated the left and right handed components of fermions on a completely separate basis. If this 
distinction is maintained at the level of the strong interactions one should impose a “chiral” left 
handed SU(3) flavor × right handed SU(3) flavor symmetry on the effective low energy Lagrangian
of mesons. A consequence of this larger symmetry is that the meson multiplets must contain scalar 
as well as the desired low-lying pseudoscalar particles. This seemed a bit of an embarrassment in 
that the pseudoscalars are very light while the scalars were not well established and presumably 
heavy. Hence the possibility of a degenerate symmetry multiplet is implausible. Nevertheless 
it was realized [9] that the situation was likely to be similar to that met in the BCS theory of 
superconductivity in which the vacuum (ground state) is energetically favored to exist in a non- 
symmetric state. This “spontaneous breakdown” picture predicts, in the absence of a needed small 
explicit symmetry breaker, exactly zero mass for the pseudoscalars at the same time that the 
scalars are massive. In fact it may be formulated using a so called “non-linear realization of chiral 
symmetry” in such a way that the scalars do not appear at all [10]. The prototype Lagrangian 
density for this picture is 

$$\mathcal{L} = \frac{f_\pi^2}{4}\,\mathrm{tr}\left(\partial_\mu U\,\partial^\mu U^\dagger\right) + \ldots ,$$ (1)

where $U = \exp(i\sqrt{2}\,\phi/f_\pi)$, $\phi$ being the 3×3 matrix of the ordinary pseudoscalar mesons and
$f_\pi = 93$ MeV the “pion decay constant”. $U$ is a unitary matrix which transforms “linearly” under
the chiral transformations. Possible higher derivative and symmetry breaking terms have not 
been explicitly written here. It was demonstrated a long time ago that just this term compactly 
summarizes the low energy scattering of pseudoscalar mesons [11]. Improvements to this term 
form the basis of the “chiral perturbation theory approach” [12]. Now it is believed that a picture 
like this is expected from fundamental QCD. However the same Lagrangian was earlier written by 
Skyrme (in the two rather than three flavor case) in order to explain the nucleons [1] before the 
present justifications for it were known. 





Even in the framework of the chiral Lagrangian given above it would seem that there is no 
special a priori reason not to explicitly add baryons in a chiral symmetric manner rather than to 
build them out of the mesons. Indeed there have been many papers over very many years which 
do just this with reasonable phenomenological results [13]. Nevertheless there is an indication 
from fundamental QCD that the soliton treatment of the baryon is more natural. This arises from 
an attempt [14] to consider $1/N_c$, the inverse of the number of colors in the gauge theory, as a
possible expansion parameter for QCD which might be meaningful even at low energies. In this
approach the product $g^2 N_c$, where g is the gauge theory coupling constant, is held constant.
’t Hooft [14] showed that for large $N_c$ QCD may be considered as a theory of mesons weakly
interacting in the sense that scattering amplitudes or quadrilinear coupling constants are of order
$g_{\rm eff} = 1/N_c$. Since the baryon mass must start out proportional to $N_c$ (noting that the baryon in
the $N_c$ model is made of $N_c$ quarks) it means that the predicted expressions for baryon masses
should start out as the inverse of this coupling constant. In the framework of the (non-relativistic)
mean field treatment Witten [15] not only pointed out that the baryon masses indeed grow linearly
with $N_c$ but also that baryon radii and meson-baryon scattering amplitudes are of the order $N_c^0$
while baryon-baryon scattering is of $O(N_c)$. He in particular recognized that this inverse behavior
with $g_{\rm eff}$ is just the usual signal that the baryon state in question is a soliton of the effective meson
theory. 

Naturally, one wonders how these “modern” justifications for the Skyrme approach relate to 
Skyrme’s original motivations. We are fortunate in having available a reconstructed talk on just 
this topic by Skyrme [16]. He mentioned three motivations: 1) The idea of unifying bosons and
fermions in a common framework. 2) The feeling that point particles are inconsistent in the sense 
that their quantum field theory formulation introduces infinities which are only “swept under the 
rug” by the renormalization process. 3) The desire to eliminate fermions from a fundamental 
formulation since fermions have no simple classical analog. What seems more fascinating is his 
awareness that there were probably some “hidden” influences pushing him toward the soliton 
picture. Directly, these came from his fascination with Kelvin’s idea that the various atoms should 
correspond to vortices of different connectivities in some underlying liquid. In turn, his interest 
in Kelvin was sparked at an early age by the presence of a tide prediction machine, designed by 
Kelvin and constructed by his great-grandfather, still occupying space in his great-grandfather’s 
house. An interesting account of this aspect is given in a paper of Dalitz [17]. 

Thus it seems that Skyrme’s motivations were not those currently used to justify his model. 
In particular it appears that he did not choose his Lagrangian model to describe spontaneous 
breakdown of chiral symmetry. Rather the non-linear form was chosen to ensure that the pions
were “angular” variables which would give multi-valued functions; the crossing of different sheets
of these functions might then correspond to singularities which would realize the baryons. The
evident “moral” of this historical discussion is just that interesting ideas have an uncanny way of 
turning out to be useful and true. In this spirit, we would like to continue with the application 
of Skyrme’s ideas to current research on baryon physics, making use of current motivations but 
trying to avoid getting enslaved by them. 

2 The Skyrme model for two flavors 

In this section we will present the basic technology of the Skyrme model for baryons. The starting 
point for the construction of a soliton solution is the non-linear sigma model Lagrangian (1) already 
introduced in the previous section. As we require a finite energy density the chiral field U must 
approach a constant value at spatial infinity. We are free to choose this to be unity, i.e.,

$$U(\vec r) \;\longrightarrow\; 1 \quad\text{for}\quad |\vec r| \to \infty .$$ (2)

This can be considered a mapping from compactified coordinate space, a three-sphere $S^3$, to the
space which is described by the unitary, unimodular matrix $U$, namely $SU(N_f)$, where $N_f$ denotes
the number of flavors. In the case of two flavors the target space is isomorphic to $S^3$. The mappings
$S^3 \to S^3$ fall into distinct equivalence classes. This signals the existence of soliton configurations
because members of different classes cannot be continuously transformed into one another. The
equivalence classes are characterized by the winding number. This number counts the coverings of
the target space and is the charge $\int d^3r\,B_0$ associated with the topological current

$$B_\mu = \frac{1}{24\pi^2}\,\epsilon_{\mu\nu\rho\sigma}\,\mathrm{tr}\left[(U^\dagger\partial^\nu U)(U^\dagger\partial^\rho U)(U^\dagger\partial^\sigma U)\right] .$$ (3)

When later discussing the three flavor case we will see that this topological current indeed equals 
the baryon number current. 

Although these topological considerations allow the existence of soliton solutions it turns out 
that the dynamics of (1) do not lead to static stable classical solutions. This can be deduced from 
a simple consideration, known as Derrick’s theorem [18]. Assume $U_0(\vec r)$ to be such a solution. The
static energy of $U_0(\lambda\vec r)$, obtained from the Hamiltonian of (1), would then be

$$E^{(2)}[U_0(\lambda\vec r)] = \frac{1}{\lambda}\,E^{(2)}[U_0(\vec r)] ,$$ (4)

which does not have a minimum at $\lambda = 1$, in contradiction to the assumption. In order to obtain
stable solitons Skyrme added a term to the Lagrangian which is of fourth order in the derivatives,

$$\mathcal{L}_{\rm Sk} = \frac{1}{32e^2}\,\mathrm{tr}\left(\left[U^\dagger\partial_\mu U, U^\dagger\partial_\nu U\right]\left[U^\dagger\partial^\mu U, U^\dagger\partial^\nu U\right]\right) .$$ (5)

Here e is the dimensionless “Skyrme constant”. Although this term is quartic in the derivatives
it was arranged to be at most quadratic in the time-derivatives. This makes the quantization
feasible. It is now apparent that a scaled configuration may well lead to a minimum of the energy
functional

$$E[U_0(\lambda\vec r)] = \frac{1}{\lambda}\,E^{(2)}[U_0(\vec r)] + \lambda\,E^{(4)}[U_0(\vec r)]$$ (6)

at $\lambda = 1$ provided the configuration $U_0(\vec r)$ satisfies $E^{(2)}[U_0(\vec r)] = E^{(4)}[U_0(\vec r)]$.
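
The scaling argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the sample values of the two energy contributions are hypothetical, and the quadratic and quartic pieces are simply treated as two positive numbers.

```python
import math

def scaled_energy(lam, e2, e4):
    """Energy of the rescaled configuration U0(lam * r): e2 / lam + lam * e4."""
    return e2 / lam + lam * e4

def stationary_scale(e2, e4):
    """Scale at which e2 / lam + lam * e4 is minimal (requires e4 > 0)."""
    return math.sqrt(e2 / e4)

# Hypothetical sample values (arbitrary units) for the two contributions.
for e2, e4 in [(1.0, 1.0), (2.0, 0.5)]:
    lam_star = stationary_scale(e2, e4)
    print(e2, e4, lam_star, scaled_energy(lam_star, e2, e4))

# Without the quartic term (e4 = 0) the energy e2 / lam only decreases as lam
# grows, so there is no minimum at lam = 1; this is the content of the
# Derrick argument for the quadratic term alone.
print(scaled_energy(1.0, 1.0, 0.0), scaled_energy(2.0, 1.0, 0.0))
```

The minimum sits at $\lambda = 1$ exactly when the two contributions are equal, which is the condition stated above.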

A priori the Euler-Lagrange equations of motion for the chiral field $U_0(\vec r)$ are highly non-linear
partial differential equations. To simplify these equations Skyrme adopted the famous hedgehog
ansatz

$$U_0(\vec r) = \exp\left(i\,\vec\tau\cdot\hat r\,F(r)\right) ,$$ (7)

where $\vec\tau$ represents the Pauli matrices. This form may actually be traced back to the old “strong
coupling” theory [19]. Upon substitution of this ansatz the energy functional turns into a simple
integral involving only the radial function F(r),

$$E[F] = \frac{2\pi f_\pi}{e}\int_0^\infty dx\,\left[\left(x^2F'^2 + 2\sin^2F\right) + \sin^2F\left(2F'^2 + \frac{\sin^2F}{x^2}\right)\right] .$$ (8)

Henceforth this radial function will be called the chiral angle. In eq (8) a prime indicates a
derivative with respect to the dimensionless coordinate $x = e f_\pi r$. In this manner we have
completely extracted the dependence on the model parameters. Imposing $F(\infty) = 0$ and noting that
$\int d^3r\,B_0 = (F(0) - F(\infty))/\pi$ leads to the boundary condition $F(0) = \pi$ for a unit baryon number
configuration. The profile function depicted in Fig. 2.1 minimizes (8) and is obtained numerically.
The energy obtained by substituting this solution into (8) is found to be $E = 23.2\pi f_\pi/e$ [6].
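
As a rough numerical illustration of eq (8), the sketch below evaluates the dimensionless integral for a trial chiral angle instead of solving the full minimization problem. The trial form $F(x) = \pi\,(1 - x/\sqrt{x^2 + \lambda^2})$, the scan over the width $\lambda$, the cutoff and the grid size are all assumptions made for this example; they are not the numerical method behind the quoted result. Any such trial profile with $F(0) = \pi$ and $F(\infty) = 0$ can only give an upper bound on the minimum, which corresponds to an integral of 11.6 in units of $2\pi f_\pi/e$.

```python
import math

def profile(x, lam):
    """Trial chiral angle with F(0) = pi and F(infinity) = 0 (an assumed form)."""
    return math.pi * (1.0 - x / math.sqrt(x * x + lam * lam))

def profile_prime(x, lam):
    """Derivative dF/dx of the trial chiral angle."""
    return -math.pi * lam * lam / (x * x + lam * lam) ** 1.5

def energy_integral(lam, x_max=40.0, n=4000):
    """Midpoint-rule estimate of the integral in eq (8), in units of 2*pi*f_pi/e."""
    h = x_max / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        f, fp = profile(x, lam), profile_prime(x, lam)
        s2 = math.sin(f) ** 2
        total += (x * x * fp * fp + 2.0 * s2) + s2 * (2.0 * fp * fp + s2 / (x * x))
    return total * h

def baryon_number(lam, x_max=40.0, n=4000):
    """Winding number -(2/pi) * integral of F' sin^2(F) for the hedgehog."""
    h = x_max / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += profile_prime(x, lam) * math.sin(profile(x, lam)) ** 2
    return -2.0 / math.pi * total * h

# Scan the width parameter and keep the value with the lowest energy.
lams = [0.8 + 0.05 * i for i in range(30)]
best_lam = min(lams, key=energy_integral)
best = energy_integral(best_lam)
print(best_lam, best, baryon_number(best_lam))
```

The winding number comes out close to one for any width, as expected from the boundary conditions, while the energy depends on the width and is bounded from below by the numerically determined minimum.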






As the ansatz (7) is not invariant under separate spatial or flavor rotations this field configu- 
ration does not yet describe states with good spin and flavor quantum numbers. As the first step 
towards generating such states, time-dependent collective coordinates A{t) are introduced which 
describe the spin and flavor orientation of the hedgehog, 

$$U(\vec r,t) = A(t)\,U_0(\vec r)\,A^\dagger(t) , \qquad A(t) \in SU(N_f) .$$ (9)


Note that the hedgehog structure causes rotations in coordinate and flavor space to be equivalent. 
For generality we assumed an arbitrary number of flavors. In the two flavor case this configuration 
yields the Lagrange function 

$$L(A,\dot A) = \frac{1}{2}\,\alpha^2\,\vec\Omega^{\,2} - E_{\rm cl} ,$$ (10)

where the quantity $\vec\Omega$ measures the time dependence of the collective coordinates

$$A^\dagger(t)\,\frac{dA(t)}{dt} = \frac{i}{2}\,\vec\tau\cdot\vec\Omega .$$ (11)

The constant of proportionality, $\alpha^2$, is computed as a spatial integral over the chiral angle to be
$\alpha^2 = 53.3/(e^3 f_\pi)$. Computing the spin as the Noether charge of spatial rotations yields $\vec J = \alpha^2\vec\Omega$.
Apparently the system displays all the features of a rigid top. In that language $\vec\Omega$ and $\alpha^2$ are
denoted as the angular velocity and the moment of inertia, respectively.

In the second step the collective coordinates are elevated to quantum variables. Again this is 
completely analogous to the quantization of the rigid top and gives the quantization rule
$[J_i, J_j] = i\epsilon_{ijk}J_k$. For the hedgehog ansatz in SU(2) spin and isospin are related via the adjoint representation
of the collective coordinates, i.e. $I_i = -D_{ij}J_j$ with $D_{ij} = (1/2)\,\mathrm{tr}(\tau_i A\tau_j A^\dagger)$ due to the equivalence
of the respective rotations. Hence only states which have identical spin and isospin are allowed in
the spectrum. These are the nucleon ($I = J = 1/2$) and the Δ-resonance ($I = J = 3/2$). Finally
the Hamiltonian for the collective coordinates is given by

$$H_{\rm coll} = E_{\rm cl} + \frac{\vec J^{\,2}}{2\alpha^2} ,$$ (12)

which yields the Δ-nucleon mass difference

$$M_\Delta - M_N = \frac{3}{2\alpha^2} .$$ (13)

Using the physical value for the pion decay constant, $f_\pi = 93$ MeV, requires us to choose $e \approx 4.75$
to reproduce the empirical mass difference of 293 MeV. Substituting $e \approx 4.75$ into eq (8) yields the
classical nucleon energy $E = 23.2\pi f_\pi/e \approx 1430$ MeV. This is not in especially good agreement with
the experimental value of about 939 MeV. However the following points must be kept in mind.






(i) The meson Lagrangian consisting of eq (1) plus eq (5) contains only pseudoscalars. We would 
expect that other low mass mesons (notably the vectors) should also be included. The large 
$N_c$ expansion [14, 15] requires an infinite number but common sense suggests a reasonable
approximation for explaining hadronic physics up to about 1 GeV would be to keep those
mesons with masses up to this value. Certainly the predictions in the mesonic sector of the 
theory are noticeably improved by the inclusion of vector mesons. The consistency of the 
overall picture requires accurate predictions both in the mesonic and baryonic sectors of the 
effective theory. 

(ii) In nature there are three rather than two “light” flavors and this aspect should be included in 
a realistic formulation. (This feature also makes more transparent the origin of the topological 
current eq (3).) Furthermore the effects of flavor and chiral symmetry breaking mediated by 
the finite values of the quark masses have not yet been taken into account. 

(iii) Order $N_c^0$ corrections to the nucleon mass which have the structure of the Casimir effect
in field theory have also not been included. These quantum contributions to the energy have
been estimated to be negative and of the order of a few hundred MeV, predicting a total
nucleon mass at the order of the experimental value [20]. Nevertheless one should be cautious
about these quantum corrections; after all, the Skyrme model is not renormalizable, leaving a
logarithmic scale dependence of the “renormalized” Casimir energy. It seems that at best the 
quantum corrections can be computed in a scenario compatible with the chiral expansion. 
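
The numbers quoted before the list above follow from simple arithmetic with the collective Hamiltonian (12) and the mass difference (13). The sketch below is only a consistency check under stated assumptions: it uses the rounded coefficients 23.2 and 53.3 as they appear in the text, and the function names are hypothetical. Because of the rounding, the results agree with the quoted values only at the level of a few percent.

```python
import math

F_PI = 93.0           # pion decay constant in MeV, as used in the text
ENERGY_COEFF = 23.2   # classical energy E = 23.2 * pi * f_pi / e
INERTIA_COEFF = 53.3  # moment of inertia alpha^2 = 53.3 / (e^3 * f_pi)

def classical_energy(e):
    """Classical soliton energy in MeV for Skyrme constant e."""
    return ENERGY_COEFF * math.pi * F_PI / e

def moment_of_inertia(e):
    """Moment of inertia alpha^2 in 1/MeV for Skyrme constant e."""
    return INERTIA_COEFF / (e ** 3 * F_PI)

def collective_mass(e, spin):
    """Mass from eq (12): E_cl + J(J+1) / (2 alpha^2), with I = J."""
    return classical_energy(e) + spin * (spin + 1.0) / (2.0 * moment_of_inertia(e))

def skyrme_constant_from_splitting(delta_m):
    """Solve 3 / (2 alpha^2) = delta_m (in MeV) for the Skyrme constant e."""
    return (delta_m * 2.0 * INERTIA_COEFF / (3.0 * F_PI)) ** (1.0 / 3.0)

e_fit = skyrme_constant_from_splitting(293.0)
print("e from the Delta-nucleon splitting:", e_fit)
print("classical energy at e = 4.75 (MeV):", classical_energy(4.75))
print("splitting at e = 4.75 (MeV):",
      collective_mass(4.75, 1.5) - collective_mass(4.75, 0.5))
```
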

We will postpone the discussion of a variety of nucleon (and other baryons’) properties until 
after we have treated the more general case of flavor SU(3).

To end this section on the basics of the Skyrme model we would like to briefly discuss the 
consistency of the Skyrme model with the large Nc picture of QCD. In section 1 we have already 
noted that the quadrilinear coupling between mesons scales like 1/Nc- To check this behavior it 
is convenient to expand the non-linear σ model Lagrangian (1) in powers of the pion field:

$$\mathcal{L} = \frac{1}{2}\,\partial_\mu\vec\pi\cdot\partial^\mu\vec\pi + \frac{1}{6f_\pi^2}\left[(\vec\pi\cdot\partial_\mu\vec\pi)^2 - \vec\pi^{\,2}\,\partial_\mu\vec\pi\cdot\partial^\mu\vec\pi\right] + O(\pi^6) .$$ (14)

Since the quadrilinear coupling constant is $1/f_\pi^2$ we deduce that $f_\pi \sim \sqrt{N_c}$. This agrees with
general arguments [15]. Similarly the Skyrme term (5) provides a quartic pion interaction with
the coupling constant $1/(e^2 f_\pi^4)$ which implies $e \sim 1/\sqrt{N_c}$. Hence the classical energy (8) grows
linearly with the number of colors as asserted from the corresponding generalization of QCD.
Moreover, without flavor symmetry breaking large $N_c$ QCD predicts the baryons of different J to
be degenerate [21]. This is perfectly consistent with the mass formula (12) because the moment
of inertia also grows linearly with $N_c$ as is indicated after eq (11), hence the second term in (12)
behaves like $1/N_c$.
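
The power counting in this paragraph can be made explicit by tracking exponents of $N_c$. The following sketch assumes only the two scalings stated above, $f_\pi \sim N_c^{1/2}$ and $e \sim N_c^{-1/2}$; the dictionary keys are descriptive labels chosen for this example.

```python
from fractions import Fraction

# Assumed large-N_c scaling exponents: f_pi ~ N_c^(1/2), e ~ N_c^(-1/2).
P_F_PI = Fraction(1, 2)
P_E = Fraction(-1, 2)

# Exponent of N_c for each quantity, obtained by adding exponents of factors.
exponents = {
    "four-pion coupling 1/f_pi^2": -2 * P_F_PI,
    "Skyrme-term coupling 1/(e^2 f_pi^4)": -2 * P_E - 4 * P_F_PI,
    "classical energy f_pi/e": P_F_PI - P_E,
    "moment of inertia 1/(e^3 f_pi)": -3 * P_E - P_F_PI,
}
# The rotational energy in eq (12) scales like the inverse moment of inertia.
exponents["rotational energy"] = -exponents["moment of inertia 1/(e^3 f_pi)"]

for name, power in exponents.items():
    print(f"{name}: N_c^({power})")
```

Both meson couplings come out of order $1/N_c$, the classical energy and the moment of inertia of order $N_c$, and the rotational splitting of order $1/N_c$, in line with the statements above.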

Actually, the understanding of the Nc expansion for baryons involves some subtleties. Consider 
the construction of large Nc baryons in the quark model. The lowest lying baryons are made of Nc 
(taken to be odd) quarks in a totally antisymmetric (i.e. singlet) color spin state with no orbital 
angular momentum. One expects particles of all total angular momenta from J = 1/2 to J = Nc/2 
to be obtained. In agreement with the spectrum of eq (12) we expect $I = J$ for these particles and
an infinite number of them as $N_c \to \infty$. The trouble is that there is no experimental evidence for
any $I = J = 5/2, 7/2$ etc. particles. This may be interpreted as evidence that $N_c = 3$ in nature.
Still, the large Nc expansion is useful if one computes a quantity which exists in the Nc = 3 
theory as a (presumably quickly convergent) Taylor series in 1/Nc. For the specific case of the 
higher excitations J = 5/2, 7/2, . . . the above treatment of the rotational modes seems inadequate 
because the rotational energy gets as large as the classical contribution. By including these modes 
in the Euler-Lagrange equations the widths of these higher excitations have been estimated to be 
comparable to their masses [22]. This makes a particle interpretation of these states problematic 
suggesting that they are artifacts of the collective quantization method employed rather than of 
physical relevance. Possible caveats for these calculations are the instability of these configurations 
against emitting pions [23] and that the results are only obtained by analytical continuation in the 
spin variable. 





3 Chiral symmetry and its breaking 

In this section we will briefly discuss the concept of chiral symmetry which represents a guiding 
principle for extending the Skyrme model. Attention will be limited to those aspects of this large 
subject which have direct relevance to the study of Skyrmions. The basic idea is to construct a 
model of meson fields which “mocks up” as many symmetries and properties of the fundamental 
QCD Lagrangian as possible. 

3.1 The QCD Lagrangian 

Let us first recall the matter piece of the QCD Lagrangian 

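$$\mathcal{L}^{\rm matter}_{\rm QCD} = \sum_{f=1}^{N_f}\left\{\bar q_{f,L}\,\gamma^\mu\left(i\partial_\mu + gA_\mu\right)q_{f,L} + \bar q_{f,R}\,\gamma^\mu\left(i\partial_\mu + gA_\mu\right)q_{f,R} - m_f\left(\bar q_{f,L}\,q_{f,R} + \bar q_{f,R}\,q_{f,L}\right)\right\} .$$ (15)

Here $A_\mu$ is the matrix representation of the gluon fields and g the quark-gluon coupling. Most
notably we have introduced the chiral representation for the QCD current quarks

$$q_{L,R} = \frac{1}{2}\left(1 \mp \gamma_5\right) q$$ (16)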

of each flavor. 

Strictly speaking, the quark mass terms are not part of the QCD Lagrangian but arise from 
the Yukawa terms of the full microscopic theory of nature. A major unsolved problem is to 
understand the resulting pattern of quark mass parameters. The phenomenologically determined 
masses [24] are $m_u \approx 5$ MeV, $m_d \approx 9$ MeV, $m_s \approx 120$-$170$ MeV, $m_c \approx 1.5$ GeV, $m_b \approx 4.5$ GeV and
$m_t \approx 175$ GeV. This random looking perturbation of the “strong interaction” plays a crucial role
in determining the nature of elementary particle physics. In the region up to about 1 GeV it is not
possible to produce particles containing c, b or t quarks. Then it is usually a good approximation
to simply drop them from the theory. Other approximations are useful when dealing with the
subspace carrying the flavor quantum number of a single “heavy” quark [25]. Furthermore in the
sector of the three “light” quarks u, d, s it turns out to be fundamental to neglect the u, d, s masses
as a first approximation and include their effects as a perturbation [26]. This is reasonable because
the light masses are less than the quantity $\Lambda_{\rm QCD} \approx 250$ MeV, the scale below which the QCD
effective coupling gets extremely large. 

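In the case $m_f = 0$ the Lagrangian (15) specialized to the three light quarks has the global
chiral symmetry

$$U_L(3)\times U_R(3):\quad q_L \to L\,q_L \quad\text{and}\quad q_R \to R\,q_R \quad\text{with}\quad q_{L,R} = \begin{pmatrix} q_u \\ q_d \\ q_s \end{pmatrix}_{L,R} ,$$ (17)

where L and R are each 3×3 unitary matrices. Using Noether’s theorem on the classical Lagrangian
then yields the conservation of the eighteen vector and axial vector currents

$$J_{\mu,ij} = \bar q_i\gamma_\mu q_j \quad\text{and}\quad J_{5\mu,ij} = \bar q_i\gamma_\mu\gamma_5 q_j ,$$ (18)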

where the latin indices run over u,d,s. These currents play an important role in the theory of 
weak interactions. 

Now a major discovery of quantum field theory is that consequences of the classical field equa- 
tions of motion (which can be used to verify the conservation of the Noether currents) do not 
necessarily hold at the quantum level. It is necessary to consider whether there exists a suitable 
regularization of the divergent diagrams of the theory which maintains the classical relations. In 
the present case, the axial singlet current¹ $J^0_{5\mu} = \bar q\gamma_\mu\gamma_5(\lambda^0/2)q$ is not conserved even for massless
quarks. Rather its divergence is proportional to the gluon field tensor times its dual. This is a result
of the well-known Adler-Bell-Jackiw (ABJ) triangle anomaly [27] contained in the loop-diagrams
shown in figure 1.

¹Here $\lambda^a$, $a = 1, \ldots, N_f^2 - 1$ denote the Gell-Mann matrices of $SU(N_f)$ while $\lambda^0$ refers to the singlet generator.



Figure 1: Adler-Bell-Jackiw anomaly. The arrows indicate quark lines, the curly lines refer to the 
gauge bosons and the dashed lines denote the coupling of the axial singlet current $J^0_{5\mu}$.


The net result is that the true global symmetry of the massless quantum theory is not $U_L(3)\times U_R(3)$
but $U_V(1)\times SU_L(3)\times SU_R(3)$. The singlet vector symmetry, $U_V(1)$, corresponds to baryon
number conservation. 

A similar situation emerges when external c-number flavor gauge fields are added to the massless 
QCD Lagrangian in order to further probe its structure. Then the so-called non-Abelian anomaly 
yields non-zero covariant derivatives of the $SU_L(3)\times SU_R(3)$ currents proportional to certain
combinations of the corresponding external gauge fields [28]. The non-Abelian anomaly will be 
noted to have important consequences for the theory of Skyrmions. 

Although the true symmetry of massless three flavor QCD is $SU_L(3)\times SU_R(3)\times U_V(1)$, the
resulting symmetry of the physical states of the theory is further reduced to $SU_V(3)\times U_V(1)$ by
the “spontaneous breakdown” mechanism. In this mechanism the vacuum state is not invariant
under the full symmetry group. The massless QCD vacuum is characterized by a non-vanishing
“condensate” $\langle\bar q_u q_u + \bar q_d q_d + \bar q_s q_s\rangle \neq 0$. Under an infinitesimal chiral transformation
$L = 1 + i\sum_{a=0}^{8}\epsilon_L^a\lambda^a/2$, $R = 1 + i\sum_{a=0}^{8}\epsilon_R^a\lambda^a/2$ the variation of this quark-bilinear is found to be

$$\delta(\bar q q) = i\sum_a\left(\epsilon_L^a - \epsilon_R^a\right)\left(\bar q_R\,\frac{\lambda^a}{2}\,q_L - \bar q_L\,\frac{\lambda^a}{2}\,q_R\right) = -\sum_a\left(\epsilon_L^a - \epsilon_R^a\right)\bar q\,\frac{\lambda^a}{2}\,i\gamma_5\,q .$$ (19)

Clearly the condensate is invariant only for the subgroup $L = R$ in eq (17) which is a vector
type transformation. This explains the physical $SU_V(3)\times U_V(1)$ invariance. Note that the right
hand side of eq (19) represents pseudoscalar objects. These are “zero mode” fluctuations of the 
above vacuum configuration and the corresponding massless pseudoscalar particles are designated 
Nambu-Goldstone bosons. Their scalar chiral “partners” - which would be degenerate in mass 
were it not for the spontaneous symmetry breakdown - are not constrained to be massless. This 
splitting of low-lying pseudoscalars and scalars expected from massless QCD seems in qualitative 
agreement with the experimental situation. 

The $SU_V(3)\times U_V(1)$ invariance (so-called “eightfold way”) of massless QCD is, of course,
further broken when the effects of non-zero quark mass terms are included. For later purposes it
is convenient to rewrite the quark mass terms as:

$$\mathcal{L}^{\rm mass}_{\rm QCD} = -\frac{m_u + m_d}{2}\,\bar q\,\mathcal{M}\,q \quad\text{with}\quad \mathcal{M} = \frac{2+x}{3}\,\mathbf{1} + y\,\lambda_3 + \frac{1-x}{\sqrt{3}}\,\lambda_8 .$$ (20)

Here characteristic quark mass ratios are defined by

$$x = \frac{2m_s}{m_u + m_d} \quad\text{and}\quad y = \frac{m_u - m_d}{m_u + m_d} .$$ (21)

In the limit $y = 0$, the theory possesses $SU_I(2)$ or isospin invariance.
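
A quick way to see that the parametrization (20) with the ratios (21) reproduces the individual quark masses is to evaluate the diagonal entries directly. The sketch below is a plain arithmetic check; the function name and the sample masses (in MeV) are chosen for illustration only.

```python
import math

def mass_matrix_diagonal(m_u, m_d, m_s):
    """Diagonal of (m_u + m_d)/2 * M, with M as parametrized in eq (20)."""
    x = 2.0 * m_s / (m_u + m_d)
    y = (m_u - m_d) / (m_u + m_d)
    # Diagonal entries of the Gell-Mann matrices lambda_3 and lambda_8.
    lambda3 = (1.0, -1.0, 0.0)
    lambda8 = tuple(v / math.sqrt(3.0) for v in (1.0, 1.0, -2.0))
    m_hat = [
        (2.0 + x) / 3.0 + y * l3 + (1.0 - x) / math.sqrt(3.0) * l8
        for l3, l8 in zip(lambda3, lambda8)
    ]
    prefactor = (m_u + m_d) / 2.0
    return [prefactor * entry for entry in m_hat]

# Sample input masses in MeV (illustrative values only).
print(mass_matrix_diagonal(5.0, 9.0, 150.0))
```

The three returned entries coincide with the input masses $m_u$, $m_d$, $m_s$ up to rounding, so the matrix $\mathcal{M}$ is just the quark mass matrix in units of $(m_u + m_d)/2$.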


3.2 Effective Lagrangian of pseudoscalars 

The simplest way to mock up low energy QCD is to employ a 3×3 matrix field $M_{ij}$ which
transforms under the chiral group in the same way as the bilinear quark combination $\bar q_{jR}\,q_{iL}$. It
has the decomposition $M = S + iP$ into hermitian scalar and pseudoscalar components. A chirally
invariant Lagrangian is

$$\mathcal{L} = \frac{1}{2}\,\mathrm{tr}\left(\partial_\mu M\,\partial^\mu M^\dagger\right) - V(M, M^\dagger) ,$$ (22)

where the potential V is a function of invariants like $\mathrm{tr}(MM^\dagger)$, $\mathrm{tr}(MM^\dagger MM^\dagger)$ etc. Spontaneous
breakdown to $SU_V(3)$ is implemented by choosing V to have a minimum such that $\langle M\rangle = {\rm const}\times\mathbf{1}$.
Then the scalar fields S become massive and can be “integrated out” by imposing a chirally
invariant constraint [10]. This represents a transition from the linear to the non-linear sigma
model. Formally we may use the “polar decomposition” of the matrix M into unitary and hermitian
factors $M = HU$. Setting $H = {\rm const}\times\mathbf{1}$ then results in eq (1) again.

In principle, a scenario of this sort can be derived by adding a term like $\bar q_L M q_R$ + h.c. to
the QCD Lagrangian and then integrating out the quark fields². As a result one is left with a
complicated action functional for M and, by further eliminating H as above, for U. Note that U
inherits the chiral transformation property of the quark bilinear:

$$U \to L\,U\,R^\dagger .$$ (23)

It is this identification of the transformation properties which provides the important link of the 
effective chiral theory to QCD since it in particular implies that the (Noether) currents must be 
identified. In turn, matrix elements of these currents will yield the static properties of hadrons. 

Of course, in the chiral limit we demand the effective meson theory to be strictly invariant 
under the transformation (23). Since $UU^\dagger = 1$ only derivative terms can appear and the leading
one is just the non-linear σ model (1). Clearly also the Skyrme term (5) is invariant under (23).

At this point one essential ingredient is still missing to ensure that the chiral field, $U = \exp(i\Phi)$,
describes pseudoscalar fields. This requirement demands the parity transformation

$$\Phi \;\xrightarrow{P}\; -\Phi , \quad\text{i.e.}\quad U \;\xrightarrow{P}\; U^\dagger .$$ (24)

However, it is straightforward to verify that the Skyrme model Lagrangian is invariant under (24)
and $\vec r \to -\vec r$ separately. On the level of the equations of motion we can easily break this unwanted
extra symmetry by adding a term which contains the Levi-Civita tensor. With $\alpha_\mu = (\partial_\mu U)U^\dagger$ we
write

$$\frac{f_\pi^2}{2}\,\partial_\mu\alpha^\mu + \ldots + 5\lambda\,\epsilon_{\mu\nu\rho\sigma}\,\alpha^\mu\alpha^\nu\alpha^\rho\alpha^\sigma = 0 ,$$ (25)

where the ellipsis refers to the contributions from the Skyrme term (5) which have the same
symmetries as those from the non-linear σ term (1). Unfortunately, the additional term cannot
be easily incorporated in the effective Lagrangian since it does not correspond to the variation of
a local term. Witten suggested [30] to include it at the level of the action because the variation of

$$\Gamma = \lambda\int_{M_5} d^5x\;\epsilon^{\mu\nu\rho\sigma\tau}\,\mathrm{tr}\left(\alpha_\mu\alpha_\nu\alpha_\rho\alpha_\sigma\alpha_\tau\right)$$ (26)

and the use of Stokes’ theorem yields the desired term in the equation of motion (25) provided the
boundary of the five dimensional manifold, $M_5$, is taken to be Minkowski space, i.e. $\partial M_5 = M_4$.
The choice of $M_5$ is not unique because its complement has the same boundary. In order to
nevertheless have a unique action the constant $\lambda = -in/(240\pi^2)$ must be quantized³, i.e. n is an integer. It is
interesting to study the physical relevance of (26). Expanding in the meson fields Φ and employing
again Stokes’ theorem reveals that it describes processes with at least five different pseudoscalars⁴
like $K^+K^- \to \pi^+\pi^-\pi^0$. As such processes were first discussed by Wess and Zumino [31] who
essentially found a power series expression for (26), the term is commonly named after them⁵.

²In the case of QCD this seems to be impractical. However, the simpler Nambu-Jona-Lasinio [9] model for
the quark flavor dynamics nicely exemplifies how a meson functional can be constructed by integrating out quark
degrees of freedom [29].

There are further important consequences of the Wess-Zumino term (26) which can be read off 
after generalizing it so its variation with respect to external (electro-weak) gauge transformations 
[30, 32] yields the non-Abelian anomaly [28]. After appropriately including the corresponding 
gauge boson fields two striking features are observed: 

(i) A contact interaction for the decay $\pi^0 \to \gamma\gamma$ is contained in the gauged Wess-Zumino action.
On the quark level this process is described by the ABJ anomaly involving the diagrams of
figure 1 with the external lines representing photons. Identifying that result with the
Wess-Zumino term requires setting $n = N_c$, i.e. the Wess-Zumino term is proportional to the
number of colors.

(ii) The linear coupling to the $U_V(1)$ gauge boson represents the baryon number current. Indeed
it turns out that this current is identical to the topological current in eq (3).

The mocking up of the effects of the $U_A(1)$ anomaly in QCD involves the SU(3) singlet pseudoscalar
particle $\eta'$ and will be discussed later when we treat the proton spin puzzle.

We must also take account of the effects of the finite quark mass terms (20). These transform
according to the chiral $SU_L(3)\times SU_R(3)$ representation: $3\times 3^* + 3^*\times 3$. Note that the matrix
$\mathcal{M} = \mathcal{M}^\dagger$ (neglecting the possibility of strong CP violation) may be considered a “spurion” for this
transformation property. Then the minimal symmetry breaking piece of the effective Lagrangian
reads

$$\mathcal{L}_{\rm SB} = \mathrm{tr}\left\{\mathcal{M}\left[-\beta'\left(\partial_\mu U\partial^\mu U^\dagger U + U^\dagger\partial_\mu U\partial^\mu U^\dagger\right) + \delta'\left(U + U^\dagger - 2\right)\right]\right\} ,$$ (27)

where $\beta'$ and $\delta'$ are two numerical parameters. The $\delta'$ term is required to split the pseudoscalar
meson masses while the $\beta'$ term is required to split the pseudoscalar “decay constants”. The decay
constants $f_a$ are defined from the axial vector matrix elements $\langle 0|J^a_{5\mu}|\phi_a(p)\rangle = i f_a p_\mu$.

Working in the isospin invariant limit, the parameters $\beta'$, $\delta'$ and x can then be extracted from
the knowledge of meson properties [33],

$$m_\pi^2 = \frac{4\delta'}{f_\pi^2} , \qquad m_K^2 = \frac{2\delta'}{f_K^2}\,(1+x) \qquad\text{and}\qquad \left(\frac{f_K}{f_\pi}\right)^2 = 1 + \frac{4\beta'}{f_\pi^2}\,(1-x) .$$ (28)

This represents the essential input when discussing the Skyrme model for three flavors. To sum 
up, the Lagrangian of only pseudoscalars which we shall use for discussing Skyrmions consists of 
the sum of (1), (5), (26) and (27).
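
To illustrate how the three relations in eq (28) fix the symmetry breaking parameters, the sketch below inverts them for given meson masses and decay constants. It assumes the relations in the form $m_\pi^2 = 4\delta'/f_\pi^2$, $m_K^2 = 2\delta'(1+x)/f_K^2$ and $(f_K/f_\pi)^2 = 1 + 4\beta'(1-x)/f_\pi^2$; the input values for $m_\pi$, $m_K$ and $f_K$ are illustrative assumptions rather than the inputs of ref. [33], and the function name is hypothetical.

```python
def extract_parameters(m_pi, m_k, f_pi, f_k):
    """Invert the three relations of eq (28) for (beta', delta', x).

    Masses and decay constants are in MeV, so delta' is in MeV^4
    and beta' in MeV^2.
    """
    delta = m_pi ** 2 * f_pi ** 2 / 4.0
    x = m_k ** 2 * f_k ** 2 / (2.0 * delta) - 1.0
    beta = f_pi ** 2 * ((f_k / f_pi) ** 2 - 1.0) / (4.0 * (1.0 - x))
    return beta, delta, x

# Illustrative inputs (assumed values, in MeV).
M_PI, M_K, F_PI, F_K = 138.0, 495.0, 93.0, 113.0
beta_p, delta_p, x_ratio = extract_parameters(M_PI, M_K, F_PI, F_K)
print("beta' =", beta_p, "MeV^2")
print("delta' =", delta_p, "MeV^4")
print("x =", x_ratio)
```

With inputs of this size the strange quark mass ratio x comes out much larger than one, and $\beta'$ is negative because $f_K > f_\pi$ while $1 - x < 0$.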

In the chiral perturbation theory approach [12] essentially the most general chirally invariant
Lagrangian is written down and ordered in powers of $\partial\partial \sim \mathcal{M}$. For example, the leading terms
are eq (1) and the $\delta'$ term of eq (27). The next-to-leading terms include:

$$\mathrm{tr}(\partial_\mu U\partial^\mu U^\dagger\partial_\nu U\partial^\nu U^\dagger), \quad \mathrm{tr}(\partial_\mu U\partial^\mu U^\dagger)\,\mathrm{tr}(\mathcal{M}(U+U^\dagger)), \quad \mathrm{tr}(\partial_\mu U\partial^\mu U^\dagger\mathcal{M}(U+U^\dagger)),$$
$$[\mathrm{tr}(\mathcal{M}(U+U^\dagger))]^2, \quad [\mathrm{tr}(\mathcal{M}(U-U^\dagger))]^2, \quad \mathrm{tr}(\mathcal{M}U^\dagger\mathcal{M}U^\dagger + \mathcal{M}U\mathcal{M}U),$$ (29)

each with its own coupling constant. Note that the combinations of terms which can not be
manipulated (by use of various matrix identities) to become a single trace are suppressed in the
$1/N_c$ expansion. This procedure also entails absorbing the divergent parts of loop corrections in
the coefficients of the listed terms. The result is a joint power series in energy and the quark
masses which can be expected to be very accurate quite near the ππ threshold in the case of
pion-pion scattering for example. But going higher in energy is very difficult in this scheme.
Furthermore it seems that many of the coefficients mainly simulate the low energy effects of vector
meson exchanges. We shall also work with a meson Lagrangian which includes the vector particles
directly. This may be thought of as a start on the approach of constructing the leading (Born)
term of the $1/N_c$ expansion, which should include mesons of all spins.

³The reader is referred to the literature [30] to see the analogy to Dirac’s quantization of the magnetic monopole.
⁴For that reason the term (26) vanishes in the case of two flavors which only has four different pseudoscalar
fields.
⁵Some authors refer to the Wess-Zumino term in its gauged form i.e. with external sources. Unless otherwise
noted we will always understand the Wess-Zumino term to be (26).

3.3 Effective Lagrangian of pseudoscalars and vectors 

We will follow the so-called massive Yang-Mills approach [32] for introducing the vector meson 
nonet into the Lagrangian of pseudoscalars in a chirally invariant manner. In this approach both 
vector and axial vector fields are formally introduced as gauge fields (yielding invariance under 
local chiral transformations). Then globally invariant mass-type terms which break the local chiral 
invariance are included. Finally, as in the transition from the linear to the non-linear sigma 
model discussed in section 3.2 above, the (heavier) axial vector mesons are eliminated by a chirally
invariant constraint. 

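We introduce two multiplets with spin one, $A^L_\mu$ and $A^R_\mu$, which we demand to transform under
(17) as left- and right-handed fields, respectively,

$$A^L_\mu \to L\left(A^L_\mu + \frac{i}{g}\partial_\mu\right)L^\dagger \quad\text{and}\quad A^R_\mu \to R\left(A^R_\mu + \frac{i}{g}\partial_\mu\right)R^\dagger . \tag{30}$$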

This allows us to define a covariant derivative for the chiral field and field tensors, 

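$$D_\mu U = \partial_\mu U - i g A^L_\mu U + i g U A^R_\mu , \tag{31}$$

$$F^{L,R}_{\mu\nu} = \partial_\mu A^{L,R}_\nu - \partial_\nu A^{L,R}_\mu - i g\left[A^{L,R}_\mu , A^{L,R}_\nu\right] , \tag{32}$$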

which transform homogeneously under (17). The chirally invariant terms with a minimal number 
of derivatives read 

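$${\rm tr}\left[(D_\mu U)^\dagger D^\mu U\right] , \qquad {\rm tr}\left[F^L_{\mu\nu}F^{L\mu\nu} + F^R_{\mu\nu}F^{R\mu\nu}\right] \quad\text{and}\quad {\rm tr}\left[F^L_{\mu\nu}UF^{R\mu\nu}U^\dagger\right] . \tag{33}$$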

In addition we can have mass-type terms for the vector mesons 

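$${\rm tr}\left[A^L_\mu A^{L\mu} + A^R_\mu A^{R\mu}\right] \quad\text{and}\quad {\rm tr}\left[A^L_\mu U A^{R\mu} U^\dagger\right] , \tag{34}$$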


which are still invariant under global chiral transformations. Of course, many more terms with 
higher derivatives could be written down at the expense of more undetermined parameters. Now, 
it is our aim to construct an effective model for the vector mesons only; at present we are not 
interested in the axial-vector mesons. We have to find a mechanism to eliminate the latter without 
violating the chiral symmetry. This can be accomplished by choosing a special “gauge” for the 
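vector fields $A^{L,R}_\mu$,

$$A^L_\mu = \xi\left(\rho_\mu + \frac{i}{g}\partial_\mu\right)\xi^\dagger \quad\text{and}\quad A^R_\mu = \xi^\dagger\left(\rho_\mu + \frac{i}{g}\partial_\mu\right)\xi . \tag{35}$$

Here $\rho_\mu$ is a matrix field with $N_f^2$ components. For example, in the case of two flavors it includes
both the $\rho$ and $\omega$ mesons via $\rho_\mu = \frac{1}{2}\left(\boldsymbol{\rho}_\mu\cdot\boldsymbol{\tau} + \omega_\mu\right)$. In the case of three flavors, this matrix field is
supplemented by the $K^*$ and $\phi$ mesons. Most importantly we have introduced the “square root”,
$\xi$, of the chiral field, $U = \xi^2$, which yields the chirally invariant relation

$$A^L_\mu = U\left(A^R_\mu + \frac{i}{g}\partial_\mu\right)U^\dagger . \tag{36}$$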





It is actually this so-called unitary constraint which eliminates the axial-vector fields in favor of
the vector fields $\rho_\mu$ without spoiling chiral symmetry.

It is interesting to study the behavior of the $\rho$ meson under chiral transformations. To start
off, we recognize that the transformation of $\xi$ introduces the matrix $K$ which is defined by [34]

$$\xi \to L\,\xi\,K^\dagger = K\,\xi\,R^\dagger . \tag{37}$$

Clearly this leaves the transformation law of the chiral field (23) unchanged. Note that in general
the matrix $K$ is a position dependent quantity because $\xi$ depends on position. Demanding now the symmetry
transformation

$$\rho_\mu \to K\left(\rho_\mu + \frac{i}{g}\partial_\mu\right)K^\dagger \tag{38}$$

causes the fields $A^{L,R}_\mu$ to transform exactly like left- and right-handed vector fields.

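Within the unitary gauge the various terms listed above in (33) and (34) are no longer independent.
Introducing the homogeneously transforming combinations

$$p_\mu = \partial_\mu\xi\,\xi^\dagger + \xi^\dagger\partial_\mu\xi \quad\text{and}\quad R_\mu = \rho_\mu + \frac{i}{2g}\left(\partial_\mu\xi\,\xi^\dagger - \xi^\dagger\partial_\mu\xi\right) \tag{39}$$

the terms up to two derivatives can be combined to a chirally invariant Lagrangian of vectors (and
pseudoscalars)

$$\mathcal{L}_{\rm vm} = {\rm tr}\left[-\frac{f_\pi^2}{4}\,p_\mu p^\mu - \frac{1}{2}F_{\mu\nu}(\rho)F^{\mu\nu}(\rho) + m_\rho^2 R_\mu R^\mu\right] , \tag{40}$$

where we have used the fact that the coefficient of the term quadratic in the $\rho$ meson field is related
to the vector meson mass $m_\rho = 770$ MeV. Upon expanding the square-root field $\xi$ in powers of the
pseudoscalar field, one finds that the Lagrangian (40) contains the $\rho\pi\pi$ coupling,

$$\mathcal{L}_{\rho\pi\pi} = \frac{m_\rho^2}{2 g f_\pi^2}\,\boldsymbol{\rho}_\mu\cdot\left(\boldsymbol{\pi}\times\partial^\mu\boldsymbol{\pi}\right) , \tag{41}$$

which can be utilized to fix the coupling constant $g \approx 5.6$ from the known decay-width of the
process $\rho \to \pi\pi$.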
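As a rough cross-check of the quoted value $g \approx 5.6$, the sketch below evaluates the tree-level $\rho \to \pi\pi$ width. It rests on assumptions that are not spelled out in the text: the coupling is taken in the form $g_{\rho\pi\pi}\,\boldsymbol{\rho}_\mu\cdot(\boldsymbol{\pi}\times\partial^\mu\boldsymbol{\pi})$ with $g_{\rho\pi\pi} = m_\rho^2/(2 g f_\pi^2)$, the width formula $\Gamma = g_{\rho\pi\pi}^2 |\mathbf{p}|^3/(6\pi m_\rho^2)$ is the standard one for such a coupling, and $m_\pi = 138$ MeV and $f_\pi = 93$ MeV are inserted by hand. The function names are illustrative only.

```python
import math

# Assumed inputs in MeV; G is the value quoted in the text.
M_RHO, M_PI, F_PI, G = 770.0, 138.0, 93.0, 5.6

def g_rho_pi_pi(g, m_rho=M_RHO, f_pi=F_PI):
    """Assumed relation g_rpp = m_rho^2 / (2 g f_pi^2)."""
    return m_rho**2 / (2.0 * g * f_pi**2)

def rho_width(g, m_rho=M_RHO, m_pi=M_PI, f_pi=F_PI):
    """Tree-level width g_rpp^2 |p|^3 / (6 pi m_rho^2), in MeV."""
    p = math.sqrt(m_rho**2 / 4.0 - m_pi**2)  # pion momentum in the rho rest frame
    return g_rho_pi_pi(g, m_rho, f_pi)**2 * p**3 / (6.0 * math.pi * m_rho**2)

if __name__ == "__main__":
    print("g_rho_pi_pi =", round(g_rho_pi_pi(G), 2))
    print("width (MeV) =", round(rho_width(G), 1))
```

With these inputs the width comes out at roughly 150 MeV, the order of magnitude of the measured $\rho$ width, which is the sense in which the decay fixes $g$.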

Terms which involve the Levi-Civita tensor $\epsilon_{\mu\nu\rho\sigma}$ are also of great interest for the Skyrme
model. For their presentation it is most useful to introduce the notation of differential forms:
$A^{L,R} = A^{L,R}_\mu dx^\mu$, $d = \partial_\mu dx^\mu$, etc. Since the left- and right-handed “gauge fields” are related via
the unitary constraint (36) the number of linearly independent terms, which transform properly
under chiral transformation as well as parity and charge conjugation, is quite limited [32]

$$A^L\alpha^3 , \qquad dA^L\,\alpha A^L - A^L\alpha\,dA^L + A^L\alpha A^L\alpha , \qquad 2\left(A^L\right)^3\alpha + \frac{i}{g}A^L\alpha A^L\alpha . \tag{42}$$

For convenience we have again made use of $\alpha_\mu = (\partial_\mu U)U^\dagger = \xi p_\mu\xi^\dagger$. Of course, including these
terms in the model Lagrangian will introduce three more parameters: $\gamma_1$, $\gamma_2$ and $\gamma_3$. A suitable
presentation of this part of the action is given in terms of $\rho$ and $R$ (employing again the notation
of differential forms)

$$\Gamma_{\rm an} = \Gamma_{WZ} + \int {\rm tr}\left(\frac{1}{6}\left[\gamma_1 + \frac{3}{2}\gamma_2\right]Rp^3 - \frac{i}{4}g\gamma_2 F(\rho)\left[pR - Rp\right] - g^2\left[\gamma_2 + 2\gamma_3\right]R^3p\right) , \tag{43}$$

where $\Gamma_{WZ}$ is given in (26). In ref [35] two of the three unknown constants $\gamma_{1,2,3}$ were determined
from purely strong interaction processes like $\omega \to 3\pi$. Defining $h = -2\sqrt{2}\gamma_1/3$, $g_{VV\phi} = g\gamma_2$
and $\kappa = \gamma_3/\gamma_2$ the central values $h = \pm 0.4$ and $g_{VV\phi} = \pm 1.9$ were found. Within experimental
uncertainties (stemming from the errors in the $\omega$-$\phi$ mixing angle) these may vary in the range
$h = -0.15, \ldots, 0.7$ and $g_{VV\phi} = 1.3, \ldots, 2.2$ subject to the condition $|g_{VV\phi} - h| \approx 1.5$. The third
parameter, $\kappa$, could not be fixed in the meson sector. From studies [36] of nucleon properties in
the two flavor model it was argued that $\kappa \approx 1$ represents a reasonable choice.

The sum of the space integral of (40) and (43) comprises the chirally invariant part of the effective
action we shall use for discussing the soliton in the vector meson model. Note that the second piece
of (43) can be gauged with external fields [35] so as to make no contribution to the non-Abelian
anomaly. The first piece $\Gamma_{WZ}$ then correctly supplies the non-Abelian anomaly. Furthermore the
second piece of (43) stabilizes the soliton without the need for including the Skyrme term (5).

We must still include the effects of symmetry breaking due to finite quark masses in the vector
meson system. To leading order in the symmetry breaking, an appropriate term which behaves
properly under chiral transformations can be constructed by analogy to the last expression in (34)

$$-\alpha'\,{\rm tr}\left[M\left(A^L_\mu U A^{R\mu} + A^R_\mu U^\dagger A^{L\mu}\right)\right] . \tag{44}$$

This leading contribution not only distinguishes between the $\rho$ and $K^*$ masses but also contributes
to the different decay constants of the pseudoscalar mesons via the unitary gauge (35). The reader
may consult ref [33] for recent discussion of higher order symmetry breaking terms.

We would like to end this section on including vector mesons by noting that the same Lagrangian 
is obtained within the so-called hidden gauge approach [37], once the same symmetries are required. 
This shows that these two approaches are in fact identical. 

3.4 Other aspects 

We expect that baryons should appear as solitons of the large $N_c$ effective meson Lagrangian
for any number of flavors $N_f$. In the case of three (or more) light flavors the Wess-Zumino term
guarantees, as discussed in section 3.1, that the baryon number (3) is obtained in a self-contained
manner from the Lagrangian. This can be used to check that the soliton indeed has the correct
baryon number.

Now in the two flavor case, the same kind of soliton solution exists. However the Wess-Zumino
term vanishes identically so we cannot similarly check its baryon number in a self-contained way.
The situation is even more peculiar for $N_f = 1$. There the Skyrme model represents a mapping
$S^3 \to S^1$ which does not contain topologically stable configurations. However, we are not forced
to use an effective Lagrangian of the same form. In this case it is probably more realistic to
construct the Lagrangian by including isoscalars like the spin-0 $\sigma$-field and the spin-1 $\omega$-field.
Such a Lagrangian might have a soliton solution (not necessarily topological) but a check of its
baryon number may also not be available in a self-contained way. These examples seem to indicate
that the form of the relevant effective Lagrangian may have a non-trivial $N_f$ dependence (at least
for small $N_f$).

Another interesting question, related in the sense of understanding whether physical features of
the solitons can be traced to particular pieces of the effective Lagrangian, concerns the stabilization
of the soliton. In section 2 we noted that the Skyrme term (5) was introduced precisely for this
purpose. There is an often mentioned “derivation” of this term from the piece of the vector meson
Lagrangian (40) above which goes as follows. In a large mass expansion, $m_\rho \to \infty$, the equation of
motion for the vector meson field simply becomes $R_\mu = 0$. Substituting this into the remainder of
the vector meson Lagrangian (40)

$$F_{\mu\nu}(\rho) \to F_{\mu\nu}\left(\rho_\mu = \frac{i}{2g}\left(\xi^\dagger\partial_\mu\xi - \partial_\mu\xi\,\xi^\dagger\right)\right)$$

yields exactly the Skyrme term (5) with the identification $g = e$. Although the numbers 5.6 and
4.75 are in reasonable agreement there is one caveat to this appealing derivation of the Skyrme
term. While the Skyrme model does yield stable solitons, for arbitrarily large but finite
$m_\rho$ the model (40) does not contain stable soliton solutions. Thus one seems to have achieved
stabilization merely by approximating a model in which stabilization does not exist. Clearly
we have not obtained a “physical origin” for the stabilization mechanism. As mentioned in the
previous section, the second piece of (43) stabilizes the soliton in the vector meson Lagrangian
without a need for the Skyrme term. It is also possible that, as in the case of the s-wave ground
state of hydrogen, stability is achieved at the quantum, rather than at the classical, level. Several
investigations of this possibility have been made [38] based on just the non-linear sigma model
term (1), although an assumption on the allowed chiral profiles seems to be required.


4 The Skyrme model with three flavors 


It is well established [7] that the neutron (n) and proton (p) belong to a multiplet with six other
members (the iso-singlet $\Lambda$, the iso-doublet $\Xi$ and the iso-triplet $\Sigma$). To try to understand n
and p alone is to look at only a small piece of a large picture. Thus we must consider the three
flavor generalization of the treatment in section 2. First (in the present section) we shall consider
the Lagrangian of pseudoscalars alone, discussed in section 3.2. The new features arise from the
inclusion of flavor $SU(3)$ symmetry breaking terms (see (27) together with (20) and (21)) as well as
the Wess-Zumino term (26). Both of these features involve non-trivial extensions of the formalism
and interesting “physics”.

The first step towards including the strangeness degrees of freedom is to actually take the chiral
field to be a $U(1) \otimes SU(3)$ matrix. To be precise, the three flavor chiral field is defined as

$$U(x) = \exp\left[\,\cdots\,\right]\exp\left(i\Phi\right) . \tag{45}$$

While the singlet field $\eta_0$ is separated, the matrix field $\Phi$ now not only contains the pion degrees
of freedom but also the kaons and the non-singlet component of the $\eta$ fields,

$$\Phi = \cdots \tag{46}$$

Here $\lambda^a$ denote the Gell-Mann matrices. Note that in the presence of derivative-type symmetry
breakers (e.g. the $\beta'$ term in (27)) the normalization of the fields gets shifted; the “physical” fields
are obtained by multiplying the fields above by some constants as $Z_\pi\boldsymbol{\pi}$, $Z_K K$, etc.; similarly the
physical decay constants are $Z_\pi f_\pi = 93$ MeV, $Z_K f_K \approx 113$ MeV, etc. For the $Z$'s we have

$$\cdots \tag{47}$$

We clearly need a suitable generalization of the Skyrme ansatz (7). It turns out that it is correct
to just embed the $SU(2)$ hedgehog in the $SU(3)$ matrix. Flavor symmetry breaking implies that
field configurations which have non-zero strangeness possess a classical energy which (at least in
the unit baryon number sector) is larger than that of a zero strangeness configuration. Thus we
choose the embedding:

$$U_0(\mathbf{r}) = \begin{pmatrix} \exp\left(i\boldsymbol{\tau}\cdot\hat{\mathbf{r}}F(r)\right) & \begin{matrix} 0 \\ 0 \end{matrix} \\ \begin{matrix} 0 & 0 \end{matrix} & 1 \end{pmatrix} . \tag{48}$$

Hence the classical energy will not be modified and the soliton profile, $F(r)$, is that in figure 2.1.
The effects of the strange degrees of freedom become visible when states with baryon quantum
numbers are generated via the collective coordinate approach.

The collective coordinate matrix $A(t)$ defined in eq (9) is now taken from $SU(3)$ and, in analogy
to eq (11), now leads to eight angular velocities,

$$A^\dagger(t)\frac{d}{dt}A(t) = \frac{i}{2}\sum_{a=1}^{8}\lambda^a\Omega_a . \tag{49}$$





In addition to the angular velocities $\Omega_a$ the adjoint representation

$$D_{ab} = \frac{1}{2}{\rm tr}\left(\lambda_a A\lambda_b A^\dagger\right) \tag{50}$$

of the collective rotations, $A(t)$, will be important, in particular in the context of flavor symmetry
breaking.

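Substituting $U = A(t)U_0(\mathbf{r})A^\dagger(t)$ into the pseudoscalar Lagrangian of section 3.2 without the
symmetry breaker gives rise after a spatial integration to the collective Lagrangian

$$L_{\rm Skyrme}(A,\Omega_a) + L_{WZ}(A,\Omega_a) = -E + \frac{1}{2}\alpha^2\sum_{i=1}^{3}\Omega_i^2 + \frac{1}{2}\beta^2\sum_{\alpha=4}^{7}\Omega_\alpha^2 - \frac{N_c B}{2\sqrt{3}}\Omega_8 . \tag{51}$$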


The $SU(2)$ moment of inertia $\alpha^2$ remains unchanged while the moment of inertia $\beta^2$ for rotations
into the strange directions is a new functional of the pseudoscalar fields. The fact that the eighth
component of the angular velocity vector does not appear quadratically in (51) is a consequence
of $[U_0, \lambda_8] = 0$. The term proportional to $B\Omega_8$, where $B$ is the baryon number, arises from $\Gamma_{WZ}$.
In order to obtain it we make use of the separation [39]

rwz[i7] = rwz[t/o] - ^ { [(U^dUo)^ + (UodU^)^] (AUA)} , (52) 


where, again, Stokes' theorem has been employed. As $U_0$ is static we have $\Gamma_{WZ}[U_0] = 0$ and the
remainder becomes a local object which is straightforwardly evaluated.


4.1 Quantization of the three flavor collective Lagrangian 

In order to quantize the three flavor Lagrangian (51) we require the operators for spin and
flavor as Noether charges. As a consequence of the hedgehog structure, the infinitesimal change
under spatial rotations can be written as a derivative with respect to $\boldsymbol{\Omega}$

[rxd,U{r,t)] = ^^^. (53) 

By the Noether construction this leads to the spin operator $\mathbf{J} = \partial L(A,\Omega_a)/\partial\boldsymbol{\Omega}$. The quantization
of the “$SU(3)$ rigid top” proceeds by generalizing this result to the so-called right generators

$$R_a = -\frac{\partial}{\partial\Omega_a}\left(L_{\rm Skyrme} + L_{WZ}\right) = \begin{cases} -\alpha^2\Omega_a = -J_a , & a = 1,2,3 \\ -\beta^2\Omega_a , & a = 4,\ldots,7 \\ \dfrac{N_c B}{2\sqrt{3}} , & a = 8 \end{cases} \tag{54}$$

The quantization prescription then demands the commutation relation $[R_a, R_b] = -i f_{abc}R_c$ with
$f_{abc}$ being the antisymmetric structure constants of $SU(3)$. Explicit expressions for these generators
in terms of an “Euler-angle” parameterization of $A$ are presented in ref [40]. The so-called left
generators, which are defined by the rotation $L_a = D_{ab}R_b$, satisfy the commutation relations
$[L_a, L_b] = i f_{abc}L_c$. They provide the isospin, $I_i = L_i$ ($i = 1,2,3$), and hypercharge, $Y = 2L_8/\sqrt{3}$,
operators.

The generator $R_8$ is linearly connected to the so-called right hypercharge $Y_R = 2R_8/\sqrt{3} = 1$
for $B = 1$ and $N_c = 3$. In analogy to the Gell-Mann Nishijima relation a right charge

$$Q_R = -J_3 + \frac{Y_R}{2} \tag{55}$$

may be defined. Completing the analogy we note that the eigenvalues of $Q_R$ are $0, \pm 1/3, \pm 2/3,
\pm 1, \ldots$. Hence for $Y_R = 1$ the relation (55) can only be fulfilled when the eigenvalue of $J_3$ is half-integer.
This yields the important conclusion that the $SU(3)$ model describes fermions. A priori
this is not expected since the starting point has been an effective model of bosons. This discussion
can be generalized to arbitrary $N_c$ showing that the Skyrmion describes fermions when $N_c$ is odd
and bosons when $N_c$ is even [30]. This, of course, is expected from considering baryons as being
composed of $N_c$ quarks. We conclude that the proper incorporation of the anomaly structure of
QCD leads to the desired spin-statistics relation.
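The counting argument above can be checked mechanically. The following is a minimal sketch, assuming the right hypercharge generalizes to $Y_R = N_c B/3$ (as suggested by the value of $R_8$ for general $N_c$) and that the allowed eigenvalues of $Q_R = -J_3 + Y_R/2$ are exactly the multiples of 1/3 listed in the text. The function name is illustrative.

```python
from fractions import Fraction

def allowed_j3_type(n_c, baryon_number=1):
    """Classify the J_3 values compatible with Q_R = -J_3 + Y_R/2 being a
    multiple of 1/3, where Y_R = N_c * B / 3 is assumed."""
    y_r = Fraction(n_c * baryon_number, 3)
    kinds = set()
    for twice_j3 in range(-6, 7):              # J_3 = -3, -5/2, ..., 3
        j3 = Fraction(twice_j3, 2)
        q_r = -j3 + y_r / 2
        if (3 * q_r).denominator == 1:         # Q_R in {0, +-1/3, +-2/3, ...}
            kinds.add("integer" if j3.denominator == 1 else "half-integer")
    assert len(kinds) == 1, "integer and half-integer J_3 do not mix"
    return kinds.pop()

if __name__ == "__main__":
    for n_c in range(1, 7):
        print(n_c, allowed_j3_type(n_c))
```

Odd $N_c$ comes out with half-integer spin projections and even $N_c$ with integer ones, in line with the statement that the Skyrmion is a fermion for odd $N_c$ and a boson for even $N_c$.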


4.2 Flavor symmetry breaking and baryon spectrum 

For a realistic treatment of baryon states in the space of the collective coordinates we have
to supplement the collective Lagrangian by the flavor symmetry breaking pieces associated with
(27). Substituting the flavor rotating hedgehog yields the symmetry breaking piece in the collective
Lagrangian,

$$L_{SB} = -\frac{1}{2}\gamma\left(1 - D_{88}\right) \tag{56}$$

with the coefficient $\gamma$ being linear in the symmetry breaking parameter $1 - x$, i.e.

$$\gamma = \frac{32\pi}{3}(x - 1)\int dr\left\{\delta' r^2\left(1 - \cos F\right) - \beta'\cos F\left(F'^2 r^2 + 2\sin^2 F\right)\right\} . \tag{57}$$

$D_{88}(A)$ is defined in eq (50). Putting pieces (51) and (56) together, the Hamiltonian for the
collective coordinates is obtained as the Legendre transform $H = -\sum_{a=1}^{8}R_a\Omega_a - L$,

$$H(A,R_a) = E + \frac{1}{2}\left(\frac{1}{\alpha^2} - \frac{1}{\beta^2}\right)\mathbf{J}^2 + \frac{1}{2\beta^2}C_2(SU(3)) - \frac{3}{8\beta^2} + \frac{1}{2}\gamma\left(1 - D_{88}\right) \tag{58}$$

for $B = 1$ and $N_c = 3$. The constraint $R_8 = \sqrt{3}/2$, which yielded the spin-statistics relation,
commutes with $H$, permitting one to substitute this value. The term involving $\sum_{a=4}^{7}R_a^2$ has been
re-expressed by introducing the quadratic Casimir operator of $SU(3)$, $C_2(SU(3)) = \sum_{a=1}^{8}R_a^2$. The
standard $SU(3)$ representations are eigenstates of $C_2(SU(3))$ with eigenvalues $\mu$. For example, the
octet representation $\mathbf{8}$ has $\mu_8 = 3$ while $\mu_{10} = \mu_{\overline{10}} = 6$ and $\mu_{27} = 8$. These representations
diagonalize⁶ the collective Hamiltonian in the absence of symmetry breaking, $\gamma = 0$.
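The quoted Casimir eigenvalues can be reproduced from the standard formulas for an $SU(3)$ irreducible representation with Dynkin labels $(p, q)$: dimension $(p+1)(q+1)(p+q+2)/2$ and $C_2 = (p^2 + q^2 + pq)/3 + p + q$, in the normalization where the generators are $\lambda_a/2$ in the fundamental representation. The labels assigned to each multiplet below are the usual ones; the function names are illustrative.

```python
from fractions import Fraction

def su3_dimension(p, q):
    """Dimension of the SU(3) irrep with Dynkin labels (p, q)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

def su3_casimir(p, q):
    """Quadratic Casimir eigenvalue of the irrep (p, q)."""
    return Fraction(p * p + q * q + p * q, 3) + p + q

IRREPS = {"8": (1, 1), "10": (3, 0), "10bar": (0, 3), "27": (2, 2)}

if __name__ == "__main__":
    for name, (p, q) in IRREPS.items():
        print(name, su3_dimension(p, q), su3_casimir(p, q))
```

This gives 3 for the octet, 6 for both decuplets and 8 for the 27-plet, matching the values used in the collective Hamiltonian.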

Now consider the full collective Hamiltonian including the symmetry breaking. It seems reasonable
to assume these $\gamma = 0$ eigenstates as a basis to diagonalize the full Hamiltonian. In a
perturbative treatment up to third order in $\gamma$ for the $\frac{1}{2}^+$ baryons only the representations $\mathbf{8}$, $\overline{\mathbf{10}}$
and $\mathbf{27}$ contribute [42]. For that reason the perturbative treatment is still simple, although one
must go beyond leading order. In particular this implies that the nucleon is no longer a pure
octet state but rather contains sizable admixture of the nucleon type states in higher dimensional
representations,

$$|N\rangle = |N,\mathbf{8}\rangle + 0.0745\,\gamma\beta^2\,|N,\overline{\mathbf{10}}\rangle + 0.0490\,\gamma\beta^2\,|N,\mathbf{27}\rangle + \ldots , \tag{59}$$

where the coefficients of the effective symmetry breaker $\gamma\beta^2$ are computed from $SU(3)$ Clebsch-Gordan
coefficients [43]. The nucleon is seen to have a roughly 25% amplitude to contain the $\overline{\mathbf{10}}$
state.
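The two numerical coefficients in (59) can be reproduced with first-order perturbation theory. The sketch below assumes unperturbed energies $C_2/(2\beta^2)$, the perturbation $-\frac{1}{2}\gamma D_{88}$, and the nucleon matrix elements $\langle N,\overline{10}|D_{88}|N,8\rangle = \sqrt{5}/10$ and $\langle N,27|D_{88}|N,8\rangle = \sqrt{6}/10$; these matrix elements are quoted here as assumptions rather than taken from the text. The admixture is then $\gamma\beta^2\,\langle R|D_{88}|8\rangle/(\mu_R - \mu_8)$.

```python
import math

# Casimir eigenvalues quoted in the text.
MU = {"8": 3, "10bar": 6, "27": 8}

# Assumed nucleon matrix elements <N,R|D_88|N,8>.
D88 = {"10bar": math.sqrt(5) / 10, "27": math.sqrt(6) / 10}

def admixture(rep, gamma_beta2=1.0):
    """First-order amplitude of representation `rep` in the nucleon state."""
    return gamma_beta2 * D88[rep] / (MU[rep] - MU["8"])

def gamma_beta2_for_amplitude(rep, amplitude):
    """Value of gamma*beta^2 at which the first-order amplitude equals `amplitude`."""
    return amplitude / admixture(rep)

if __name__ == "__main__":
    print(round(admixture("10bar"), 4), round(admixture("27"), 4))
    print(round(gamma_beta2_for_amplitude("10bar", 0.25), 2))
```

Under these assumptions a 25% amplitude for the $\overline{\mathbf{10}}$ component corresponds to $\gamma\beta^2$ of a little over 3, which indicates why leading order is not sufficient.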

⁶The hedgehog structure of the classical configuration $U_0$ constrains the permissible $SU(3)$ irreducible
representations to those which have at least one state with $I = J$ [41].

Although this perturbative treatment provides a physical picture of the symmetry breaking
effects it actually turns out that the full Hamiltonian (58) can be exactly diagonalized numerically.
The important ingredient is that within a suitable “Euler-angle” representation of the rotations $A$,
the symmetry breaker $1 - D_{88}$ depends only on one of these eight angles. In each isospin channel
the eigenvalue equation

$$\left[C_2(SU(3)) + \beta^2\gamma\left(1 - D_{88}\right)\right]\Psi = \epsilon_{SB}\Psi \tag{60}$$

then reduces to a set of coupled ordinary differential equations which can be integrated numerically.
Here we do not wish to discuss this approach in full detail; rather we refer the reader to the original
work by Yabu and Ando [44] and exhaustive applications of this method involving the present
authors [45, 46, 40]. Having obtained the eigenvalue $\epsilon_{SB}$ the baryon masses are straightforwardly
computed from

$$M_B = E + \frac{1}{2}\left(\frac{1}{\alpha^2} - \frac{1}{\beta^2}\right)J(J+1) - \frac{3}{8\beta^2} + \frac{1}{2\beta^2}\epsilon_{SB} . \tag{61}$$
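A small sketch of the mass formula may help to see how the pieces fit together. It assumes the form $M_B = E + \frac{1}{2}(1/\alpha^2 - 1/\beta^2)J(J+1) - 3/(8\beta^2) + \epsilon_{SB}/(2\beta^2)$ implied by the collective Hamiltonian (58) and the eigenvalue equation (60), and it uses the fact that for $\gamma = 0$ the eigenvalue $\epsilon_{SB}$ reduces to the Casimir eigenvalue (3 for the octet, 6 for the decuplet). The numerical inputs are arbitrary illustrative values, not fitted parameters.

```python
from fractions import Fraction

def baryon_mass(e_cl, alpha2, beta2, spin, eps_sb):
    """Assumed mass formula for B = 1, N_c = 3, in the units of the inputs."""
    alpha2, beta2, spin = Fraction(alpha2), Fraction(beta2), Fraction(spin)
    rotational = Fraction(1, 2) * (1 / alpha2 - 1 / beta2) * spin * (spin + 1)
    return (Fraction(e_cl) + rotational
            - Fraction(3, 8) / beta2 + Fraction(eps_sb) / (2 * beta2))

def decuplet_octet_splitting(alpha2, beta2):
    """Delta - N mass difference in the flavor symmetric limit gamma = 0."""
    delta = baryon_mass(0, alpha2, beta2, Fraction(3, 2), 6)
    nucleon = baryon_mass(0, alpha2, beta2, Fraction(1, 2), 3)
    return delta - nucleon

if __name__ == "__main__":
    for beta2 in (Fraction(1, 3), 2, 7):
        print(beta2, decuplet_octet_splitting(5, beta2))
```

In the symmetric limit the splitting equals $3/(2\alpha^2)$ for every choice of $\beta^2$, i.e. the strange moment of inertia drops out of the $\Delta$-$N$ difference, as in the two flavor model.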

As already mentioned this diagonalization procedure is equivalent to the perturbation expansion.
For small enough symmetry breaking even first order is sufficient. In that (unjustified) case
the famous Gell-Mann-Okubo mass formulae [47, 7] hold exactly:

$$2\left(M_N + M_\Xi\right) = M_\Sigma + 3M_\Lambda \tag{62}$$

$$M_\Omega - M_{\Xi^*} = M_{\Xi^*} - M_{\Sigma^*} = M_{\Sigma^*} - M_\Delta . \tag{63}$$

Additional corrections [45] arise when we allow for non-zero classical $K$-meson fields to get
induced by “rotations” $\Omega_\alpha$ into the strange directions. These are energetically favorable since they
maximize the strange moment of inertia $\beta^2$. With a parameterization

$$\begin{pmatrix} K^+ \\ K^0 \end{pmatrix} = W(r)\,\boldsymbol{\tau}\cdot\hat{\mathbf{r}}\begin{pmatrix} \Omega_4 - i\Omega_5 \\ \Omega_6 - i\Omega_7 \end{pmatrix} \tag{64}$$

the radial function $W(r)$ is determined from applying a variational principle to $\beta^2$. In principle one
must enforce that the ansatz (64) has no overlap with any global rotation of the classical solution
(48).

We adjust the only free parameter, $e = 4.0$, to the mass differences of the low-lying $\frac{1}{2}^+$ and $\frac{3}{2}^+$
baryons. The resulting baryon spectrum is shown in table 1.

Table 1: The mass differences, which are obtained by exact diagonalization of the collective Hamiltonian
(58), of the $\frac{1}{2}^+$ and $\frac{3}{2}^+$ baryons in the pseudoscalar model for $e = 4.0$ are compared to the
experimental data. The values in parentheses are obtained by enforcing the zero overlap condition
mentioned after (64) [40]. In that case the Skyrme parameter has slightly been readjusted to $e = 3.9$.
All data are in MeV.

    Baryons            Model        Expt.
    $\Lambda - N$      154 (163)    177
    $\Sigma - N$       242 (264)    254
    $\Xi - N$          366 (388)    379
    $\Delta - N$       278 (268)    293
    $\Sigma^* - N$     410 (406)    446
    $\Xi^* - N$        544 (545)    591
    $\Omega - N$       677 (680)    733

Apparently the three flavor Skyrme model reasonably accounts for the empirical mass differences.
The original studies [48, 49, 50, 44]
yielded far too low mass splittings between baryons of different strangeness for physically motivated 
parameters of the effective Lagrangian⁷. A major reason for the improvement is the fact that $\gamma$
is significantly enlarged by including the effects associated with $f_K \neq f_\pi$ [45]. It is also apparent
from table 1 that enforcing the zero overlap condition for the induced kaon components can be
compensated by a small variation of the Skyrme parameter, $e$. This indicates that possible double
counting effects play only a minor role. It is interesting to remark that the mass differences for the
$\frac{1}{2}^+$ baryons deviate strongly from the predictions in leading order of the flavor symmetry breaking.
This can easily be observed from the ratios

$$\left(M_\Lambda - M_N\right) : \left(M_\Sigma - M_\Lambda\right) : \left(M_\Xi - M_\Sigma\right) = 1 : 0.52 : 0.85 \tag{65}$$

⁷Many of these authors considered $f_\pi$ as a free parameter fitted to the absolute values of the baryon masses.
Without the term this yielded $f_\pi$ as low as 25 MeV [49].





which are in much better agreement with the experimental data (1:0.43:0.69) than the leading order
result (1:1:0.5). Obviously the higher order contributions are important. This also indicates that
the baryon wave-functions contain sizable admixture of higher dimensional $SU(3)$ representations,
cf. eq (59). Nevertheless the deviation from the Gell-Mann-Okubo relations (62) is only moderate;
in particular the equal spacing among the $\frac{3}{2}^+$ baryons is well reproduced. Finally we note that, as
discussed in point (iii) of section 2, the absolute mass of the nucleon is also too high in the three
flavor case. Again we must rely on the $N_c^0$ corrections mentioned.
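The size of the deviation from the Gell-Mann-Okubo relations can be read off directly from the entries of table 1. The sketch below uses only the mass differences relative to the nucleon listed there (model values for $e = 4.0$ and experimental values, in MeV); the nucleon mass itself cancels in the octet relation (62). The dictionary keys and function names are illustrative.

```python
# Mass differences relative to the nucleon, in MeV, copied from table 1.
MODEL = {"Lambda": 154, "Sigma": 242, "Xi": 366,
         "Delta": 278, "Sigma*": 410, "Xi*": 544, "Omega": 677}
EXPT = {"Lambda": 177, "Sigma": 254, "Xi": 379,
        "Delta": 293, "Sigma*": 446, "Xi*": 591, "Omega": 733}

def gmo_octet_violation(d):
    """2(M_N + M_Xi) - (M_Sigma + 3 M_Lambda); the nucleon mass cancels."""
    return 2 * d["Xi"] - (d["Sigma"] + 3 * d["Lambda"])

def decuplet_spacings(d):
    """Successive spacings Delta -> Sigma* -> Xi* -> Omega."""
    chain = ["Delta", "Sigma*", "Xi*", "Omega"]
    return [d[b] - d[a] for a, b in zip(chain, chain[1:])]

if __name__ == "__main__":
    for label, data in (("model", MODEL), ("expt", EXPT)):
        print(label, gmo_octet_violation(data), decuplet_spacings(data))
```

Both the model and the data violate the octet relation by a few tens of MeV, small compared with the splittings themselves, and the model decuplet spacings agree with each other to within a couple of MeV.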


4.3 Electromagnetic properties of $\frac{1}{2}^+$ baryons


The value for the Skyrme parameter $e = 4.0$ obtained from this best fit to the baryon mass
differences is next employed to evaluate static properties of baryons within this model. In order to
do so one first constructs the Noether currents associated with the symmetry transformation (23).
A convenient method is to extend these global symmetries to local ones by introducing external
gauge fields (e.g. the gauge fields of the electroweak interactions) into the total action, i.e. (1),
(5), (26) and (27). The Noether currents are then read off as the expressions which couple linearly
to these gauge fields. This procedure is especially appropriate for the Wess-Zumino term (26)
because this non-local term can only be made gauge invariant by a lengthy iterative procedure
[30, 32]. The final form of the nonet ($a = 0, \ldots, 8$) vector ($V_\mu^a$) and axial-vector ($A_\mu^a$) currents
reads [46] (for $N_c = 3$)


- ± } 

{g“ {{UM + MU\a^} T {MU + U^M,l3„})}, (66) 

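where $Q^a = \left(\frac{1}{3}, \frac{\lambda^1}{2}, \ldots, \frac{\lambda^8}{2}\right)$ denote the Hermitian nonet generators. The combination

$$V_\mu^{\rm e.m.} = V_\mu^3 + \frac{1}{\sqrt{3}}V_\mu^8 \tag{67}$$

is of special interest because it enters the computation of the electromagnetic properties. The
associated form factors of the baryons ($B$) are defined by

$$\langle B(p')|V_\mu^{\rm e.m.}|B(p)\rangle = \bar{u}(p')\left[\gamma_\mu F_1^B(q^2) + \frac{\sigma_{\mu\nu}q^\nu}{2M_B}F_2^B(q^2)\right]u(p) , \qquad q_\mu = p_\mu - p'_\mu . \tag{68}$$

Frequently it is convenient to introduce “electric” and “magnetic” form factors

$$G_E^B(q^2) = F_1^B(q^2) - \frac{q^2}{4M_B^2}F_2^B(q^2) , \qquad G_M^B(q^2) = F_1^B(q^2) + F_2^B(q^2) . \tag{69}$$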

Substituting the rotating hedgehog configuration into the defining equation of the currents (66)
yields for the spatial components of the vector current⁸

's/s 

= Vi{r)eijkXjDak + ^B{r)eijk^ljXkDa8 -f V2{r)eijkXjddapDaa^p 

+V3(r)eijkXjDsBDak + V4{r)€ijkXjdda/3DsaDa(3-^ , (70) 


where 




( 71 ) 


⁸The conventions are $i, j, k = 1, 2, 3$ and $\alpha, \beta = 4, \ldots, 7$.





Table 2: The electromagnetic properties of the baryons compared to the experimental data. The
predictions of the Skyrme model are taken from ref [46].

                             $\mu_B$ (n.m.)       $r_M^2$ (fm$^2$)     $r_E^2$ (fm$^2$)
    B                        e = 4.0   Expt.      e = 4.0   Expt.      e = 4.0   Expt.
    $p$                       2.03      2.79       0.43      0.74       0.59      0.74
    $n$                      -1.58     -1.91       0.46      0.77      -0.22     -0.12
    $\Lambda$                -0.71     -0.61       0.36      -         -0.08      -
    $\Sigma^+$                1.99      2.42       0.45      -          0.59      -
    $\Sigma^0$                0.60      -          0.36      -         -0.02      -
    $\Sigma^-$               -0.79     -1.16       0.58      -         -0.63      -
    $\Xi^0$                  -1.55     -1.25       0.38      -         -0.15      -
    $\Xi^-$                  -0.64     -0.69       0.43      -         -0.49      -
    $\Sigma^0 \to \Lambda$   -1.39     -1.61       0.48      -          -         -


is the baryon number density (3). The explicit form of the radial functions $V_1(r), \ldots, V_4(r)$ is given
in appendix B of ref [46]. According to the quantization prescription (54) the angular velocities $\Omega_a$
are replaced by their expressions in terms of the right generators $R_a$ of $SU(3)$. Taking the Fourier
transform of the resulting matrix elements allows one to identify the magnetic form factor in the
Breit frame [51, 52]


G^uiQ^) 


-SttMb j^T2dr|^ji(rlg|)|yi(r)(De3>B - ^B{r){D,sR8)B (72) 

--^V2ir){d3a0DeaR0)B + ^(7’)(X>88-De3)B + V4(r)(d3Q^Dec«-D8^)B | • 


Here the flavor index $e$ refers to the “electromagnetic” direction (67). The magnetic moment
corresponds to the magnetic form factor at zero momentum transfer, $\mu_B = G_M^B(0)$. Similarly the
electric form factor is given by Fourier transforming the time component of the electromagnetic
current


Gi 


=47 t r 

Jo 


^drjo{r\q\) 


-B{r){D,3)B + ^Vr{r){DeiRi)B + i^Vs{r){DeMB ^ (73) 


^2 


The two new radial functions $V_7(r)$ and $V_8(r)$ are listed in appendix B of ref [46] as well. Integrating
$V_7$ and $V_8$ yields the moments of inertia, $\alpha^2$ and $\beta^2$, respectively. Hence the electric charges are
properly normalized. It should be remarked that the baryon matrix elements in the space of the
collective coordinates are computed using the exact eigenstates of (60) and adopting the Euler-angle
representations for the $SU(3)$ generators [40]. The results for the magnetic moments and
the radii

$$r_M^2 = -\frac{6}{\mu_B}\left.\frac{dG_M^B(q^2)}{dq^2}\right|_{q^2=0} , \qquad r_E^2 = -6\left.\frac{dG_E^B(q^2)}{dq^2}\right|_{q^2=0} \tag{74}$$


are shown in table 2. As in the two flavor model [6] the isovector part of the magnetic moments
is underestimated while the isoscalar part is reasonably well reproduced. Despite the fact that the
flavor symmetry breaking is large for the baryon wave-functions, the predicted magnetic moments
do not strongly deviate from the $SU(3)$ relations [53]

$$\mu_{\Sigma^+} = \mu_p , \qquad \mu_{\Sigma^-} = \mu_{\Xi^-} , \qquad 2\mu_\Lambda = -\left(\mu_{\Sigma^+} + \mu_{\Sigma^-}\right) = -2\mu_{\Sigma^0} = \mu_n = \mu_{\Xi^0} . \tag{75}$$





A more elaborate treatment of the flavor symmetry breaking is necessary in order to accommodate
the experimentally observed details of breaking the $U$-spin symmetry which e.g. causes the approximate
identity $\mu_{\Sigma^+} \approx \mu_p$ [54]. The moderate differences between the various magnetic radii
are a further hint that symmetry breaking effects are mitigated. The comparison with the available
empirical data for the radii shows that the predictions turn out too small in magnitude (except for
the neutron electric radius). This is a strong indication that essential ingredients are still missing
in the model. In section 5 it will be explained that the effects, which are associated with vector
meson dominance (VMD), will account for this deficiency. Nevertheless the overall picture gained
for the electromagnetic properties of the $\frac{1}{2}^+$ baryons can at least be characterized as satisfactory,
especially in view of the fact that the only free parameter of the model has been fixed beforehand.

4.4 Effects of symmetry breaking on baryon matrix elements 
and strangeness in the nucleon 

Theorists typically wish for symmetry breaking effects to be negligible (the so-called “spherical
cow” approximation) but Nature says otherwise. This is very apparent in the case of low energy
strong interactions (QCD). The Gell-Mann-Okubo mass formulae, which amount to applications
of a Wigner-Eckart theorem for first order $\lambda_8$ type symmetry breaking, furnish sum rules rather
than a complete description. As we discussed in section 4.2, the Skyrme model provides a non-trivial
playground for treating symmetry breaking. The Yabu-Ando equation (60) gives an exact
(within the model) wave-function for each baryon state at any strength of the symmetry breaking
parameter $\gamma\beta^2$ ($\propto$ underlying quark masses). The physical results vary smoothly with $\gamma\beta^2$
(although higher quantum corrections would be expected to give weak non-analytic corrections).

The physical interpretation of symmetry breaking in the model may be seen from eq (59). The
higher $SU(3)$ representation components in the baryon wave-function can only emerge in a quark
framework by having quark-antiquark pairs present in addition to the three “valence” quarks.
Clearly such effects would be difficult to treat in the non-relativistic quark model approach. On
the other hand, it should be recognized that the Skyrme model approach is based on a collective
semi-classical treatment.

In the last few years there has been a greatly renewed interest in the study of symmetry breaking
effects for ordinary nucleons. This was stimulated by new experiments on polarized lepton deep
inelastic scattering off nucleons [55] which seem to indicate that the pure valence quark picture of
the nucleon has serious drawbacks. The Skyrme model has the advantage of giving a simple and
roughly accurate quantitative explanation of these experiments. In detail one needs the strangeness
conserving proton matrix elements

$$\langle P(p')|\bar{q}_i\gamma_\mu\gamma_5 q_i|P(p)\rangle = \bar{u}(p')\gamma_\mu\gamma_5 u(p)\,H_i(q^2) \tag{76}$$

for this discussion. Of related interest are the flavor changing matrix elements

$$\langle B'(p')|V_\mu^a|B(p)\rangle = \bar{u}(p')\left[\gamma_\mu\,g_V^a(B',B) + \cdots\right]u(p) ,$$

$$\langle B'(p')|A_\mu^a|B(p)\rangle = \bar{u}(p')\left[\gamma_\mu\gamma_5\,g_A^a(B',B) + \cdots\right]u(p) \tag{77}$$

between different baryons ($B'$, $B$). Here we have omitted contributions proportional to the momentum
transfer.

Knowledge of the $g_A(B',B)$ and $g_V(B',B)$ is crucial for the theory of baryon semi-leptonic
decays. First let us consider the calculation of the axial vector matrix elements $g_A(B',B)$. Our
main interest in this brief discussion will be to examine the effects of symmetry breaking. The
leading order term (in $1/N_c$) of the spatial components of the axial current is straightforwardly
obtained to be

$$\int d^3r\,A_i^a = C\,D_{ai}(A) , \tag{78}$$

where $A(t)$ is the collective coordinate matrix. The constant $C$ denotes an integral over the chiral
angle. We refer the interested reader to refs [57, 46] for the explicit expression. Then,

$$g_A^a(B',B) = C\,\langle B'|D_{a3}|B\rangle . \tag{79}$$





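Table 3: The matrix elements of the axial-vector current (79) between different baryon states in
the flavor symmetric limit. Displayed are both the strangeness conserving (a) and strangeness
changing (b) processes. The first column gives the relevant flavor component of the axial current.

    (a) strangeness conserving
    $n \to p$                  $F + D$
    $\Sigma^- \to \Lambda$     $\sqrt{2/3}\,D$
    $\Sigma^- \to \Sigma^0$    $\sqrt{2}\,F$
    $\Xi^- \to \Xi^0$          $D - F$

    (b) strangeness changing
    $\Lambda \to p$            $(3F + D)/\sqrt{6}$
    $\Sigma^- \to n$           $D - F$
    $\Xi^- \to \Lambda$        $(3F - D)/\sqrt{6}$
    $\Xi^- \to \Sigma^0$       $(F + D)/\sqrt{2}$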


The flavor index $a$ has to be chosen according to whether strangeness conserving ($a = 1, 2, 3, 8$) or
strangeness changing ($a = 4, \ldots, 7$) processes are being considered. The corresponding result for
the axial charge of the nucleon, $g_A$, as measured in neutron beta-decay, is predicted
too low in many soliton models. This problem is already encountered in the two flavor model and
gets worse in $SU(3)$ as the Clebsch-Gordan coefficient associated with $D_{1+i2,3}$ changes by a factor
of 7/10. As symmetry breaking is increased the $SU(3)$ prediction for $g_A$ becomes larger [42]

$$g_A(SU(3)) = \frac{7}{10}\left[1 + 0.0514\,\gamma\beta^2 + \ldots\right]g_A(SU(2)) . \tag{80}$$

Actually the exact treatment shows that with increasing symmetry breaking the two flavor result is
approached, although only slowly. Taking everything together, including subleading terms in (79),
finally gives $g_A = 0.98$ for $e = 4.0$ [46] which is about 4/5 of the experimental value $g_A({\rm expt.}) =
1.26$.

We may understand the tendency to approach the $SU(2)$ limit for large $\gamma\beta^2$ as follows. In the
small $SU(3)$ breaking case, there is just a small extra “cost” for producing an $\bar{s}s$ pair rather than
a $\bar{u}u$ or $\bar{d}d$ pair. As $\gamma\beta^2$ gets larger it is more expensive to make an $\bar{s}s$ pair and eventually $\bar{s}s$ pairs
should be absent from the nucleon wave-function, recovering the $SU(2)$ picture.

Returning to the general case one should first note that flavor symmetry relates the octet axial
current matrix elements between various baryons. Conventionally they are expressed using $SU(3)$
covariance in terms of two unknown constants (or reduced matrix elements) $F$ and $D$. One has
to use models to determine these constants. In the flavor symmetric Skyrme model one finds [53]
$D/F = 9/5$ and $D + F = 7C/15 = g_A$. In table 3 the flavor symmetric dependences of the axial
matrix elements on $F$ and $D$ are displayed. As one departs from the flavor symmetric case the
baryon wave-functions acquire admixture from higher dimensional $SU(3)$ representations making
the $SU(3)$ covariant parameterization in terms of $F$ and $D$ inadequate. In the presence of $SU(3)$
symmetry breaking we must, without further assumptions, parameterize each decay amplitude
separately. It is still reasonable to maintain the isospin invariance relations.

As an example of the perturbative corrections consider the axial $\Lambda \to p$ transition in the
Cabibbo scheme [56] for semi-leptonic hyperon decays. The analog of (59) for the $\Lambda$ hyperon is

|A) = |A,8> + ^7/3"|A,2r) + ... . (81) 

Noting that the D-functions mix different SU{3) representations, we get 

(p t IDk-sI^ t) = ^ - + ■■■, Dk-z = ^ {D,z - iDsz) • (82) 

Of course, this expansion just provides a first approximation to the symmetry breaking dependence 
of the Cabibbo matrix elements. Using the exact treatment initiated by Yabu and Ando [44] 
this dependence can be computed numerically as shown in figure 2 for some processes of interest 
[57, 58]. Those results are normalized to the SU (3) symmetric values in table 3 to illustrate that the 





Schechter and Weigel 


Figure 2: The variation of axial vector matrix elements with the effective symmetry breaking
parameter $\gamma\beta^2$.
Full line: $\langle p|\bar{s}\gamma_3\gamma_5 s|p\rangle$; dashed dotted line: $\langle p|\bar{u}\gamma_3\gamma_5 s|\Lambda\rangle$; dotted line: $\langle n|\bar{u}\gamma_3\gamma_5 s|\Sigma^-\rangle$; long dashed
line: $\langle \Lambda|\bar{u}\gamma_3\gamma_5 s|\Xi^-\rangle$; dashed line: $\langle p|\bar{u}\gamma_3\gamma_5 d|n\rangle$. These matrix elements, which are taken from refs
[57] and [58], are normalized to the flavor symmetric values.


matrix elements vary in different ways with symmetry breaking. In this figure also the variation
of the nucleon matrix element of the flavor conserving axial current $H_s(0) = \langle N|\bar{s}\gamma_3\gamma_5 s|N\rangle$ is
displayed. Obviously $H_s(0)$ decreases very rapidly with increasing symmetry breaking. This is
easily visualized, as mentioned above, as a reflection of the increased cost of making extra $s\bar{s}$ pairs
in the nucleon wave-function as $\gamma\beta^2$ increases. On the contrary the Cabibbo matrix elements
exhibit only a moderate dependence on $\gamma\beta^2$. It is this different behavior of the matrix elements
that makes the application of exact flavor symmetry to the analyses of the EMC-SLAC-SMC
experiments suspicious. Stated otherwise, the strange quark contribution to the nucleon matrix
element of the axial singlet current (loosely “proton spin”) may be decreased significantly as a
consequence of symmetry breaking without contradicting the successful Cabibbo scheme for the
semi-leptonic decays of the hyperons.

As an interesting contrast to the axial matrix elements, consider the evaluation of the vector
matrix elements $g_V(B, B')$ needed for the hyperon semi-leptonic decays. The dominant contribution
is given by the matrix elements of the $SU(3)$ flavor generators

$$g_V(B, B') = \langle B'|L_a|B\rangle . \qquad (83)$$

For example if we sandwich the generators between the perturbative $\Lambda$ and $p$ states given in
(81) and (59) and recognize that group generators can only connect states belonging to the same
irreducible representation, we see that symmetry breaking corrections start out as $(\gamma\beta^2)^2$ rather
than $\gamma\beta^2$. This is just a demonstration of the Ademollo-Gatto theorem [59], which “protects”
the vector matrix elements against small symmetry breaking corrections. Since $\gamma\beta^2$ is large the
numerical validity of this result is questionable. However, the exact Yabu-Ando scheme does
confirm [57] that vector matrix elements suffer at most a 10% deviation from the symmetric values,
even for large symmetry breaking, e.g. $\gamma\beta^2 \approx 7$.
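The order counting behind this argument can be illustrated with a toy two-representation model. The sketch below is not the Yabu-Ando calculation: it simply mixes an octet state with one higher representation through a small angle proportional to a parameter `eps`, and compares an operator that is block diagonal in the representations (like a group generator) with one that connects them (like the $D$-functions). All numerical couplings and function names are hypothetical.

```python
import math


def vector_matrix_element(eps, a=1.0, b=1.0, g_octet=1.0, g_higher=0.5):
    """Toy <B'|L|B> for a generator L that cannot connect different representations.

    |B>  = cos(a*eps)|B,8>  + sin(a*eps)|B,R>
    |B'> = cos(b*eps)|B',8> + sin(b*eps)|B',R>
    g_octet and g_higher are hypothetical reduced matrix elements.
    """
    alpha, beta = a * eps, b * eps
    return (math.cos(alpha) * math.cos(beta) * g_octet
            + math.sin(alpha) * math.sin(beta) * g_higher)


def axial_like_matrix_element(eps, a=1.0, b=1.0, d_octet=1.0, d_mix=0.5):
    """Toy matrix element of an operator that does mix the two representations."""
    alpha, beta = a * eps, b * eps
    diagonal = math.cos(alpha) * math.cos(beta) * d_octet
    mixing = (math.sin(alpha) * math.cos(beta)
              + math.cos(alpha) * math.sin(beta)) * d_mix
    return diagonal + mixing


def deviation_ratio(element, eps):
    """Deviation from the symmetric value at eps divided by that at eps / 2.

    A ratio near 2 signals a linear correction, a ratio near 4 a quadratic one.
    """
    symmetric = element(0.0)
    return (element(eps) - symmetric) / (element(eps / 2.0) - symmetric)


if __name__ == "__main__":
    eps = 0.01
    print("vector (generator) :", deviation_ratio(vector_matrix_element, eps))
    print("axial-like operator:", deviation_ratio(axial_like_matrix_element, eps))
```

Halving the mixing parameter reduces the deviation of the generator matrix element by roughly a factor of four, while the deviation of the mixing operator is only halved, which is the pattern the Ademollo-Gatto theorem describes.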

A reduction of strangeness in the nucleon with increasing $\gamma\beta^2$ is also predicted for the scalar
strange content fraction of the proton

$$X_s = \frac{\langle p|\bar{s}s|p\rangle - \langle 0|\bar{s}s|0\rangle}{\langle p|\bar{u}u + \bar{d}d + \bar{s}s|p\rangle - \langle 0|\bar{u}u + \bar{d}d + \bar{s}s|0\rangle} . \qquad (84)$$

Here the state $|0\rangle$ refers to the soliton being absent. Models of quark flavor dynamics, e.g. the
one of Nambu and Jona-Lasinio [9], indicate that matrix elements of quark bilinears $\bar{q}\lambda_a q$ may be




Skyrme Model for Baryons 




taken as proportional to the matrix elements of $\mathrm{tr}\left[\lambda_a\left(U + U^\dagger - 2\right)\right]$. Then we straightforwardly
get

$$X_s = \frac{1}{3}\langle p|1 - D_{88}|p\rangle \approx \frac{7}{30} + \ldots . \qquad (85)$$

In this case, however, the deviation from the flavor symmetric result [60] ($X_s = 7/30$) is considerably
mitigated [61] as compared to the variation of $H_s$ defined in (76). The symmetry breaking has
to be as large as $\gamma\beta^2 \approx 4.5$ to obtain a reduction of the order of 50%. In the case of $H_s$ this was
already achieved for $\gamma\beta^2 \approx 2.5$. In any event, the additional quark-antiquark excitations in the
nucleon, which are parametrized by the admixture of higher dimensional $SU(3)$ representations
(59), clearly tend to cancel the virtual strange quarks of the pure octet nucleon.
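As a small consistency check of the numbers just quoted, the following sketch assumes that eq (85) reads $X_s = \frac{1}{3}\langle p|1 - D_{88}|p\rangle$ and works out which value of $\langle p|D_{88}|p\rangle$ corresponds to the flavor symmetric result 7/30 and to a 50% reduction of it. The helper names are hypothetical; exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def strange_fraction(d88):
    """X_s = (1 - <p|D_88|p>) / 3, the assumed reading of eq (85)."""
    return (1 - d88) / 3


def d88_for_fraction(x_s):
    """Invert the relation above: <p|D_88|p> = 1 - 3 X_s."""
    return 1 - 3 * x_s


if __name__ == "__main__":
    symmetric = Fraction(7, 30)   # flavor symmetric value quoted in the text
    halved = symmetric / 2        # a reduction of the order of 50%

    print("D_88 matrix element, symmetric case:", d88_for_fraction(symmetric))
    print("D_88 matrix element, X_s halved    :", d88_for_fraction(halved))
    # The strange content vanishes when the D_88 matrix element reaches 1.
    print("X_s at D_88 = 1:", strange_fraction(Fraction(1)))
```

In this reading the strange content fraction disappears as the matrix element of $D_{88}$ tends to one, which is the two flavor limit discussed above.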

The three flavor Skyrme model under present consideration provides a convenient way to study
the nucleon matrix elements of the vector current $\bar{s}\gamma_\mu s$. These are theoretically interesting because
they would vanish in a pure valence quark model of the nucleon and so test finer details of nucleon
structure. They are experimentally interesting because they can be extracted from measurements
of the parity violating asymmetry in the elastic scattering of polarized electrons from the proton.
The precise form factors needed are defined by

$$\langle P(p')|\bar{s}\gamma_\mu s|P(p)\rangle = \bar{u}(p')\left[\gamma_\mu F_s(q^2) + \frac{\sigma_{\mu\nu}q^\nu}{2M_p}\tilde{F}_s(q^2)\right]u(p) . \qquad (86)$$

These form factors are currently under intensive experimental investigation, cf. refs [62], and have
been estimated in various models. The models range from vector-meson-pole fits [63] of dispersion
relations [64] through vector meson dominance approaches [46] and kaon-loop calculations with [65]
and without [66] vector meson dominance contributions to soliton model calculations [46, 67, 68].
The numerical results for the strange magnetic moment $\mu_s = \tilde{F}_s(0) \approx -0.31 \pm 0.09 \ldots 0.25$ are
quite diverse. The predictions for the strange charge radius $r_s^2 = -6\,dF_s(q^2)/dq^2|_{q=0}$ are almost
equally scattered: $r_s^2 \approx -0.20 \ldots 0.14\,\mathrm{fm}^2$.

In order to evaluate these form factors in the three flavor Skyrme model one requires the matrix 
elements of the “strange” combination 

«•= 5'- 


rather than the electromagnetic one (67) between proton states. Using the same value e = 4.0 as 
used consistently for the three flavor pseudoscalar model yields the predictions 

PS = -O.lSn.m. , r| = -O.lOfm^ . (88) 


Here n.m. stands for nuclear magnetons. It should be stressed that these results are obtained
within the Yabu-Ando approach, i.e. the proton wave-function contains sizable admixture of
higher dimensional representations. If a pure octet wave-function were employed to compute
the matrix elements of the collective operators the strange magnetic moment would have been
$\mu_s = -0.33$. The proper inclusion of symmetry breaking into the nucleon wave-function is again
seen to reduce the effect of the strange degrees of freedom in the nucleon. We already discussed
above that as the strange quarks within the nucleon become more massive (effect of symmetry
breaking) their excitation becomes less likely.

In the next few years a great deal of new experimental information on the form factors $F_s$
and $\tilde{F}_s$ should become available. This would enable more accurate comparison with (for a given
effective meson Lagrangian) the essentially parameter free predictions of the soliton theory.


5 The nucleon as a vector meson soliton 

According to the modern view of the Skyrme model approach we should start from the “full” 
effective Lagrangian which contains mesons of all spins. The practical criterion on which particles 





to include is to find an effective Lagrangian which does a good job of explaining the low energy 
experimental data in the meson sector. On these grounds it is evident that the vector mesons 
should be included in the effective Lagrangian. In this section we give a brief sketch of the soliton 
sector of the Lagrangian of pseudoscalars and vectors (see section 3.3) and note that it leads to 
significant improvements of many predictions. In particular it is crucial for discussing the so-called 
proton spin puzzle. 

5.1 Generalized soliton ansatz and profile functions 

As a first step we construct the soliton of the Lagrangian defined in section 3.3. The generalization
of the hedgehog ansatz (7) to the vector meson model requires the time component of the
$\omega$ field and the space components of the $\rho$ field to be different from zero. Parity and grand spin
symmetry$^9$ allow for three radial functions

^TT = exp • rP’(r) j , wo = ■ (89) 

Substituting these ansdtze into the action described in section 3.3 yields the classical mass, 

E = 47rydr[|(FV + 2sin2F)-^(a;'®-bmy) + ^[(?'2 + |^(G-b2)2] 

-t-~(l + G - cosF)^ + —F'wsm^F — ^^G'uisinF 
9^ 9 9 

+'^F'u}G{G + 2) + i (72 + Tsl-P'^ri - 2{G + l)cosF -I- cos'^F] 

9 9 

+(1 - cosF){45V® -h 2(2^' - ^)iF'^r^ + 2sm^F) 

[wV® - 2(G + 1 - cosF)2 - 4(1 -(- cosF)(l + G- cosF)] }] . (90) 

Application of the variational principle to this functional yields second order coupled non-linear
differential equations for the radial functions $F(r)$, $\omega(r)$ and $G(r)$. The boundary conditions for the
chiral angle, $F(0) = \pi$ and $F(\infty) = 0$, which correspond to unit baryon number, also determine the
boundary conditions of the vector meson profiles via the differential equations and the requirement
of finite energy. For example we find $G(0) = -2$. A typical set of resulting profile functions is
shown in figure 3.



Figure 3: The profile functions which minimize the classical vector meson energy functional (90)
for the parameters $g = 5.85$, $h = 0.4$, $g_{VV\phi} = 1.9$ and $\kappa = 1.0$. $\omega(r)$ is measured in units of
$m_V = \sqrt{2}\,g f_\pi$.


$^9$The grand spin $J + I$ is the characteristic invariance of the hedgehog ansatz.





As in the pseudoscalar model we have to generate states with good spin and isospin from this 
classical field configuration. To start with, one introduces collective coordinates (9) for all fields 
which have non-vanishing spin or isospin. However, an additional complication arises because 
there are vector meson field components which vanish classically but get excited by the collective 
rotation. In the two flavor case the appropriate ansatz for these excitations reads 

Po = [^ 1 + S 2 (r)(r • fi)r] A^t) , uji = , (91) 

where the angular velocity of the rotating soliton, $\Omega$, is defined in (11). The three radial functions
$\xi_1$, $\xi_2$ and $\Phi$ are not the only ones which get excited. As these radial functions are non-zero
they provide sources for the non-strange component of the iso-singlet pseudoscalar field via the
$\epsilon$-terms (43). In the two flavor formulation the appropriate ansatz which takes into account the
pseudoscalar nature of the $\eta$ field reads

U{r^t) = A{t)UQ{r)A^ (t) with rjTir) = ■^r){r)r • f2 . (92) 

/tt 

As we will observe shortly, the excitation of this $\eta$ field plays a decisive role in the context of the
proton spin puzzle. The additional radial functions are determined from extremizing the moment 
of inertia for rotations in coordinate space, 

~ T / [(6 + 6 )^ + 2(6 - 1 + cosF)^] 

[mi + 2^2 + ^2^ + 2(G" + 2G + 2)^1 + AG^ei + 66 - 26 - 6 + 1)] 

+-'yi(j>F'sm^F + -"izcfiF' [(G - 6)(1 “ cosF) H- (1 - cosF)^ - G 6 ] 

9 9 

{«i'sinF(G - 6 + 2 - 2 cosF) + (?isinF (6 - G') 

+ 0 F' [2 + 2 sin 2 F + (6 - G - 2 )cosF - 2(6 + 6 )] } 

+ 2 ??^ + rnlr^rf] A ^ [7?((^' - w<i>') - n' 

[» 7'(6 + 6 )sin"F + 2riF'{G + 6 )smF] - ^-r}'{G + 1 - cosF)2(6 + 6 ) 

gjir 

W [{G + 6 )G + (6 + 6)[(1 - cosFf - 2 GcosF]] + r?(G 6 - G' 6 )} | , (93) 

together with suitable boundary conditions. In eq (93) we have not displayed the explicit contri- 
butions from the symmetry breakers (which are in fact small). We will mostly limit the present 
discussion to the two flavor case. In the case of three flavor vector meson models the situation is 
even more complicated as also K* type fields get excited. Also there will be additional symmetry 
breakers on the level of the collective Lagrangian which are of the form stem 

from terms which are linear in the time derivative. They can straightforwardly be implemented in 
the collective quantization approach. Here we will omit details but rather refer the reader to the 
literature [67, 40]. The general pattern for computing baryon properties is essentially the same as 
that discussed for the Lagrangian of only pseudoscalars in section 4. 

5.2 Axial singlet current and proton spin puzzle 

Notice that in (93) we included by hand a mass term for the rotationally excited profile $\eta(r)$
of a pseudoscalar isosinglet field. Actually the existence of such a term has not yet been justified. 
Before proceeding we must do so since the term turns out to be very important. 





In section 3.1 we mentioned that the QCD axial singlet current

$$J_\mu^5 = \bar{u}\gamma_\mu\gamma_5 u + \bar{d}\gamma_\mu\gamma_5 d + \bar{s}\gamma_\mu\gamma_5 s , \qquad (94)$$

is not conserved even for zero quark masses: $\partial^\mu J_\mu^5 = G$, where the $U_A(1)$ anomaly $G$ is proportional
to the product of the QCD field strength tensor and its dual. In order to mock up this
non-conservation equation at the effective Lagrangian level [69] we may add the terms


z _o iG ^ 


( detU\ 

UetC/tj ’ 


(95) 


where $G$ is now considered a composite glueball field which “dominates” the $U_A(1)$ anomaly.
Here we assumed three light flavors and also that the strong CP violation parameter $\theta$ is zero.
Furthermore it is necessary that, except for the terms representing quark mass symmetry breaking,
all the other terms in the effective Lagrangian be invariant under $U_A(1)$. The parameter $c$ above
is determined by


m 


2 ^ 
rjQ ^ 



(96) 


in the approximation where the quark mass terms are neglected. $\eta_0$ is the $SU(3)$ singlet pseudoscalar
field as in (45). This equation arises after noting that $G$ is like an auxiliary field and may
be integrated out: G = Vo/i^^fn) • In the effective Lagrangian the realization of the axial singlet 
current, obtained by a Noether variation, is 

$$J_\mu^5 = \sqrt{6}\,f_\pi\,\partial_\mu\eta_0 + \tilde{J}_\mu^5 . \qquad (97)$$

Here the first term is the contribution from the pseudoscalar field and the second term is due to
the addition of vector fields. $\tilde{J}_\mu^5$ has a complicated structure but, in particular, contains non-derivative
terms like $\epsilon_{\mu\nu\alpha\beta}\,\mathrm{tr}(\rho^\nu\rho^\alpha\rho^\beta)$. Using this decomposition we may write the equation of
motion for the $\eta_0$ field as

(a^ + m,|)% = ^a'^42- (98) 

which shows that the vector meson contribution to the axial singlet current may act as a source
for a non-trivial excitation associated with the $\eta_0$ field in the soliton sector.

Now the form factors for the proton matrix elements of the axial singlet current are obviously
just the sums of the three separate form factors introduced in (76):

$$H(q^2) = H_u(q^2) + H_d(q^2) + H_s(q^2) \quad \mathrm{and} \quad \tilde{H}(q^2) = \tilde{H}_u(q^2) + \tilde{H}_d(q^2) + \tilde{H}_s(q^2) . \qquad (99)$$

If the vector mesons are not present, eq (97) shows that the operator for the axial singlet current
must be (even in the soliton sector) a pure derivative. This means that, regardless of the details
of the calculation, the matrix element for the sum of the three terms in (76) must be proportional
to the momentum transfer $q_\mu$. Thus $\tilde{H}(q^2)$ is non-zero and $H(q^2) = 0$. From the theory of Dirac
particles we recognize that the quantity $H(0)$ has the interpretation of twice the quark spin part
of the proton’s angular momentum. We see that the Skyrme model of only pseudoscalars predicts
that the expectation value of the net quark spin operator vanishes; the total angular momentum
(1/2) of the proton must involve, at a fundamental level, the rotational and gluonic pieces! Note
that the above argument for $H(0) = 0$ with the Lagrangian of only pseudoscalars continues to hold
even if symmetry breaking contributions are taken into account [58].

The situation is a little different when vector mesons are included in the effective Lagrangian.
Since the vector meson part of the current has pieces which are not pure derivatives it then is
possible to obtain $H(0) \neq 0$. A
convenient parameterization for this calculation in the effective Lagrangian model is


( 100 ) 





Once all the radial functions have been determined as before from the appropriate variational 
principles, it is straightforward to compute $H(0)$ from eq (100). One only has to recall that under
the collective coordinate quantization the angular momentum operator is given by J = The 

numerical results for a variety of allowed parameters (cf. discussion after eq (43)) are displayed in
table 4. Surprisingly the predictions of the vector meson model for $H(0)$ are very robust against
possible changes of the parameters of the model.


Table 4: Predictions for the matrix element of the proton axial singlet current for various allowed
sets of parameters in the vector meson model. Results are taken from ref [70].

$h$             0.4    0.4    0.4    0.7    0.5    0.2    0.1
$g_{VV\phi}$    1.9    1.9    1.9    2.2    2.0    1.7    1.5
$\kappa$        0.0    0.5    1.0    0.0    0.0    0.0    0.0
$H(0)$          0.34   0.33   0.30   0.29   0.34   0.32   0.28


Even though we get a non-zero value for $H(0)$ in the vector meson model it is still small
compared to $H(0) = 1$, the expectation from the simple non-relativistic quark model. Qualitatively
the soliton model results with and without vectors are similar. Since one has a natural prejudice 
that the quark model results should be roughly correct, this would at first seem to be a serious 
defect of the soliton approach to nucleon properties. 

Of course one can only make an accurate judgment on the matter by appealing to experiment.
$H(0)$ can be found from eq (99) if we can experimentally obtain separately $H_u(0)$, $H_d(0)$ and $H_s(0)$.
The linear combination

$$H_u(0) - H_d(0) = g_A = 1.257 , \qquad (101)$$

is reliably obtained by an isotopic spin rotation of the axial form factor describing neutron beta-decay.
Similarly the estimate for the “eighth” octet component

$$H_u(0) + H_d(0) - 2H_s(0) \approx 0.575 \pm 0.016 , \qquad (102)$$

may be obtained from an $SU(3)$ flavor rotation of the data on hyperon beta-decay experiments.
Clearly one more linear combination is needed in order to disentangle the individual $H_i(0)$ and
that situation existed for many years. About ten years ago different experimental groups (EMC,
SLAC, SMC) used polarized lepton beams to probe the structure of nucleons. The deep inelastic
scattering data [55] were used to extract the combination

$$4H_u(0) + H_d(0) + H_s(0) , \qquad (103)$$

in which the axial current form factor for each quark is weighted proportionally to the square of
the quark electric charge. Combining these data resulted in $H(0) \approx 0.3$ [71]. The experimental
results were later on confirmed [72, 73, 74]. However, it turns out that the theoretical extraction
of $H(0)$ is quite complicated as it involves a careful treatment of perturbative QCD corrections.
The value [75]

$$H(0) = 0.27 \pm 0.04 \qquad (104)$$

is nowadays considered correct. At the time this low value was considered hard to understand and 
the situation was called the proton spin puzzle. We have just seen that the soliton approach does 
however provide a simple explanation of such a low value. 
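To make the disentangling of the individual quark contributions concrete, the sketch below solves the three linear relations for $H_u(0)$, $H_d(0)$ and $H_s(0)$, using only the central values quoted in (101), (102) and (104) together with the sum rule (99). Experimental uncertainties and the perturbative QCD corrections mentioned above are ignored, and the function name is hypothetical.

```python
def solve_quark_spin(isovector=1.257, octet=0.575, singlet=0.27):
    """Return (H_u, H_d, H_s) at zero momentum transfer from central values.

    The three inputs are the linear combinations
        H_u - H_d           = isovector   (eq 101)
        H_u + H_d - 2 H_s   = octet       (eq 102)
        H_u + H_d + H_s     = singlet     (eq 99 evaluated with eq 104)
    """
    # Subtracting the octet relation from the singlet one isolates 3 H_s.
    h_s = (singlet - octet) / 3.0
    h_u_plus_h_d = singlet - h_s
    h_u = 0.5 * (h_u_plus_h_d + isovector)
    h_d = 0.5 * (h_u_plus_h_d - isovector)
    return h_u, h_d, h_s


if __name__ == "__main__":
    h_u, h_d, h_s = solve_quark_spin()
    print("H_u(0) =", round(h_u, 3))
    print("H_d(0) =", round(h_d, 3))
    print("H_s(0) =", round(h_s, 3))
    # Charge-squared weighted combination of eq (103).
    print("4 H_u + H_d + H_s =", round(4.0 * h_u + h_d + h_s, 3))
```

With these central values the strange contribution comes out negative and of the order of a tenth, while the up and down contributions largely cancel in the sum.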

Clearly the prediction of the vector meson treatment displayed in table 4, yielding $H(0)$
about 0.30, is in good agreement with the data. From this we learn two things. First, the simplest
quark model does not give a good description of the spin structure of the nucleon. Second, the
soliton approach based on an effective Lagrangian including vector mesons markedly improves
the qualitatively reasonable predictions of the soliton treatment based on a pseudoscalars only
effective Lagrangian. A physical interpretation of the latter statement is that the pseudoscalars
only Lagrangian mainly probes the “pion cloud” of the nucleon while the vector Lagrangian probes
a little more deeply.

For completeness we remark on a possible caveat. The estimate of (102) is based on the use
of exact $SU(3)$ symmetry. However in figure 2 of section 4.4 we showed that precisely this current
matrix element is expected to exhibit stronger suppression than others due to $SU(3)$ symmetry
breaking. Nevertheless it turns out that [70] the numerical evaluation of $H(0)$ is not very sensitive
to this feature. This is to be contrasted with the behavior of $H_s(0)$, which decreases rapidly with
symmetry breaking, cf. section 4.4.

5.3 Other improvements with vector mesons

The famous problem of explaining the neutron-proton mass difference is another one which
requires the addition of vector mesons to the effective Lagrangian in order to obtain a satisfactory
solution in the nucleon-as-soliton picture. It is known that the electromagnetic (i.e. one photon
loop) contribution has the wrong sign. After correcting for the electromagnetic interaction the
remaining “strong” part of the neutron-proton mass-difference should be $(M_n - M_p)_{\mathrm{strong}} \approx
(2.0 \pm 0.3)$ MeV [76]. At the quark level this arises from the down quark-up quark mass difference,
controlled by the parameter $y$ in eqs (20) and (21). Information on $y$ can be most easily gained by
analyzing the mass-difference, yielding $y \approx (-0.4 \ldots -0.2)$ [33]. To understand the problem
it is helpful to consider the contribution of the (presumably dominating) $\delta'$-type symmetry breaker
to the neutron-proton mass-difference. Since the d-u quark mass difference clearly exists with only
two flavors it is interesting to first consider the problem at this level. Then the relevant piece of
the $\delta'$ term is proportional to

$$\mathrm{tr}\left[\tau_3\left(U + U^\dagger\right)\right] . \qquad (105)$$

Using the ansatz (92) we see that $U = \exp(i\eta_T)\left[\cos\psi + i\hat{n}\cdot\tau\,\sin\psi\right]$, where $\psi$ is some angle.
Then (105) is proportional to $\sin(\eta_T)$. In other words the contribution vanishes unless the field
$\eta_T$ gets excited due to the collective rotation (or any kind of symmetry breaking). Now (98)
together with (97) shows that this will not happen if only pseudoscalars are present in the effective
Lagrangian; the vector meson contribution must also be present. This is analogous to the
situation concerning the proton spin puzzle. The contribution of the $\delta'$ term turns out to be

(M„ - Mp)strong = j smF{r)r]{r) + ... . (106) 

Using the full two-flavor vector meson result for $\eta(r)$ which was already employed to compute $H(0)$
yields [77]

$$(M_n - M_p)_{\mathrm{strong}} \approx 1.4\ \mathrm{MeV} \qquad (107)$$

which, not surprisingly, turns out to be about as robust against changes of the parameters as is
$H(0)$. This prediction is still somewhat too small when compared to the empirical value. However,
it turns out that the missing 0.5 MeV can be attributed to three flavor effects as matrix elements
of $D_{38}$ are non-vanishing$^{10}$.

The addition of vector mesons also plays an important role in the discussion of the “sizes” of the
nucleons: the nucleon radii. As can be observed from table 2 the Skyrme model of pseudoscalars
only seriously underestimates the empirical values for the baryon radii. The presence of the $\omega$
meson provides an increase of the isoscalar radius [52]

(r^)/=o « (r^)s + A , (108) 


$^{10}$It should be remarked that $\langle p|D_{38}|p\rangle$ quickly approaches zero as $SU(3)$ symmetry breaking is increased. This
decreases $SU(3)$ type contributions to (107).





where $\langle r^2\rangle_B$ is the radius associated with baryon number current (3). The additional piece in eq
(108) is a consequence of (approximate) vector meson dominance in this model [78], which indeed
is observed when including the vector mesons in a chirally invariant manner. As can be seen from
table 2 this increase of about $0.35\,\mathrm{fm}^2$ will significantly improve the predictions for the radii.

A similar interesting improvement due to vector mesons is obtained in the context of meson- 
baryon scattering. In these investigations one introduces small fluctuations off the classical soliton. 
Eventually these fluctuations are quantized to represent in- and out-going meson fields, thereby 
determining the scattering matrix [79]. It turns out that in the pseudoscalar Skyrme model the 
phase-shifts extracted from this scattering matrix rise almost linearly with the momentum of the 
in-going pion. This undesired feature is mostly due to the contact interaction between pions 
contained in the Skyrme model Lagrangian (cf. section 2). When introducing vector mesons this
contact interaction is essentially replaced by the exchange of such a vector meson. 



As this interaction decreases for large momentum transfers, the resulting phase-shifts assume 
a constant value for large energies rather than rising linearly [80]. Clearly this effect is similar to 
the one observed when going from the Fermi to the standard model of electro-weak interactions. 

These examples show that while the inclusion of vector meson degrees of freedom involves quite 
a few technical details it clearly provides a more realistic picture of the nucleon as a chiral soliton. 


6 Summary and discussion 

Aside from the mass spectra and current matrix elements of the low-lying $\frac{1}{2}^+$ and $\frac{3}{2}^+$ baryons
treated here the soliton approach has been extensively employed to study meson nucleon scatter- 
ing [79, 80], baryons containing a heavy quark [81], nucleon-nucleon scattering [82], few nucleon 
systems [83] and nuclear matter [84]. 

In the present survey, we started out with a historical introduction (section 1) and a concise 
technical summary of the original two flavor Skyrme model (section 2). In these sections the 
physical interpretation and justification of the model were emphasized; it is hoped that they will 
be useful to beginners in this area of research (see also [85] and [86]). 

We next attempted to develop the generalization of the original Skyrme model which is sug- 
gested by the large Nc approximation to QCD. In this approach the Skyrme Lagrangian is to be 
replaced by a more general effective Lagrangian containing mesons of all spins. Perhaps some day 
an analytic expression in this framework will be found. At present it seems necessary to obtain 
an approximation based on including the lowest energy resonances and constraining the model by 
the symmetries of the underlying QCD. The concept of chiral symmetry which plays a crucial role 
in this extension was explored in section 3. Furthermore the original Skyrme model of two light 
flavors was extended to three flavors, as it is now well established that the nucleons belong to a 
flavor SU(3) multiplet. 

It is worthwhile to stress that once the effective Lagrangian has been determined from the
meson sector, the soliton approach provides in principle a zero parameter description of baryon
properties. (In our case we introduced just one parameter which had to be fit from the baryon
sector.)

In section 4 we studied the technical tools needed to treat the flavor SU(3) symmetry and its 
breaking at the (collective) baryon level. These were applied to the calculation of various interesting 
baryon matrix elements. Finally section 5 sketched the treatment of baryons based on an effective 
Lagrangian which also included the vector mesons. An application to the so-called proton spin
puzzle demonstrated that the soliton approach seems to give a neat description of otherwise
hard-to-explain experimental results on the quark spin structure of the nucleon. The improvements one
encounters on including the vector mesons are in accord with the intuitive notion that the addition 
of higher mass resonances in the meson sector leads to a progressively more detailed understanding 
of the short distance structure of the nucleon-as-soliton. 





We are happy to acknowledge the stimulating interactions we have had with many collaborators 
and colleagues while doing research related to the topics reviewed here. 


References 

[1] The early paper in the series which Skyrme himself evidently believed to be the key one is 
T. H. R. Skyrme, Proc. Roy. Soc. (London) A260, 127 (1961). Some additional properties of 
the nucleons were discussed in T. H. R. Skyrme Nucl. Phys. 31, 556 (1961). A late paper in 
the series is T. H. R. Skyrme, J. Math. Phys. 12, 1735 (1971). 

[2] A nice pedagogical treatment is provided by the text book R. Rajaraman, Solitons and In- 
stantons, (1982), North-Holland. 

[3] J. G. Williams, J. Math. Phys. 11, 2611 (1970). 

[4] N. K. Pak and H. C. Tze, Ann. Phys. (N.Y.) 117, 164 (1979). 

[5] A. P. Balachandran, V. P. Nair, S. G. Rajeev and A. Stern, Phys. Rev. D 27, 1153 (1983). 

[6] G. S. Adkins, C. R. Nappi and E. Witten, Nucl. Phys. B228, 552 (1983). 

[7] M. Gell-Mann and Y. Ne’eman, The Eightfold Way, Benjamin, New York, 1964. 

[8] E. C. G. Sudarshan and R. E. Marshak, Proceedings of the Padua conference on mesons and 
recently discovered particles, p. V-14 (1957). See also R. P. Feynman and M. Gell-Mann, 
Phys. Rev. 109, 193 (1958) and J. J. Sakurai, Nuovo Cimento 7, 649 (1958). 

[9] Y. Nambu and G. Jona-Lasinio, Phys. Rev. 122, 345 (1961); 124, 246 (1961). 

[10] M. Gell-Mann and M. Levy, Nuovo Cimento 16, 705 (1960). See also K. Nishijima, Nuovo 
Cimento 11, 698(1959) and F. Gursey, Nuovo Cimento 16, 230 (1960). 

[11] J. Cronin, Phys. Rev. 161, 1483 (1967); S. Weinberg, Phys. Rev. Lett. 18, 188 (1967). 

[12] S. Weinberg, Physica A96, 327 (1979); J. Gasser and H. Leutwyler, Ann. Phys. (N.Y.) 158,
142 (1984), Nucl. Phys. B250, 465 (1985).

[13] Older work on chiral dynamics with baryons is summarized in S. Gasiorowicz and D. A. Geffen,
Rev. Mod. Phys. 41, 531 (1969). A recent revival is summarized in V. Bernard, N. Kaiser,
Ulf-G. Meißner, Int. J. Mod. Phys. E4, 193 (1995).

[14] G. 't Hooft, Nucl. Phys. B72, 461 (1974); B75, 461 (1975).

[15] E. Witten, Nucl. Phys. B160, 57 (1979). 

[16] T. H. R. Skyrme, Int. J. Mod. Phys. A3, 2745 (1988). This talk was reconstructed by I. 
Aitchison. 

[17] R. H. Dalitz, Int. J. Mod. Phys. A3, 2719 (1988). 

[18] G. H. Derrick, J. Math. Phys. 5, 1252 (1964). 

[19] This is treated in the book W. Pauli, Meson Theory of Nuclear Forces, Interscience Publishers. 
Inc., New York, 1946. 

[20] B. Moussallam and D. Kalafatis, Phys. Lett. B272, 196 (1991); G. Holzwarth, Phys. Lett.
B291, 218 (1992); B. Moussallam, Ann. Phys. (N.Y.) 225, 264 (1993); G. Holzwarth, Nucl.
Phys. A572, 69 (1994); H. Weigel, R. Alkofer and H. Reinhardt, Nucl. Phys. A582, 484
(1995); F. Meier and H. Walliser, Phys. Rep. 289, 383 (1997).





[21] R. Dashen, E. Jenkins and A. V. Manohar, Phys. Rev. D49, 4713 (1994). 

[22] N. Dorey, J. Hughes and M. Mattis, Phys. Rev. D50, 5816 (1994). 

[23] M. Bander and F. Hayot, Phys. Rev. D30, 1837 (1984); E. Braaten and J. P. Ralston, Phys. 
Rev. D31, 598 (1985). 

[24] C. Caso et al, (Particle Data Group), Eur. Phys. J. C3, 1 (1998). 

[25] For a review see M. Neubert, Phys Rep. 245, 259 (1994). 

[26] M. Gell-Mann, Phys. Rev. 125, 1067 (1962); M. Gell-Mann, R. Oakes and B. Renner, ibid 
175, 2195 (1968). 

[27] S. L. Adler, Phys. Rev. 177 (1969) 2426; J. S. Bell and R. Jackiw, Nuov. Cim. 60A (1969) 
47. 

[28] W. Bardeen, Phys. Rev. 184, 1848 (1969); B. Zumino, Wu Yong-Shi and A. Zee, Univ. of 
Washington preprint 4048-18-P3 (May 1983). 

[29] D. Ebert and H. Reinhardt, Nucl. Phys. B271, 188 (1986). 

[30] E. Witten, Nucl. Phys. B223 (1983) 422, 433. 

[31] J. Wess and B. Zumino, Phys. Lett. 37B, 95 (1971). 

[32] Ö. Kaymakcalan, S. Rajeev and J. Schechter, Phys. Rev. D30, 594 (1984); Ö. Kaymakcalan
and J. Schechter, Phys. Rev. D31, 1109 (1985).

[33] J. Schechter, A. Subbaraman and H. Weigel, Phys. Rev. D48, 339 (1993); M. Harada and
J. Schechter, Phys. Rev. D54, 3394 (1996).

[34] C. Callan, S. Coleman, J. Wess and B. Zumino, Phys. Rev. 177, 2247 (1969).

[35] P. Jain, R. Johnson, Ulf-G. Meißner, N. W. Park and J. Schechter, Phys. Rev. D37, 3252
(1988).

[36] Ulf-G. Meifiner, N. Kaiser, H. Weigel and J. Schechter, Phys. Rev. D39, 1956 (1989). 

[37] M. Bando, T. Kugo and K. Yamawaki, Phys. Rep. 164, 217 (1988). 

[38] J. Carlson, Nucl. Phys. B253, 149 (1985); B277, 253 (1986); P. Jain, J. Schechter and 
R. Sorkin, Phys. Rev. D39, 998 (1989); D41, 3855 (1990). 

[39] A. P. Balachandran, F. Lizzi, V. Rodgers and A. Stern, Nucl. Phys. B256, 525 (1985). 

[40] H. Weigel, Int. J. Mod. Phys. A11, 2419 (1996). 

[41] A. V. Manohar, Nucl. Phys. B248, 19 (1984). 

[42] N. W. Park, J. Schechter and H. Weigel, Phys. Lett. B224, 171 (1989). 

[43] J. de Swart, Rev. Mod. Phys. 35, 916 (1963). 

[44] H. Yabu and K. Ando, Nucl. Phys. B301, 601 (1988). 

[45] H. Weigel, J. Schechter, N. W. Park and Ulf-G. Meißner, Phys. Rev. D42, 3177 (1990). 

[46] N. W. Park, J. Schechter and H. Weigel, Phys. Rev. D43, 869 (1991). 

[47] S. Okubo, Prog. Theor. Phys. 27, 949 (1962). 

[48] E. Guadagnini, Nucl. Phys. B236, 15 (1984). 





Schechter and Weigel 


[49] M. Praszałowicz, Phys. Lett. 158B, 264 (1983). 

[50] M. Chemtob, Nucl. Phys. B256, 600 (1985). 

[51] E. Braaten, S.-M. Tse and C. Willcox, Phys. Rev. D34, 1482 (1986). 

[52] Ulf-G. Meißner, N. Kaiser and W. Weise, Nucl. Phys. A466, 685 (1987); Ulf-G. Meißner, 
Phys. Rep. 161, 213 (1988). 

[53] G. S. Adkins and C. R. Nappi, Nucl. Phys. B249, 507 (1985). 

[54] B. Schwesinger and H. Weigel, Nucl. Phys. A540, 461 (1992). 

[55] J. Ashman et al, Phys. Lett. B206, 364 (1988), Nucl. Phys. B328, 1 (1989). 

[56] N. Cabibbo, Phys. Rev. Lett. 10, 531 (1963). 

[57] N. W. Park, J. Schechter and H. Weigel, Phys. Rev. D41, 2836 (1990). 

[58] N. W. Park, J. Schechter and H. Weigel, Phys. Lett. B228, 420 (1989). 

[59] M. Ademollo and R. Gatto, Phys. Rev. Lett. 13, 264 (1964). 

[60] J. Donoghue and C. R. Nappi, Phys. Lett. 168B, 105 (1986). 

[61] H. Yabu, Phys. Lett. B218, 124 (1989). 

[62] B. Muller et al. Phys. Rev. Lett. 78, 3824 (1997); K. Aniol et al. Phys. Rev. Lett. 82, 1096 
(1999). 

[63] R. L. Jaffe, Phys. Lett. B229, 275 (1989). 

[64] G. Hohler et al, Nucl. Phys. B114, 505 (1974). 

[65] M. J. Musolf and T. W. Donnelly, Z. Phys. C57, 559 (1993). 

[66] H. Forkel, M. Nielsen, X.-M. Jin and T. Cohen, Phys. Rev. C50, 3108 (1994). 

[67] N. W. Park and H. Weigel, Nucl. Phys. A541, 453 (1992). 

[68] H. Weigel, A. Abada, R. Alkofer and H. Reinhardt, Phys. Lett. B353, 20 (1995). 

[69] C. Rosenzweig, J. Schechter and G. Trahern, Phys. Rev. D21, 3388 (1980); P. Di Vecchia and 
G. Veneziano, Nucl. Phys. B171, 253 (1980); P. Nath and R. Arnowitt, Phys. Rev. D23, 1789 
(1981); E. Witten, Nucl. Phys. B156, 269 (1979); A. Aurilia, Y. Takahashi and D. Townsend, 
Phys. Lett. 95B, 265 (1980); K. Kawarabayashi and N. Ohta, Nucl. Phys. B175, 477 (1980). 

[70] R. Johnson, N. W. Park, J. Schechter, V. Soni and H. Weigel, Phys. Rev. D42, 2998 (1990). 

[71] S. Brodsky, J. Ellis and M. Karliner, Phys. Lett. B206, 309 (1988); J. Ellis and M. Karliner, 
Phys. Lett. B213, 73 (1988). 

[72] D. Adams et al, Phys. Rev. D56 (1997) 5330. B. Adeva et al, Phys. Lett. B412 (1997) 414. 

[73] K. Abe et al, Phys. Rev. Lett. 76 (1996) 587. 

[74] K. Abe et al Phys. Rev. D58 (1998) 112003. 

[75] J. Ellis and M. Karliner, The Strange Spin of the Nucleon, hep-ph/9601280. 

[76] J. Gasser and H. Leutwyler, Ann. Phys. (NY) 158, 142 (1984). 

[77] P. Jain, R. Johnson, N. W. Park, J. Schechter and H. Weigel, Phys. Rev. D40, 855 (1989). 





[78] J. Schechter, Phys. Rev. D34, 868 (1986). 

[79] H. Walliser and G. Eckart, Nucl. Phys. A429, 514 (1984); A. Hayashi, G. Eckart, G. Holzwarth 
and H. Walliser, Phys. Lett. B147, 5 (1984); M. P. Mattis and M. Karliner, Phys. Rev. D31, 
2833 (1985). 

[80] B. Schwesinger and H. Weigel, Nucl. Phys. A465, 733 (1987); B. Schwesinger, H. Weigel, 
G. Holzwarth and A. Hayashi, Phys. Rep. 173, 173 (1989). 

[81] C. G. Callan and I. Klebanov, Nucl. Phys. B262, 365 (1985). A compilation of references on 
heavy quark solitons may be found in M. Harada, F. Sannino, J. Schechter and H. Weigel, 
Phys. Rev. D56, 4098 (1997). 

[82] A. Jackson, A. D. Jackson, A. S. Goldhaber, G. E. Brown, and L. C. Castillejo, Phys. Lett. 
154B, 101 (1985); A. Jackson, A. D. Jackson and V. Pasquier, Nucl. Phys. A432, 567 
(1985); R. Vinh Mau, M. Lacombe, B. Loiseau, W. N. Cottingham and P. Lisboa, Phys. Lett. 
150B, 259 (1985); H. Yabu and K. Ando, Prog. Theor. Phys. 74, 750 (1985); H. Yabu, 
B. Schwesinger and G. Holzwarth, Phys. Lett. B224, 25 (1989). 

[83] H. Weigel, B. Schwesinger and G. Holzwarth, Phys. Lett. B168, 321 (1986); V. B. Kopeliovich 
and B. E. Stern, JETP Lett. 45, 203 (1987); E. Braaten, L. Carson and S. Townsend, Phys. 
Lett. B235, 147 (1990); W. Y. Crutchfield, N. J. Snyderman and V. R. Brown, Phys. Rev. Lett. 
68, 1660 (1992); T. Waindzoch and J. Wambach, Nucl. Phys. A602, 347 (1996); N. Walet, 
Nucl. Phys. A606, 429 (1996). 

[84] A. D. Jackson and J. J. M. Verbaarschot, Nucl. Phys. A484, 419 (1988); H. Forkel et al, 
Nucl. Phys. A504, 818 (1989). 

[85] G. Holzwarth and B. Schwesinger, Rep. Prog. Phys. 49, 825 (1986). 

[86] I. Zahed and G. E. Brown, Phys. Rep. 142, 481 (1986). 




Part C : Formal Methods In QFT 

14. Euclidean Methods In Quantum Field Theory by R.Ramanathan 

15. Topics In Finite Temperature Field Theory by Ashok Das 

16. Integrable Models And The Toda Lattice Hierarchy by B.M.Sodermark 

17. Perspectives Of Light-Front Quantized Field Theory - Some New Results by Prem P Srivastava 

18. Gauge Symmetry In Chiral Electrodynamics by D.S.Kulshreshtha 

19. Towards A Unified Description Of The Four Interactions In Terms Of Dirac-Bergmann 
Observables by L.Lusanna 




14. Euclidean Methods In Quantum Field Theory 


R. Ramanathan 

Department of Physics and Astrophysics 

University of Delhi, Delhi - 110007 (India) 


Abstract 

In this article, some results from the rich and interesting formulation of quantum field 
theory in Euclidean space-time are presented with a view to bringing out the basic ideas of the 
Euclidean formulation. In particular, its rather surprising relationship with the "Stochastic 
Mechanics" of Nelson, and the fundamental "Feynman-Kac" formulae which relate the Green's 
functions to the corresponding functional integrals, will be emphasized. The last section of 
this article briefly reviews a powerful computational programme of Parisi and Wu which uses 
the Euclidean and stochastic processes as a purely auxiliary tool in quantum field theory. 


1 Introduction 

Quantum field theory in Minkowski space is beset with problems of positivity and finiteness of the 
norms involved in various computations of physical quantities, and ingenious artifices are required 
to overcome these pathologies. One of the many "recipes" for avoiding such problems is the 
replacement of the usual description of quantum fields in Minkowski space-time with a description 
in some auxiliary Euclidean space. The resulting theory is often called Euclidean quantum 
field theory in the literature. 

In the early fifties Wick [1] used a Euclidean method while dealing with the Bethe-Salpeter 
equation. Schwinger [2] was one of the first to point out the feasibility of a Euclidean formulation 
in a more comprehensive way, as a "possible arena for the future development of quantum field theory" 
[3]. The correspondence between the Q.F.T. formulated in Minkowski space and that in Euclidean 
space is achieved by analytically continuing the vacuum expectation value of a product of quantum 
fields from points of Minkowski space to points of complex space-time where the space coordinates 
are real and the time coordinate is purely imaginary (Schwinger points). The possibility of this 
analytic continuation is based on the work of Wightman and followers [4] on the general structure 
of quantum field theory, dictated by physical principles. The basic covariance group of the 
Euclidean theory is the inhomogeneous Euclidean group (rotations and translations), unlike the 
Poincaré group (rotations, boosts and translations) of the Minkowski theory. The structural 
simplification in going from the Minkowski to the Euclidean framework comes from the replacement 
of the indefinite metric of the Lorentz group (with the related hyperbolic problem) by the definite 
Euclidean metric (with the related elliptic problem). The resulting Euclidean theory is described 
by the so-called Schwinger functions, which are essentially Wightman functions evaluated at 
Schwinger points. Symanzik [5] realised that Euclidean fields have a well defined dynamical 
structure, and showed that for the case of bosons they are naturally defined as commutative random 
variables, while their dynamical structure shares a strong analogy with classical statistical 
mechanics. The mathematical foundation of Euclidean field theory was firmly laid by Nelson [6], 
who isolated the crucial Markov property as peculiar to the Euclidean field (a Markov process is a 
stochastic process whose transition probability densities are independent of the past and future 
histories of the process, and depend solely on the present). He further established a Euclidean 
covariant Feynman-Kac path integral formula and showed the way to reconstruct a Wightman theory 
corresponding to a given Euclidean theory. 

In order to appreciate the basic ideas of Euclidean field theory, let us consider a free scalar 
quantum field $\phi(x)$ of mass $m$, where $x = \{x^\mu\} = (x^0 = t;\ \mathbf{x})$, completely defined 
through its vacuum expectation values (Wightman functions or covariant Green's functions) [4]: 





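$$W(x_1,\ldots,x_{2n}) = \langle 0|\,\phi(x_1)\cdots\phi(x_{2n})\,|0\rangle = \sum_{j=2}^{2n} W(x_1,x_j)\,W(x_2,\ldots,\hat{x}_j,\ldots,x_{2n}) \qquad (1)$$

(a hat marks an omitted argument) 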

with the expectation of an odd number of fields vanishing. The two-point function is given by: 

$$W(x,y) = W(x-y) = (2\pi)^{-3}\int \exp[-i\omega(\mathbf{k})(x_0-y_0)]\,\exp[i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})]\,\frac{d^3k}{2\omega(\mathbf{k})} \qquad (2)$$

where 

$$\omega(\mathbf{k}) = \sqrt{\mathbf{k}^2+m^2} \ge m \qquad (3)$$

If we perform the analytic continuation (Wick rotation) of $x_0, y_0$ to Schwinger points, 

$$x_0 = -i x_4, \quad y_0 = -i y_4, \quad x_4 > y_4 \qquad (4)$$

then we get the Schwinger function 

$$S(x,y) = S(x-y) = (2\pi)^{-3}\int \exp\{-\omega(\mathbf{k})(x_4-y_4)\}\,\exp\{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})\}\,\frac{d^3k}{2\omega(\mathbf{k})} \qquad (5)$$

where $x = (\mathbf{x}, x_4)$ and $y = (\mathbf{y}, y_4)$ are the Euclidean coordinates. 

Using the identity 

$$\frac{1}{2\pi}\int \exp\{ik_4(x_4-y_4)\}\,\bigl(k_4^2+\omega^2(\mathbf{k})\bigr)^{-1}\,dk_4 = \frac{1}{2\omega(\mathbf{k})}\exp\{-\omega(\mathbf{k})\,|x_4-y_4|\} \qquad (6)$$

we can rewrite (5) in the form 

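$$S(x,y) = (2\pi)^{-4}\int \exp\{ik\cdot(x-y)\}\,(k^2+m^2)^{-1}\,d^4k \qquad (7)$$

where 

$$x, y \in \mathbb{R}^4; \quad k\cdot x = k_4x_4 + \mathbf{k}\cdot\mathbf{x}; \quad k^2 = k_4^2 + \mathbf{k}^2; \quad d^4k = dk_4\,d^3k$$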

Clearly (7) is valid for all $x_4, y_4$ and has explicit Euclidean covariance in the 4-dimensional 
Euclidean space. Exploiting the positive definiteness of (7), Euclidean fields are introduced as 
Gaussian random fields with mean zero and covariance 

$$E(\phi(x)\phi(y)) = S(x,y) \qquad (8)$$

We therefore see that Euclidean theory can be described through stochastic (commutative) fields, 
as opposed to the non-commutative Heisenberg fields of the Minkowski theory. 

The above illustrative example for the free scalar field can be extended to both free and inter- 
acting vector and spinor fields as discussed in detail in [3]. 
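
The step from the three-dimensional form (5) to the covariant form (7) rests entirely on the one-dimensional identity (6). As a quick numerical sanity check (a minimal sketch, not part of the original derivation), the left-hand side of (6) can be integrated on a truncated grid and compared with the exponential on the right. The function names, the cutoff and the grid size below are illustrative assumptions.

```python
import numpy as np

def lhs_identity6(tau, omega, cutoff=1000.0, n=1_000_001):
    """(1/2pi) * integral of exp(i k4 tau) / (k4^2 + omega^2) over |k4| <= cutoff.

    The cutoff and grid size are assumptions of this sketch; the neglected
    tail is small for tau != 0 because the integrand oscillates and decays.
    """
    k4 = np.linspace(-cutoff, cutoff, n)
    dk = k4[1] - k4[0]
    # The sine part of exp(i k4 tau) is odd in k4 and integrates to zero.
    f = np.cos(k4 * tau) / (k4**2 + omega**2)
    # Trapezoidal rule written out explicitly.
    integral = (f.sum() - 0.5 * (f[0] + f[-1])) * dk
    return integral / (2.0 * np.pi)

def rhs_identity6(tau, omega):
    """exp(-omega |tau|) / (2 omega), the right-hand side of (6)."""
    return np.exp(-omega * abs(tau)) / (2.0 * omega)

if __name__ == "__main__":
    for tau, omega in [(0.5, 1.0), (-1.5, 2.0)]:
        print(tau, omega, lhs_identity6(tau, omega), rhs_identity6(tau, omega))
```

Here `tau` plays the role of $x_4 - y_4$ and `omega` that of $\omega(\mathbf{k})$ at fixed $\mathbf{k}$; the check works for either sign of `tau`, which is the content of the remark that (7) holds for all $x_4, y_4$.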

2 Nelson’s Stochastic Mechanics 

There is a deep and surprising connection between the Euclidean field theory formulated in 
imaginary time and the stochastic mechanics of Nelson [7], which arose from an interesting 
derivation of the Schroedinger equation from a classical, but stochastic, dynamical law resembling 
Newtonian mechanics. Whatever view one may hold about the foundational aspects of this derivation, 
insofar as it offers an 'objective-realist' basis for quantum mechanics, it is interesting from 
both mathematical and applicational viewpoints. In brief, Nelson's derivation may be summarized as 
follows. The probability density 


$$\rho(x,t) = |\psi(x,t)|^2 \qquad (9)$$





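where $\psi$ is the Schroedinger wave function satisfying 

$$i\hbar\,\partial_t\psi = \Bigl[-\frac{\hbar^2}{2m}\nabla^2 + V\Bigr]\psi \qquad (10)$$

obeys the continuity equation: 

$$\partial_t\rho + \nabla\cdot j = 0 \qquad (11)$$

where the current $j$ is 

$$j = \frac{\hbar}{m}\,\rho\,\nabla(\mathrm{Im}\ln\psi) \qquad (12)$$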

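It is possible to associate a Markov process to $\rho$, described by a transition probability density 
$P(x,t;x',t')$ such that 

$$\rho(x,t) = \int P(x,t;x',t')\,\rho(x',t')\,dx', \qquad t > t' \qquad (13)$$

Then $\rho$ will obey a Kolmogorov-Fokker-Planck equation with diffusion constant $\nu$ and drift $b$, 

$$\partial_t\rho = \frac{\nu}{2}\nabla^2\rho - \nabla\cdot(b\rho) \qquad (14)$$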

Eqs. (12) and (14) are compatible if 

$$\nu = \hbar/m \qquad (15)$$

and 

$$\nabla^2|\psi|^2 = \frac{2m}{\hbar}\,\nabla\cdot\bigl(b\,|\psi|^2 - j\bigr) \qquad (16)$$

One further condition 

$$\nabla\times b = 0 \qquad (17)$$

leads to the solution 

$$b = \frac{\hbar}{m}\,\nabla\bigl[\ln|\psi| + \arg\psi\bigr] \qquad (18)$$

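We thus have two velocities, the osmotic velocity $u$ defined as 

$$u = \frac{\hbar}{m}\,\nabla\ln|\psi| \qquad (19)$$

and the current velocity 

$$v = \frac{\hbar}{m}\,\nabla\arg\psi \qquad (20)$$

so that $b = u + v$. The Schroedinger equation leads to the following equations of motion for $u$ and $v$: 

$$\partial_t u = -\frac{\hbar}{2m}\nabla(\nabla\cdot v) - \nabla(u\cdot v) \qquad (21)$$

$$\partial_t v = -\frac{1}{m}\nabla V + \frac{1}{2}\nabla(u^2 - v^2) + \frac{\hbar}{2m}\nabla^2 u \qquad (22)$$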

An equivalent description of the stochastic process is given by the Langevin equation [8] 

$$dx = b\,dt + \sqrt{\hbar/2m}\;dw \qquad (23)$$

where $dw$ is the increment of the Wiener process described by 

$$\langle dw_i\rangle = 0; \qquad \langle dw_i\,dw_j\rangle = 2\delta_{ij}\,dt \qquad (24)$$

and by the condition that the increments $dw_i$ are Gaussian and independent for different 
times. For the ground state of the Hamiltonian 

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V \qquad (25)$$

the wave function has no nodes and can be chosen real, so we see immediately that $v = 0$; the 
process is stationary (this in fact is true for any eigenstate of the Hamiltonian [8]). 





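As a simple example let us consider the ground state of the harmonic oscillator [9] 

$$\psi_0 = (2\pi\sigma)^{-1/4}\exp(-x^2/4\sigma); \qquad \sigma = \hbar/(2mw) \qquad (26)$$

which leads to the "drift" 

$$b = -(\hbar/2m\sigma)\,x = -wx \qquad (27)$$

The Kolmogorov-Fokker-Planck (K.F.P.) equation becomes 

$$\partial_t\rho = \frac{\hbar}{2m}\frac{\partial^2\rho}{\partial x^2} + w\rho + wx\frac{\partial\rho}{\partial x} \qquad (28)$$

which has a solution 

$$\rho(x,t) = \int \rho(x',0)\,P(x,x';t)\,dx'$$

with the transition probability 

$$P(x,x';t) = (2\pi\sigma(t))^{-1/2}\exp\Bigl[-\frac{(x - x'e^{-wt})^2}{2\sigma(t)}\Bigr] \qquad (29)$$

where 

$$\sigma(t) = \sigma\,(1 - e^{-2wt}) \qquad (30)$$

One thus observes a most surprising result 

$$\langle x(0)\,x(t)\rangle = \sigma e^{-w|t|} \qquad (31)$$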

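which looks very "Euclidean". More generally we observe that the ground state process leads to 
the K.F.P. equation 

$$\partial_t\rho = \frac{\hbar}{2m}\bigl[\nabla^2\rho - \nabla\cdot\bigl((\nabla\ln\rho_0)\,\rho\bigr)\bigr] \qquad (32)$$

which has the stationary solution $\rho = \rho_0 = |\psi_0|^2$. It turns out that quite generally 

$$\langle x(0)\,x(t)\rangle = (\psi_0,\ x\,e^{-|t|H}\,x\,\psi_0) \qquad (33)$$

(with $\hbar = 1$ and the energy measured from the ground state). 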

(33) is a remarkable formula because it links a "real time" object on the left-hand side to an 
"imaginary time" quantity on the right-hand side. 

This general connection was noted by Guerra and Ruggiero [9]: "Euclidean Quantum Mechanics 
(or Field theory) is the ground state process of Stochastic Mechanics". The relation (33) is true 
only because the "real-time" left-hand side is not accessible to measurement. We also note that 
the K.F.P. equation has the formal solution 

$$\rho(t) = e^{-t\tilde{H}}\rho(0) \qquad (34)$$

where $\tilde{H} = -(\hbar/2m)\nabla^2 + \nabla\cdot b$ and $b = (\hbar/m)\nabla\ln|\psi_0|$. 

The above relation suggests a possible application of stochastic mechanics. We first have to 
obtain a trial ground state $\psi_0$ and thence $b$. We then use this to obtain the ground state 
energy levels using Langevin or Monte Carlo methods [10]. 
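
To make the harmonic-oscillator result (31) concrete, the ground state process can be sampled directly. The following is a minimal sketch under stated assumptions: it draws $x(0)$ from the stationary density $|\psi_0|^2$ (a Gaussian of variance $\sigma = \hbar/2mw$) and then uses the Gaussian transition law of the Ornstein-Uhlenbeck process with drift $b = -wx$ (mean $x(0)e^{-wt}$, variance $\sigma(1 - e^{-2wt})$) instead of stepping the Langevin equation. The function name, sample size and seed are illustrative choices, not taken from the text.

```python
import numpy as np

def ground_state_correlation(t, w=1.0, hbar=1.0, m=1.0, n_paths=200_000, seed=0):
    """Monte Carlo estimate of <x(0) x(t)> for the oscillator ground state process.

    Returns (estimate, sigma * exp(-w |t|)) so the two can be compared.
    """
    rng = np.random.default_rng(seed)
    sigma = hbar / (2.0 * m * w)                  # stationary variance of |psi_0|^2
    x0 = rng.normal(0.0, np.sqrt(sigma), n_paths)
    decay = np.exp(-w * abs(t))
    # Exact Gaussian transition of the process with drift b = -w x.
    spread = np.sqrt(sigma * (1.0 - decay**2))
    xt = x0 * decay + spread * rng.normal(size=n_paths)
    return float(np.mean(x0 * xt)), float(sigma * decay)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, ground_state_correlation(t))
```

The estimate agrees with $\sigma e^{-w|t|}$ up to sampling noise of order $\sigma\sqrt{2/N}$ for $N$ sampled paths. For a general trial $\psi_0$ no closed-form transition law is available, and one would instead discretise the Langevin equation (23) with $b = (\hbar/m)\nabla\ln|\psi_0|$.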

3 Euclidean Path Integrals and Feynman-Kac Formulae 

As we saw earlier, the Euclidean approach uses the positivity of the Hamiltonian to achieve the 
'Euclidean' rotation from the time evolution $\exp(-iHt)$ of 'real time' quantum mechanics to 
$\exp(-Ht)$ of the 'imaginary time' Euclidean field theory. In this procedure the stochastic 
processes play a purely auxiliary role, since they do not take place in "real time". When we have 
the Hamiltonian 

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x) \qquad (35)$$





where we set $\hbar = 2m = 1$, the Feynman-Kac formula gives a representation for the integral kernel 
of $e^{-tH}$ (with respect to the Lebesgue measure $dx$) as follows: 

$$e^{-tH}(x,y) = \int dP^{t}_{x,y}(w)\,\exp\Bigl[-\int_0^t V(w(\tau))\,d\tau\Bigr] \qquad (36)$$

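Here $dP^{t}_{x,y}(w)$ is the conditional Wiener measure for Brownian paths starting at $x$ and ending 
at $y$ after time $t$. The most straightforward way to derive (36) is based on the famous "Trotter 
product formula" 

$$e^{-t(H_0+V)} = \lim_{n\to\infty}\bigl(e^{-tH_0/n}\,e^{-tV/n}\bigr)^n, \qquad H_0 = -\nabla^2 \qquad (37)$$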

where the limit holds in the strong sense. The form of the Feynman-Kac formula found most 
frequently gives instead the solution of the imaginary time Schroedinger equation 

$$\partial_t\psi = (\nabla^2 - V)\psi = -H\psi \qquad (38)$$

by 

$$\psi(x,t) = \int dP(w)\,\exp\Bigl[-\int_0^t V(x + w(\tau))\,d\tau\Bigr]\,\psi(x + w(t), 0) \qquad (39)$$


where $dP(w)$ is now the standard Wiener measure for paths starting at the origin. 

It is now easy to extract the Euclidean path integral from the Feynman-Kac formula for the 
Schwinger functions, for instance 


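$$\langle x(t_1)\cdots x(t_n)\rangle = \lim_{T\to\infty}\frac{1}{Z_T}\int dx\int dP^{T}_{x,x}(w)\,\exp\Bigl[-\int_0^T V(w(\tau))\,d\tau\Bigr]\,w(t_1)\cdots w(t_n) \qquad (40)$$

where $T > t_1 > \cdots > t_n > 0$ and 

$$Z_T = \int dx\int dP^{T}_{x,x}(w)\,\exp\Bigl[-\int_0^T V(w(\tau))\,d\tau\Bigr] = \mathrm{Tr}\bigl[e^{-TH}\bigr] \qquad (41)$$

Equation (40) can be recast formally as 

$$\langle x(t_1)\cdots x(t_n)\rangle = \lim_{T\to\infty}\frac{1}{Z_T}\int e^{-S_T}\,x(t_1)\cdots x(t_n)\prod_{t=0}^{T}dx(t) \qquad (42)$$

where $S_T$ is the Euclidean action in time $T$ and 

$$Z_T = \int e^{-S_T}\prod_{t=0}^{T}dx(t)$$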

This is so because the conditional Wiener measure corresponds to the expression 

$$\frac{1}{N_T}\exp\Bigl[-\frac{1}{4}\int_0^T \dot{w}(\tau)^2\,d\tau\Bigr]\prod_{t=0}^{T}dw(t) \qquad (43)$$

It is well known that Gaussian integration techniques facilitate easy evaluation of the integrals 
involved in the formula for the Schwinger functions, so that Euclidean path integrals are far easier 
to handle for any given Euclidean action. In the foregoing, rather brief, introduction to Euclidean 
field theory we have confined ourselves to the formal aspects only. All the results of conventional 
Minkowski field theory can be reproduced much more elegantly in Euclidean field theory; for more 
detail on applicational aspects one can refer to Simon [11]. 
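
The Trotter product formula behind the Feynman-Kac representation can be tried out numerically. The sketch below is illustrative only: the grid, the step `eps` and the function name are assumptions, and the symmetric ordering of the factors is a choice of this sketch. It builds one short-time factor on a spatial grid, using the heat kernel of $e^{\epsilon\nabla^2}$ in the units $\hbar = 2m = 1$ used above, and reads off the ground state energy from its largest eigenvalue, since $\mathrm{Tr}[e^{-TH}]$ is dominated by $e^{-TE_0}$ for large $T$.

```python
import numpy as np

def ground_energy_trotter(potential, x_max=8.0, n_grid=321, eps=0.05):
    """Estimate E_0 of H = -d^2/dx^2 + V(x) (hbar = 2m = 1) from one Trotter factor.

    `potential` is any vectorised callable V(x); grid and step are sketch parameters.
    """
    x = np.linspace(-x_max, x_max, n_grid)
    dx = x[1] - x[0]
    # Kernel of exp(eps * d^2/dx^2): Gaussian heat kernel of variance 2*eps.
    diff = x[:, None] - x[None, :]
    heat = np.exp(-diff**2 / (4.0 * eps)) / np.sqrt(4.0 * np.pi * eps)
    half_v = np.exp(-0.5 * eps * potential(x))
    # Symmetrised factor exp(-eps V/2) exp(-eps H0) exp(-eps V/2); it has the
    # same spectrum as exp(-eps H0) exp(-eps V) and is a symmetric matrix.
    step = half_v[:, None] * heat * half_v[None, :] * dx
    largest = np.linalg.eigvalsh(step)[-1]        # approximately exp(-eps * E_0)
    return float(-np.log(largest) / eps)

if __name__ == "__main__":
    # V = x^2/4 is the oscillator with w = 1 in these units, so E_0 = 1/2.
    print(ground_energy_trotter(lambda x: 0.25 * x**2))
```

Raising the matrix `step` to the power $T/\epsilon$ gives a discretised kernel $e^{-TH}(x,y)$, which is the deterministic counterpart of averaging $\exp[-\int V\,d\tau]$ over Brownian paths in (36).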





4 Stochastic Field Theory and Euclidean Field Theory 

It is natural to seek an extension of Nelson's stochastic mechanics to systems with infinitely many 
degrees of freedom, namely fields. We only indicate the directions along which such an extension 
can be achieved, without going into the applicational aspects, which would entail a voluminous 
extension of the text of this chapter. An elaborate treatment of the field-theoretic extension of 
Nelson's stochastic mechanics is not on our agenda; we give only a brief introductory survey of 
this area. We lay emphasis on two approaches, namely stochastic field theory and Euclidean field 
theory, which are surprisingly interrelated. As we have seen earlier, one great advantage of these 
approaches is the relative ease of computing correlation functions, because of the facility offered 
by real Gaussian integrals as against the complex integrals with operator-valued functions 
occurring in conventional quantum field theory. This is the main positive aspect of these 
approaches to field theory. 

4.1 Stochastic Mechanics of Free Scalar Field 

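The first step in extending the Nelsonian approach from quantum mechanics to fields is the 
stochastisation of the free scalar field. Consider the free scalar field Hamiltonian in 
natural units ($\hbar = c = 1$) 

$$H = \frac{1}{2}\int\bigl[\pi^2 + (\nabla\phi)^2 + m_0^2\phi^2\bigr]\,d^3x; \qquad \mathbf{x}\in\mathbb{R}^3 \qquad (44)$$

and the equation of motion 

$$\bigl(\partial_t^2 - \nabla^2 + m_0^2\bigr)\,\phi(\mathbf{x},t) = 0 \qquad (45)$$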


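Let $\{u_n(\mathbf{x})\}$ ($n$ a positive integer) be a complete set of basis functions with box 
normalisation in a finite box of volume $V$. We then have the normalisation 

$$\int_V u_n(\mathbf{x})\,u_{n'}(\mathbf{x})\,d^3x = \delta_{nn'} \qquad (46)$$

the completeness relation 

$$\sum_n u_n(\mathbf{x})\,u_n(\mathbf{x}') = \delta_V(\mathbf{x}-\mathbf{x}') \qquad (47)$$

and the eigenvalue equation 

$$-\nabla^2 u_n(\mathbf{x}) = k_n^2\,u_n(\mathbf{x}) \qquad (48)$$

Going over to the infinite volume limit, we have 

$$\delta_V(\mathbf{x}-\mathbf{x}') \to \delta(\mathbf{x}-\mathbf{x}') = (2\pi)^{-3}\int\exp[i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')]\,d^3k \qquad (49)$$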

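We expand $\phi(\mathbf{x},t)$ in the form 

$$\phi(\mathbf{x},t) = \sum_{n=0}^{\infty} q_n(t)\,u_n(\mathbf{x}) \qquad (50)$$

where the $q_n(t)$ are independent variables, each satisfying a harmonic oscillator equation (the 
free field is an infinite set of harmonic oscillators): 

$$\frac{d^2q_n(t)}{dt^2} + w_n^2\,q_n(t) = 0; \qquad w_n^2 = m_0^2 + k_n^2 \qquad (51)$$

where the Hamiltonian of each mode is given by 

$$H_n = \frac{1}{2}p_n^2 + \frac{1}{2}w_n^2 q_n^2 \qquad (52)$$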




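We now promote the $q_n(t)$ to independent Gaussian Markov processes with averages given by 

$$\langle q_n(t)\rangle = 0; \qquad \langle q_n(t)\,q_{n'}(t')\rangle = \delta_{nn'}\,(2w_n)^{-1}\exp[-w_n|t-t'|] \qquad (53)$$

Consequently, from eqn. (50), $\phi(\mathbf{x},t)$ is also raised to a Gaussian Markov process with averages 

$$\langle\phi(\mathbf{x},t)\rangle = 0; \qquad \langle\phi(\mathbf{x},t)\,\phi(\mathbf{x}',t')\rangle = \sum_{n=0}^{\infty} u_n(\mathbf{x})\,u_n(\mathbf{x}')\,(2w_n)^{-1}\exp[-w_n|t-t'|] \qquad (54)$$

in a finite box $V$. In the limit $V\to\infty$, utilising (47) and (49), we can write the two-point 
correlation of $\phi$ in the form 

$$\langle\phi(\mathbf{x},t)\,\phi(\mathbf{x}',t')\rangle = \int\exp\Bigl[-\sqrt{\mathbf{k}^2+m_0^2}\;|t-t'|\Bigr]\,\frac{\exp\{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')\}\,d^3k}{2(2\pi)^3\sqrt{\mathbf{k}^2+m_0^2}} \qquad (55)$$

Now consider the stochastic differential equations for $q_n(t)$, viz., 

$$dq_n(t) = -w_n\,q_n(t)\,dt + dw_n \qquad (56)$$

The noise $dw_n$ is normalised as 

$$dw_n\,dw_{n'} = \delta_{nn'}\,dt \qquad (57)$$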

If we use (56) in (50) and go to the infinite volume limit we get the stochastic differential equation 
for the field $\phi(\mathbf{x},t)$ 

$$d\phi(\mathbf{x},t) = -\sqrt{-\nabla^2 + m_0^2}\;\phi(\mathbf{x},t)\,dt + dw(\mathbf{x},t) \qquad (58)$$

Thus the Nelsonian framework (i.e. the association of a Markovian stochastic process to the 
quantum state of a dynamical system) can be extended to the quantisation of the free scalar field. 
In particular, the ground state process for a free scalar field is found to be a Gaussian Markov 
field with the two-point correlation given by eqn. (55). 


4.2 Stochastic Mechanics of the e-m Field 

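The Maxwell field in free space is a mechanical system with dynamical variables given by the 
electric field $\mathbf{E}(\mathbf{x},t)$ and the magnetic field $\mathbf{B}(\mathbf{x},t)$. The Hamiltonian is 

$$H = \frac{1}{2}\int\bigl[\mathbf{E}^2(\mathbf{x},t) + \mathbf{B}^2(\mathbf{x},t)\bigr]\,d^3x \qquad (59)$$

where the equations of motion are 

$$\partial_t\mathbf{B} = -\nabla\times\mathbf{E}; \qquad \partial_t\mathbf{E} = \nabla\times\mathbf{B} \qquad (60)$$

and 

$$\nabla\cdot\mathbf{B} = \nabla\cdot\mathbf{E} = 0 \qquad (61)$$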

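By analogy with the standard methods of Nelson's stochastic mechanics, we may set up the 
stochastic differential equations for the Maxwell field by promoting $\mathbf{B}(\mathbf{x},t)$ to 
a Gaussian Markov field and treating $\mathbf{E}(\mathbf{x},t)$ as the drift fields 
$\mathbf{E}_{(\pm)}(\mathbf{B},\mathbf{x},t)$ of stochastic mechanics, such that the following 
stochastic differential equations are satisfied: 

$$d\mathbf{B}(\mathbf{x},t) = -\nabla\times\mathbf{E}_{(\pm)}(\mathbf{B},\mathbf{x},t)\,dt + d\mathbf{w}(\mathbf{x},t) \qquad (62)$$

$$(D_{(\pm)}\mathbf{B})(\mathbf{x},t) = -\nabla\times\mathbf{E}_{(\pm)}(\mathbf{B},\mathbf{x},t)$$

$$\nabla\times\mathbf{B} = \frac{1}{2}\bigl[D_{(+)}\mathbf{E}_{(-)} + D_{(-)}\mathbf{E}_{(+)}\bigr]$$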





The noise, however, can no longer be pure white noise as in the case of single particle dynamics 
(see the earlier sections), since it has to respect the transversality condition (61): 

$$dW_\alpha(\mathbf{x},t)\,dW_\beta(\mathbf{x}',t) = (2\pi)^{-3}\int\exp[i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')]\,\bigl[k^2\delta_{\alpha\beta} - k_\alpha k_\beta\bigr]\,d^3k \qquad (63)$$

The above set of stochastic differential equations may be used to compute the two-point 
correlation function for the Maxwell field. It has been shown that it indeed yields results which 
are identical with those of conventional quantum electrodynamics. As already mentioned, we will 
not delve into the intricacies of this aspect of stochastic field theory, as it is beyond the scope 
of this article; we only refer the interested reader to the copious literature on the subject [9, 11]. 

5 The Parisi-Wu Stochastic Quantisation 

The stochastic quantisation scheme of Parisi and Wu [12] exhibits a Euclidean quantum field $\psi(x)$ 
as the stationary limit, with respect to a fictitious time $\tau$ (like the computer time of a 
Monte Carlo simulation), of the stochastic relaxation process defined for $\tau > 0$ by a generalised 
Langevin equation 

$$\partial_\tau\psi(\tau,x) = -\frac{\delta S[\psi]}{\delta\psi(\tau,x)} + \eta(\tau,x) \qquad (64)$$

Here $x = \{x_\nu\}$, $\nu = 0,\ldots,3$; $S$ is the classical Euclidean action and $\eta$ is a 
Gaussian white noise with correlation function 

$$\langle\eta(\tau,x)\,\eta(\tau',x')\rangle_\eta = 2\,\delta(\tau-\tau')\,\delta^4(x-x') \qquad (65)$$

The Euclidean Green's functions (Schwinger functions) are obtained as the "equilibrium limit" 
of the correlation functions of the process: 

$$\langle\psi(x_1)\cdots\psi(x_n)\rangle = \lim_{\tau_1,\ldots,\tau_n\to\infty}\langle\psi(\tau_1,x_1)\cdots\psi(\tau_n,x_n)\rangle_\eta \qquad (66)$$

A most interesting aspect of the Parisi-Wu method for continuum field theories is that in gauge 
theories the perturbative calculation of the left hand side of eqn. (66) may be based on the classical 
action $S$ alone, i.e. no gauge-fixing term and associated Faddeev-Popov ghosts are necessary. 

This approach thus uses stochasticity as an auxiliary tool, but here the stochastic processes 
take place in the fictitious 'fifth dimension' $\tau$. To make clear the link with the Feynman-Kac 
formula of the earlier section, recall that it leads to Euclidean functional integrals. The idea 
is now to set up a stochastic process possessing the Euclidean functional measure as its unique 
equilibrium measure. Clearly this does not specify the process uniquely, but the simplest choice 
is based on a Langevin equation like (64). 
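
The relaxation to an equilibrium measure can be seen in the simplest possible setting, a "field theory" with a single degree of freedom. The sketch below is a toy illustration rather than the Parisi-Wu construction for a real field theory: it assumes the action $S(\phi) = \phi^2/2 + \lambda\phi^4/4$, discretises the fictitious-time Langevin equation with an Euler step, normalises the noise so that the equilibrium density is $e^{-S}$ (variance $2\,d\tau$ per step), and compares the long-time average of $\phi^2$ with the same moment computed by direct quadrature. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def langevin_phi2(lam=1.0, n_walkers=40_000, dtau=0.005, n_steps=2_000, seed=1):
    """<phi^2> from Euler-discretised Langevin evolution in fictitious time.

    Toy action (an assumption of this sketch): S = phi^2/2 + lam*phi^4/4.
    The Euler step introduces a bias of order dtau.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_walkers)                      # all walkers start at phi = 0
    for _ in range(n_steps):
        drift = -(phi + lam * phi**3)              # -dS/dphi
        noise = np.sqrt(2.0 * dtau) * rng.normal(size=n_walkers)
        phi = phi + drift * dtau + noise
    return float(np.mean(phi**2))

def quadrature_phi2(lam=1.0):
    """<phi^2> with weight exp(-S), by direct summation on a fine grid."""
    phi = np.linspace(-10.0, 10.0, 20_001)
    weight = np.exp(-(0.5 * phi**2 + 0.25 * lam * phi**4))
    return float(np.sum(phi**2 * weight) / np.sum(weight))

if __name__ == "__main__":
    print(langevin_phi2(), quadrature_phi2())
```

The many independent walkers stand in for the average over the noise $\eta$; the total fictitious time `n_steps * dtau` must be long compared with the relaxation time for the equilibrium limit to be reached.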

We now only have to make the identification 

$$|\psi_0|^2 = Z^{-1}\exp[-S/\hbar] \qquad (67)$$

where $Z$ is a normalizing factor, so that 

$$b = -\nabla S/\hbar \qquad (68)$$

where $\nabla$ now really means a functional gradient. At this formal level there is no difference 
between ordinary quantum mechanics and quantum field theory, so we assume that $S$ is a functional 
of some Euclidean fields symbolised as $\psi$. 

The Kolmogorov-Fokker-Planck equation now reads (with a suitably rescaled time) 

$$\partial_t\rho = \hbar\nabla^2\rho + \nabla\cdot\bigl((\nabla S)\,\rho\bigr) \qquad (69)$$

and the Langevin equation is 

$$d\phi = -\nabla S\,dt + \hbar^{1/2}\,dw \qquad (70)$$



Here $dw$ is a higher dimensional Wiener process, one for each space-time point and field component. 
For a scalar field $\phi$ we may take 

$$\langle dw(x)\,dw(y)\rangle = 2\,\delta(x-y)\,dt \qquad (71)$$

We can also determine a potential $V$ belonging to the ground state wave function 

$$\psi_0 = \exp[-S/2\hbar]$$

It is 

$$V = -\frac{\hbar}{2}\nabla^2 S + \frac{1}{4}(\nabla S)^2 \qquad (72)$$

where the $\nabla$ operators are to be interpreted as functional derivatives. So one can write a 
Feynman-Kac formula leading to a functional integral with one extra dimension. Formally its 
density is given by 

$$\exp[-\tilde{S}/\hbar]; \qquad \tilde{S} = \frac{1}{4}\int\dot{\phi}^2\,d\tau + \int V\,d\tau \qquad (73)$$

The "super-Euclidean" models arising in this way exhibit an interesting supersymmetry, as shown 
by Parisi, Sourlas and Gozzi [13]. 

The stochastic quantisation of Parisi and Wu can be used to construct a perturbation expansion [12]. 
It is also possible to use the auxiliary time for regularization by replacing the Wiener process by 
a suitable non-Markovian process; this amounts to replacing $\int\dot{\phi}^2\,d\tau$ in (73) by 
$(\dot{\phi}, C^{-1}\dot{\phi})$, where $C^{-1}$ is a suitable operator. In view of applications to 
gauge theory it is useful to choose $C$ "local in space-time". 

Our intention in this section was mainly to highlight some essential features of the Parisi-Wu 
scheme, which is a powerful computational scheme using stochasticity as an auxiliary tool, unlike 
the Euclidean and stochastic approaches to field theory, which use stochasticity as a core ingredient 
of the formulation. We also note in passing that the field theories dealt with here are purely at 
zero temperature and should be distinguished from finite temperature field theories or thermo-field 
theories, although in those formulations too, Euclidean structures for the action are postulated; 
see the article by Ashok Das in this book. 


References 

[1] G. C. Wick, Phys. Rev. 96 (1954) 1124. 

[2] J. Schwinger, in Proc. of the 1958 Conf. on High Energy Physics at CERN, ed. B. Ferretti 
(CERN, Geneva, 1958). 

[3] B. Simon, The P(φ)_2 Euclidean (Quantum) Field Theory (Princeton, N.J., 1974). 

[4] R. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That (Benjamin, N.J., 1964). 

[5] K. Symanzik, in Local Quantum Theory, ed. R. Jost (Academic Press, New York, 1969). 

[6] E. Nelson, Journ. Funct. Analysis 12 (1973) 97; 12 (1973) 211. 

[7] E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press, Princeton, 
1967). 

[8] E. Nelson, Phys. Rev. 150 (1966) 1079; also see [7]; R. Ramanathan, Phys. Scripta 34, 365 
(1986). 

[9] F. Guerra, Phys. Rep. 77 (1981) 263. 

[10] G. Jona-Lasinio, "Stochastic Processes and Quantum Mechanics", Ecole Polytechnique report, 
June 1983. 





[11] B. Simon, Functional Integration and Quantum Physics (Academic Press, N.Y. - San Francisco 
- London, 1979). 

[12] G. Parisi and Y. S. Wu, Sci. Sinica 24 (1981) 483. 

[13] E. Gozzi, Phys. Lett. 130B (1983) 83; Phys. Rev. D28 (1983); G. Parisi and N. Sourlas, Nucl. 
Phys. B206 (1982) 321. 



15. Topics in Finite Temperature Field Theory 


Ashok Das * 

Department of Physics and Astronomy, 
University of Rochester, 
Rochester, New York, 14627 


Abstract 

We discuss a few selected topics in finite temperature field theory. 


1 Introduction 

Studies of physical systems at finite temperature have led, in the past, to many interesting proper- 
ties such as phase transitions, blackbody radiation etc. However, the study of complicated quantum 
mechanical systems at finite temperature has had a systematic development only in the past few 
decades. There are now well developed and well understood formalisms to describe finite tem- 
perature field theories, as they are called. In fact, as we know now, there are three distinct, but 
equivalent formalisms [1-3] to describe such theories and each has its advantages and disadvan- 
tages. But, the important point to note is that we now have a systematic method of calculating 
thermal averages perturbatively in any quantum field theory. 

This, of course, has led to a renewed interest in the study of finite temperature field theories 
for a variety of reasons. We can now study questions such as phase transitions involving symmetry 
restoration in theories with spontaneously broken symmetry [4]. We can study the evolution of 
the universe at early times which clearly is a system at high temperature. More recently, even 
questions such as the chiral symmetry breaking phase transition or the confinement-deconfinement 
phase transition in QCD [5-6] have drawn a lot of attention in view of the planned experiments 
involving heavy ion collisions. This would help us understand properties of the quark-gluon plasma 
better. 

The goal of this article is to share, with the readers, some of the developments in finite temper- 
ature field theories in the recent past and the plan of the article is as follows. In the next section, 
we will describe some basic ideas behind describing a quantum mechanical theory in terms of path 
integrals [7]. This is the approach which generalizes readily to the study of finite temperature field 
theory. In section 3, we will discuss one of the formalisms, in fact, the oldest one, of describing 
finite temperature field theory. This goes under the name of the imaginary time formalism or 
the Matsubara formalism [1, 5, 8-10]. In this description, the dynamical time is traded in for the 
temperature. In contrast, the real time formalisms of finite temperature field theory contain both 
time and temperature. In section 4, we discuss one of the real time formalisms known as thermo 
field dynamics [3, 10-11]. This is an ideal description to understand operator related issues involv- 
ing finite temperature field theories although it has a path integral representation which is quite 
nice for calculations as well. The other real time formalism, which is much older and is known as 
the closed time path formalism [2, 10, 12], is described in section 5. This formalism is very nice 
because it describes both equilibrium and non-equilibrium phenomena, at finite temperature, with 
equal ease. Temperature leads to many subtle features in field theories. In section 6, we discuss 
one such subtlety, namely, how one needs a generalization of the Feynman combination formula 
to perform calculations at finite temperature [13]. In section 7, the issue of large gauge invariance 


* Email: das@hep.pas.rochester.edu 





is discussed within the context of a simple quantum mechanical model [14-15]. In section 8, we 
discuss in some detail how temperature can lead to breaking of some symmetries like supersym- 
metry [16] (Temperature normally has the effect of restoring symmetries). Finally, we present a 
brief conclusion in section 9. The subject of finite temperature field theories is quite technical 
and to keep the contents simple, we have chosen, wherever possible, simple, quantum mechanical 
models to bring out the relevant ideas. Finally, we would like to note that there are many works 
in the literature and the references, at the end, are only representative and are not meant to be 
exhaustive in any way. 


2 Path Integrals at Zero Temperature 

In studying a quantum mechanical system or a system described by a quantum field theory, we 
are basically interested in determining the time evolution operator. In the standard framework of 
quantum mechanics, one solves the Schrodinger equation to determine the energy eigenvalues and 
eigenstates simply because the time evolution operator is related to the Hamiltonian. There is an 
alternate method for evaluating the matrix elements of the time evolution operator which is useful 
in studying extremely complicated physical systems. This goes under the name of the path integral
formalism [7, 17-18]. 

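Instead of trying to develop the ideas of the path integral formalism here, let us simply note
that, for a bosonic system described by a time independent quantum mechanical Hamiltonian, the
transition amplitude can be represented as (the subscript H denotes the Heisenberg picture)

$$ {}_H\langle x_f, t_f | x_i, t_i \rangle_H = \langle x_f | e^{-\frac{i}{\hbar}(t_f - t_i)H} | x_i \rangle = \int \mathcal{D}x \; e^{\frac{i}{\hbar}S[x]} \tag{1} $$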

There are several comments in order. First, the transition amplitude is nothing other than the 
matrix element of the time evolution operator in the coordinate basis. Second, the integral on the 
right hand side is known as a path integral. It is an integral over all possible paths connecting 
the initial coordinate $x_i$ and the final coordinate $x_f$ which are held fixed. The simplest way to
evaluate such an integral is to divide the time interval of the path between $x_i$ and $x_f$ into $N$
intervals of equal length. Integrating over all possible values of the coordinates of the intermediate
points (which are ordinary integrals) and taking $N \to \infty$ such that the time interval is held fixed
is equivalent to integrating over all possible paths. Finally, the action S[x] in the exponent of the 
integrand is nothing other than the classical action for the bosonic system under study. This is 
true for most conventional physical systems where the Hamiltonian depends quadratically on the 
momentum. If this is not the case (and there are some cases where it is not), the right hand side 
of (1) needs to be modified. However, for most systems that we will discuss, we do not have to 
worry about this fine point. 

The advantage of the path integral is that while the left hand side involves quantum mechanical 
operators, the right hand side is described only in terms of classical variables and, therefore, the 
manipulations become quite trivial. Furthermore, the transition amplitude defined in eq. (1) can 
be generalized easily to incorporate sources and this allows us to derive various Green's functions
of the theory in a very simple and straightforward manner. As an example, let us simply note here
that for a harmonic oscillator, the action is quadratic in the dynamical variables, namely,

$$ S[x] = \int_{t_i}^{t_f} dt \left( \frac{1}{2} m \dot{x}^2 - \frac{1}{2} m \omega^2 x^2 \right) $$

and, in this case, the path integral can be exactly evaluated and has the form [7] 

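$$ \langle x_f | e^{-\frac{i}{\hbar}HT} | x_i \rangle = \left( \frac{m\omega}{2\pi i \hbar \sin \omega T} \right)^{1/2} e^{\frac{i}{\hbar} S[x_{cl}]} \tag{2} $$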


Finite Temperature Field Theory 




Here, we have defined $T = t_f - t_i$. $S[x_{cl}]$ represents the action associated with the classical
trajectory (satisfying the Euler-Lagrange equation) and has the form

$$ S[x_{cl}] = \frac{m\omega}{2 \sin \omega T} \left[ (x_i^2 + x_f^2) \cos \omega T - 2 x_i x_f \right] \tag{3} $$

The path integrals can also be extended to quantum mechanical systems describing fermionic 
particles. However, one immediately recognizes that there are no classical variables which are 
fermionic. Therefore, in order to have a path integral description of such systems in terms of 
classical variables, we must supplement our usual notions of classical variables with anti-commuting
Grassmann variables [19]. With this, for example, we can write a classical action for the fermionic
oscillator as

$$ S[\psi, \bar{\psi}] = \int_{t_i}^{t_f} dt \left( i \bar{\psi} \dot{\psi} - \omega \bar{\psi} \psi \right) \tag{4} $$

Here $\psi$ and $\bar{\psi}$ are anti-commuting Grassmann variables and in the quantum theory, as operators,
can be identified with the fermionic annihilation and creation operators respectively. The action 
in eq. (4) is also quadratic in the variables much like the bosonic oscillator and the path integral 
for the fermionic oscillator can also be exactly evaluated giving [7] 

= (5) 

In a quantum field theory, we are often interested in evaluating time ordered correlation functions
in the vacuum because the S-matrix elements can be obtained from such Green's functions.
These can be derived in a natural manner from what is known as the vacuum to vacuum transition 
functional which can be obtained from the transition amplitude in eq. (1) in a simple manner and 
also has a path integral representation of the form 


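$$ \lim_{T \to \infty} \langle 0 | e^{-\frac{i}{\hbar}HT} | 0 \rangle = \int \mathcal{D}x \; e^{\frac{i}{\hbar}S[x]} \tag{6} $$

where

$$ S[x] = \int_{-\infty}^{\infty} dt \, L(x, \dot{x}) \tag{7} $$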


Furthermore, the path integral in eq. (6) has no end-point restriction unlike in eq. (1). This 
vacuum to vacuum transition amplitude is also commonly denoted by (0|0) with the limiting 
process understood. We note here that an analogous formula also holds for fermionic systems. 
The vacuum to vacuum amplitude in the presence of a source has the form 


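$$ Z[J] = \langle 0 | 0 \rangle_J = \int \mathcal{D}x \; e^{\frac{i}{\hbar}S[x, J]} \tag{8} $$

where

$$ S[x, J] = S[x] + \int_{-\infty}^{\infty} dt \, J(t) x(t) \tag{9} $$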


Here $J(t)$ is a classical source and it can be easily checked that, in the limit of vanishing source,
the functional derivatives of $Z[J]$ give rise to time ordered Green's functions in the vacuum.

With this very brief review of the path integral description for zero temperature quantum 
mechanical theories, we are now ready to describe the different formalisms available to study 
quantum mechanical systems at finite temperature. 





3 Imaginary Time Formalism 

The properties of a quantum mechanical system, at finite temperature, can also be given a path 
integral description. There are various, but equivalent ways of doing this. Of the different for- 
malisms available to study a quantum mechanical system at finite temperature, the imaginary time 
formalism is the oldest [1]. To appreciate this, let us recall some of the features of a statistical
ensemble. A statistical ensemble in equilibrium at a finite temperature $1/\beta$ (in units of the Boltzmann
constant) is described in terms of a partition function

$$ Z(\beta) = \mathrm{Tr}\, \rho(\beta) = \mathrm{Tr}\, e^{-\beta \mathcal{H}} \tag{10} $$

Here $\rho(\beta)$ is known as the density matrix (operator) and $\mathcal{H}$ can be thought of as the generalized
Hamiltonian of the system. If

$$ \mathcal{H} = H $$

where $H$ is the Hamiltonian of the system, we say that the ensemble is a canonical ensemble where
the particle number is fixed and the system is allowed to exchange only energy with a heat bath.
On the other hand, if

$$ \mathcal{H} = H - \mu N $$

where $N$ is the number operator, then, the ensemble is known as a grand canonical ensemble where
the system can exchange not only energy with a heat bath, but can also exchange particles with a
reservoir. The constant $\mu$ is known as the chemical potential. In a statistical ensemble, of course,
the important observables are the ensemble averages and, for any observable $O$, they are defined
as

$$ \langle O \rangle_\beta = \frac{1}{Z(\beta)} \mathrm{Tr}\, \rho(\beta) O \tag{11} $$

Let us also note here that since the partition function involves a trace, it leads to an interesting
identity following from the cyclicity of the trace, namely, (we will assume from now on, unless
otherwise specified, that $\hbar = 1$)

$$ \langle O_1(t) O_2(t') \rangle_\beta = \frac{1}{Z(\beta)} \mathrm{Tr}\, e^{-\beta \mathcal{H}} O_1(t) O_2(t') = \frac{1}{Z(\beta)} \mathrm{Tr}\, e^{-\beta \mathcal{H}} O_2(t') O_1(t + i\beta) = \langle O_2(t') O_1(t + i\beta) \rangle_\beta \tag{12} $$

Such a relation is known as the KMS (Kubo-Martin-Schwinger) [20] relation which generalizes to
all statistical ensemble averages and plays a crucial role in the study of finite temperature field
theories.

It was observed quite early by Bloch [21] that the operator in the definition of the partition
function is like the time evolution operator in the imaginary time axis. This is really at the heart
of the imaginary time formalism. In fact, let us note that the canonical partition function can be
written as (with the trace taken in the coordinate basis)

$$ Z(\beta) = \int dx \, \langle x | e^{-\beta H} | x \rangle \tag{13} $$

It is clear now that if we identify $T = -i\beta$ in eq. (1), then, we can give the partition function a
path integral representation as ($\hbar = 1$)

$$ Z(\beta) = \int \mathcal{D}x \; e^{-S_E[x]} \tag{14} $$





where $S_E[x]$ is the Euclidean (imaginary time) action for the system defined over a finite time
interval as

$$ S_E[x] = \int_0^{\beta} dt \, L_E(x, \dot{x}) \tag{15} $$


Furthermore, it is clear from eq. (13) that the variable $x$ must satisfy the periodic boundary
condition

$$ x(\beta) = x(0) \tag{16} $$

for eq. (14) to represent a trace (namely, the initial and the final states must be the same) and
that the end point is being integrated over in the path integral in eq. (14) unlike in eq. (1). (It is
important to note that the original work of Matsubara is an operator description of the imaginary
time, but we will not discuss it in the present article.)
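
The KMS relation in eq. (12) can be checked numerically on a small system. The following is only a sketch under stated assumptions: a hypothetical three-level system with a diagonal Hamiltonian, arbitrarily chosen operator matrices `o1` and `o2`, and illustrative values of `beta`, `t` and `tp`, none of which come from the text. Heisenberg operators are continued to complex time as $O(t) = e^{iHt} O e^{-iHt}$.

```python
import cmath
import math


def heisenberg(op, energies, t):
    """O(t) = e^{iHt} O e^{-iHt} for a diagonal H; t may be complex."""
    n = len(energies)
    return [[cmath.exp(1j * (energies[m] - energies[k]) * t) * op[m][k]
             for k in range(n)] for m in range(n)]


def thermal_average(a, b, energies, beta):
    """Tr(e^{-beta H} A B) / Z(beta) for a diagonal H."""
    n = len(energies)
    z = sum(math.exp(-beta * e) for e in energies)
    total = sum(math.exp(-beta * energies[m]) * a[m][k] * b[k][m]
                for m in range(n) for k in range(n))
    return total / z


# Illustrative (hypothetical) spectrum, operators and parameters.
energies = [0.0, 0.7, 1.9]
o1 = [[0.2, 1.0, 0.5j], [1.0, -0.3, 2.0], [-0.5j, 2.0, 0.1]]
o2 = [[1.0, 0.4, 0.0], [0.4, 0.0, 1.5j], [0.0, -1.5j, -1.0]]
beta, t, tp = 1.3, 0.4, -0.9

# Left and right hand sides of the KMS relation, eq. (12).
lhs = thermal_average(heisenberg(o1, energies, t),
                      heisenberg(o2, energies, tp), energies, beta)
rhs = thermal_average(heisenberg(o2, energies, tp),
                      heisenberg(o1, energies, t + 1j * beta), energies, beta)
```

The two numbers `lhs` and `rhs` agree to rounding error, because shifting the time argument by $i\beta$ moves the Boltzmann factor from one side of the operator to the other.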

In fact, as an example, let us evaluate the canonical partition function for the bosonic oscillator 
using this formalism [7]. The transition amplitude is already given for zero temperature in eq. (2). 
Now making the identifications 


$$ T = -i\beta, \qquad x_i = x_f = x \tag{17} $$


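we obtain from eqs. (2) and (13)

$$ Z(\beta) = \int dx \left( \frac{m\omega}{2\pi \sinh \beta\omega} \right)^{1/2} e^{-\left( m\omega \tanh \frac{\beta\omega}{2} \right) x^2} = \left( \frac{m\omega}{2\pi \sinh \beta\omega} \right)^{1/2} \left( \frac{\pi}{m\omega \tanh \frac{\beta\omega}{2}} \right)^{1/2} = \frac{e^{\beta\omega/2}}{e^{\beta\omega} - 1} \tag{18} $$

This is, indeed, the partition function for the bosonic oscillator as can be directly verified.

The partition function, for a fermionic system, can also be similarly given a path integral
representation. However, the anti-commuting nature of the fermion variables introduces one crucial
difference, namely, for a fermion theory, we have

$$ Z(\beta) = \int \mathcal{D}\bar{\psi} \, \mathcal{D}\psi \; e^{-S_E[\psi, \bar{\psi}]} \tag{19} $$

with anti-periodic boundary conditions [10]

$$ \psi(\beta) = -\psi(0), \qquad \bar{\psi}(\beta) = -\bar{\psi}(0) \tag{20} $$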


The Euclidean (imaginary time) action is again defined over a finite time interval as in eq. (15). 
In fact, let us calculate the canonical partition function for a fermionic oscillator, as an example, 
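from the result in eq. (5) as well as the identifications in (20) [7]. Using

$$ \psi_f = -\psi_i = -\psi $$

we obtain (remember $T = -i\beta$)

$$ Z(\beta) = e^{\beta\omega/2} \left( 1 + e^{-\beta\omega} \right) = 2 \cosh \frac{\beta\omega}{2} \tag{21} $$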

In evaluating this, we have made use of the Berezin rules of integration [19] for Grassmann variables 
and we note that eq. (21), indeed, gives the correct partition function for a fermionic oscillator as 
can be directly calculated. 
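
The closed forms in eqs. (18) and (21) can be compared with direct sums over energy levels. This is a minimal sketch, assuming the symmetric zero-point energies that these closed forms imply: levels $\omega(n + 1/2)$ for the bosonic oscillator and $\pm\omega/2$ for the fermionic one. The function names, the default `m = 1.0` and the cutoff `nmax` are illustrative choices.

```python
import math


def z_boson_gaussian(beta, omega, m=1.0):
    """Middle form of eq. (18): prefactor times the Gaussian integral over x."""
    prefactor = math.sqrt(m * omega / (2.0 * math.pi * math.sinh(beta * omega)))
    gaussian = math.sqrt(math.pi / (m * omega * math.tanh(beta * omega / 2.0)))
    return prefactor * gaussian


def z_boson_closed(beta, omega):
    """Last form of eq. (18)."""
    return math.exp(beta * omega / 2.0) / math.expm1(beta * omega)


def z_boson_levels(beta, omega, nmax=2000):
    """Direct sum over levels omega * (n + 1/2), truncated at nmax."""
    return sum(math.exp(-beta * omega * (n + 0.5)) for n in range(nmax))


def z_fermion_closed(beta, omega):
    """Eq. (21)."""
    return 2.0 * math.cosh(beta * omega / 2.0)


def z_fermion_levels(beta, omega):
    """Direct sum over the two levels -omega/2 and +omega/2."""
    return sum(math.exp(-beta * omega * (n - 0.5)) for n in (0, 1))
```

The mass `m` drops out of the bosonic result, as it must, since the spectrum depends only on the frequency.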





Although our discussion so far has been within the context of simple quantum mechanical
systems, everything we have said can be carried over to a quantum field theory. The partition
function for a quantum field theory can again be written as a path integral involving a Euclidean
action as

$$ Z(\beta) = \int \mathcal{D}\bar{\psi} \, \mathcal{D}\psi \, \mathcal{D}\phi \; e^{-S_E[\phi, \psi, \bar{\psi}]} \tag{22} $$

where the Euclidean action is defined over a finite time interval and the fields satisfy the periodicity
(anti-periodicity) conditions

$$ \phi(\beta, \vec{x}) = \phi(0, \vec{x}), \qquad \psi(\beta, \vec{x}) = -\psi(0, \vec{x}) \tag{23} $$

and so on. The discussion is slightly more involved for gauge theories and to keep things simple, 
we will not discuss gauge theories. 

This formulation of a field theory at finite temperature is known as the imaginary time formalism
or the Matsubara formalism [1] and is the oldest formalism. There are several distinguishing 
features of this formalism. For example, since the time interval is finite, Fourier transformation 
of the time variable would involve discrete energies. In other words, the Fourier transform of the 
propagator, say for example, at finite temperature in the imaginary time formalism, would take 
the general form 

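$$ G_\beta(\tau, \vec{x}) = \frac{1}{\beta} \sum_n e^{-i\omega_n \tau} \, G_\beta(\omega_n, \vec{x}) \tag{24} $$

where $\omega_n = \frac{n\pi}{\beta}$ with $n = 0, \pm 1, \pm 2, \cdots$. However, from the definition of the time ordered product

$$ T_\tau \left( \phi(\tau) \phi^\dagger(\tau') \right) = \theta(\tau - \tau') \phi(\tau) \phi^\dagger(\tau') \pm \theta(\tau' - \tau) \phi^\dagger(\tau') \phi(\tau) \tag{25} $$

(where we have allowed for both bosonic and fermionic fields) and the KMS condition in eq. (12),
it follows that, for $\tau < 0$,

$$ G_\beta(\tau, \vec{x}) = \pm G_\beta(\tau + \beta, \vec{x}) \tag{26} $$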

It is important to recognize that the periodicity (anti-periodicity) of the propagator arises from the 
definition of the time ordered product for the bosonic (fermionic) fields and the KMS condition and 
is not directly connected with the periodicity (anti-periodicity) of the corresponding field variables 
which we have discussed earlier. This periodicity (anti-periodicity) of the propagator, on the other 
hand, leads to the restriction that eq. (24) holds with 


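$$ \omega_n = \begin{cases} \dfrac{2n\pi}{\beta} & \text{for bosons} \\[2mm] \dfrac{(2n+1)\pi}{\beta} & \text{for fermions} \end{cases} \tag{27} $$

where $n = 0, \pm 1, \pm 2, \cdots$. These are conventionally known as the Matsubara frequencies [22].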
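
As a small illustration of eq. (27), the sketch below generates Matsubara frequencies and evaluates the phase $e^{-i\omega_n\beta}$ picked up by a Fourier mode under $\tau \to \tau + \beta$. The function names and the `statistics` string argument are hypothetical conveniences, and the sign convention of the exponent follows eq. (24).

```python
import cmath
import math


def matsubara_frequency(n, beta, statistics):
    """Return omega_n from eq. (27) for 'boson' or 'fermion' statistics."""
    if statistics == "boson":
        return 2 * n * math.pi / beta
    if statistics == "fermion":
        return (2 * n + 1) * math.pi / beta
    raise ValueError("statistics must be 'boson' or 'fermion'")


def shift_phase(n, beta, statistics):
    """Phase e^{-i omega_n beta} acquired by a mode under tau -> tau + beta."""
    return cmath.exp(-1j * matsubara_frequency(n, beta, statistics) * beta)
```

The phase is $+1$ for every bosonic frequency and $-1$ for every fermionic one, which is the periodicity (anti-periodicity) of the propagator in eq. (26).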

Given this, one can now calculate the propagators for bosonic and fermionic field theories in 
the Matsubara formalism and they take the forms (in the momentum space) 




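$$ G_\beta(\omega_n, \vec{k}) = \frac{1}{\omega_n^2 + \vec{k}^2 + m^2} = \frac{1}{\left( \frac{2n\pi}{\beta} \right)^2 + \vec{k}^2 + m^2} \tag{28} $$

$$ S_\beta(\omega_n, \vec{k}) = \frac{\gamma^0 \omega_n + \vec{\gamma} \cdot \vec{k} + m}{\omega_n^2 + \vec{k}^2 + m^2} = \frac{\gamma^0 \frac{(2n+1)\pi}{\beta} + \vec{\gamma} \cdot \vec{k} + m}{\left( \frac{(2n+1)\pi}{\beta} \right)^2 + \vec{k}^2 + m^2} \tag{29} $$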


Perturbative calculations can now be developed quite analogously to the zero temperature field 
theory. For example, given a field theory, we can read out the vertices from the Euclidean form of 
the action and use the propagators of eqs. (28) and (29) to carry out a diagrammatic calculation which
would lead to the ensemble average for a given observable. It is clear that, because the time interval 
is finite in this formalism, the coordinate space calculation of any diagram is cumbersome. However, 
much like at zero temperature, the momentum space calculation is much simpler. However, one 





should keep the difference in mind, namely, that, at finite temperature, the external and the 
internal energies are discrete as in eq. (27). Consequently, the integration over internal energies 
(of zero temperature) is replaced by a sum over the internal energies. More specifically, we must 
use 

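$$ \int \frac{d^4 k}{(2\pi)^4} \rightarrow \frac{1}{\beta} \sum_n \int \frac{d^3 k}{(2\pi)^3} \tag{30} $$

As an example, let us consider the self-interacting scalar theory described by

$$ \mathcal{L} = \frac{1}{2} \partial_\mu \phi \, \partial^\mu \phi - \frac{m^2}{2} \phi^2 - \frac{\lambda}{4!} \phi^4, \qquad \lambda > 0 \tag{31} $$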


We note that the only one loop correction in this theory is the mass correction. Rotating to 
Euclidean space and using the propagator for a scalar theory as given in eq. (28) as well as (30), 
we obtain the one loop mass correction to be 


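$$ \Delta m^2 = \frac{\lambda}{2\beta} \sum_n \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\left( \frac{2n\pi}{\beta} \right)^2 + \vec{k}^2 + m^2} = \frac{\lambda}{2\beta} \left( \frac{\beta}{2\pi} \right)^2 \sum_n \int \frac{d^3 k}{(2\pi)^3} \frac{1}{n^2 + \left( \frac{\beta \omega_k}{2\pi} \right)^2} \tag{32} $$

Here, we have introduced the notation,

$$ \omega_k = (\vec{k}^2 + m^2)^{1/2} \tag{33} $$

The sum, in eq. (32), can be easily evaluated using the method of residues leading to

$$ \sum_{n=-\infty}^{\infty} \frac{1}{n^2 + y^2} = \frac{\pi}{y} \coth \pi y \qquad \text{for } y > 0 \tag{34} $$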


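Using this, the one loop mass correction can be determined to be [23]

$$ \begin{aligned} \Delta m^2 &= \frac{\lambda}{4} \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\omega_k} \coth \frac{\beta \omega_k}{2} \\ &= \frac{\lambda}{4} \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\omega_k} + \frac{\lambda}{2} \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\omega_k} \frac{1}{e^{\beta \omega_k} - 1} \\ &= \Delta m_0^2 + \Delta m_\beta^2 \end{aligned} \tag{35} $$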
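
The frequency sum in eq. (34), which drives the step from eq. (32) to eq. (35), can be checked by brute force. This is a sketch only: the cutoff `nmax` is an arbitrary choice, and the tail of the sum is estimated by an integral, which is an approximation added here to speed up convergence.

```python
import math


def matsubara_sum_numeric(y, nmax=200000):
    """Sum of 1/(n^2 + y^2) over all integers n, truncated at |n| <= nmax."""
    total = 1.0 / (y * y)                      # n = 0 term
    for n in range(1, nmax + 1):
        total += 2.0 / (n * n + y * y)         # +n and -n contribute equally
    # Remaining terms |n| > nmax, estimated as 2 * integral of dn / n^2.
    tail = 2.0 / (nmax + 0.5)
    return total + tail


def matsubara_sum_closed(y):
    """Right hand side of eq. (34): (pi / y) coth(pi y), valid for y > 0."""
    return math.pi / (y * math.tanh(math.pi * y))
```

Setting $y = \beta\omega_k / 2\pi$ in the closed form reproduces the factor $\frac{2\pi^2}{\beta\omega_k} \coth\frac{\beta\omega_k}{2}$ that turns the sum in eq. (32) into the first line of eq. (35).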


There are several things to note from this calculation. First, the mass correction separates
into two parts - one independent of temperature and the other genuinely a finite temperature 
correction. The temperature independent part (zero temperature part) is divergent as is expected 
at zero temperature and the divergence has to be handled by the usual process of renormalization. 
However, the finite temperature part is completely free from ultraviolet divergence. This is a general 
feature of finite temperature field theories that temperature does not introduce any new ultraviolet 
divergence. We will return to this question later within the context of real time formalisms for finite 
temperature field theories. Let us also note that (see (35)) the finite temperature integrals are, in
general, hard to evaluate and cannot be evaluated in a closed form. However, we can always make
a high temperature expansion (small $\beta$) which would give the temperature dependent correction
to the mass as

$$ \Delta m_\beta^2 \simeq \frac{\lambda}{24 \beta^2} = \frac{\lambda T^2}{24} \tag{36} $$


This shows that temperature induces a mass correction which is positive. Intuitively, it is clear 
that this is the behavior we would expect from a particle moving in a medium and, furthermore, 
the positivity of this correction is crucial in the study of symmetry restoration in field theories 
with spontaneous symmetry breaking. 
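
The leading high temperature behavior in eq. (36) can be reproduced by integrating the temperature dependent part of eq. (35) numerically. The sketch below uses a simple midpoint rule on the radial integral, $d^3k/(2\pi)^3 \to k^2\,dk/2\pi^2$. The momentum cutoff `kmax_factor / beta` and the number of steps are illustrative assumptions, justified by the exponential suppression of the integrand.

```python
import math


def thermal_mass_sq(lam, beta, m=0.0, kmax_factor=60.0, steps=20000):
    """Temperature dependent part of eq. (35), evaluated by the midpoint rule."""
    kmax = kmax_factor / beta
    dk = kmax / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        omega_k = math.sqrt(k * k + m * m)
        # k^2 / omega_k times the Bose factor 1 / (e^{beta omega_k} - 1)
        total += k * k / (omega_k * math.expm1(beta * omega_k))
    return 0.5 * lam * total * dk / (2.0 * math.pi ** 2)


def high_temperature_limit(lam, beta):
    """Leading term of eq. (36): lambda / (24 beta^2)."""
    return lam / (24.0 * beta * beta)
```

For `m = 0` the integral equals the expression in eq. (36) exactly, while a nonzero mass reduces the correction, consistent with eq. (36) being the leading term of an expansion in $\beta m$.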






This gives a flavor of calculations at finite temperature, particularly, in the imaginary time 
(Matsubara) formalism. It is worth noting here that, by construction, the imaginary time formalism 
would describe physical systems in equilibrium quite well. Since we have traded the time variable 
for temperature, it is well suited to calculate static, equilibrium quantities. Slow time
dependence can, however, be brought in by analytically rotating the final result to Minkowski time
[24]. This rotation is, on the other hand, nontrivial since we only have information about quantities 
at discrete energy values in the Euclidean space. The imaginary time formalism is not suitable to 
discuss non-equilibrium phenomena. 


4 Thermo Field Dynamics 

As we have seen, in the imaginary time formalism, the time variable is traded for the temperature. 
However, in studying various processes, it is desirable to have the time coordinate in addition to 
the temperature. Formalisms where this can be achieved are known as the real time formalisms 
and there are two distinct, but equivalent such formalisms. In this section, we will discuss the 
formalism of thermo field dynamics [3, 11, 25] returning to the alternate formalism in the next 
section. 

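Let us recall from (11) that the ensemble average of any observable is given by

$$ \langle O \rangle_\beta = \frac{1}{Z(\beta)} \mathrm{Tr}\, e^{-\beta \mathcal{H}} O = \frac{1}{Z(\beta)} \sum_n e^{-\beta E_n} \langle n | O | n \rangle \tag{37} $$

Here, we have assumed that the eigenvalues of $\mathcal{H}$ are discrete, for simplicity, and that

$$ \mathcal{H} | n \rangle = E_n | n \rangle, \qquad \langle m | n \rangle = \delta_{mn}, \qquad \sum_n | n \rangle \langle n | = 1 \tag{38} $$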

At zero temperature, we know that the Feynman diagrams correspond to vacuum expectation 
values of time ordered products. Thus, intuitively, it is clear that if we can express the ensemble 
averages as expectation values in some vacuum (say, a thermal vacuum), then, we can take over 
all the diagrammatic machinery of the zero temperature field theory. The question, therefore, is 
whether we can define a vacuum, say |0, /3), such that we can write any ensemble average as 

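$$ \langle O \rangle_\beta = \langle 0, \beta | O | 0, \beta \rangle = \frac{1}{Z(\beta)} \sum_n e^{-\beta E_n} \langle n | O | n \rangle \tag{39} $$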

Let us suppose that we can define such a thermal vacuum state as a linear superposition of the 
states in our physical Hilbert space, namely, 

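$$ | 0, \beta \rangle = \sum_n | n \rangle \langle n | 0, \beta \rangle = \sum_n f_n(\beta) | n \rangle \tag{40} $$

This would lead to

$$ \langle 0, \beta | O | 0, \beta \rangle = \sum_{n,m} f_n^*(\beta) f_m(\beta) \langle n | O | m \rangle \tag{41} $$

Consequently, this would coincide with eq. (39) only if

$$ f_n^*(\beta) f_m(\beta) = \frac{1}{Z(\beta)} e^{-\beta E_n} \delta_{nm} \tag{42} $$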





Since the $f_n$'s are ordinary numbers and eq. (42) is more like an orthonormality condition, it is clear
that we cannot satisfy this condition (and, therefore, define a thermal vacuum with the right 
properties) if we restrict ourselves to the original Hilbert space. 

On the other hand, it is also clear from this analysis that if the $f_n$'s, somehow, behave like a state
vector, then, the condition in eq. (42) can be easily satisfied. In fact, let us introduce a fictitious 
system identical to our original system and denote it by a tilde system. The states in the combined 
Hilbert space of this doubled system would have the form 

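$$ | n, \tilde{m} \rangle = | n \rangle \otimes | \tilde{m} \rangle $$

Let us assume that the thermal vacuum can be written as a linear superposition of states in this
doubled Hilbert space of the form

$$ | 0, \beta \rangle = \sum_n f_n(\beta) | n, \tilde{n} \rangle = \sum_n f_n(\beta) | n \rangle \otimes | \tilde{n} \rangle \tag{43} $$

This would lead to

$$ \begin{aligned} \langle 0, \beta | O | 0, \beta \rangle &= \sum_{n,m} f_n^*(\beta) f_m(\beta) \langle n, \tilde{n} | O | m, \tilde{m} \rangle \\ &= \sum_{n,m} f_n^*(\beta) f_m(\beta) \langle n | O | m \rangle \, \delta_{n,m} \\ &= \sum_n f_n^*(\beta) f_n(\beta) \langle n | O | n \rangle \end{aligned} \tag{44} $$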

In deriving this result, we have used the fact that an operator of the original system does not 
act on states of the tilde system and vice versa. The result in eq. (44) is quite interesting because 
it says that if we choose 

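$$ f_n^*(\beta) f_n(\beta) = \frac{e^{-\beta E_n}}{Z(\beta)} \qquad \text{or,} \qquad f_n(\beta) = f_n^*(\beta) = \frac{e^{-\beta E_n / 2}}{Z^{1/2}(\beta)} \tag{45} $$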

then, eq. (44) would, indeed, coincide with the ensemble average in eq. (39). 

This analysis shows that it is possible to introduce a thermal vacuum such that the ensemble 
average of any operator can be written as the expectation value of the operator in the thermal 
vacuum. The price one has to pay is that the Hilbert space needs to be doubled. The advantage, 
on the other hand, lies in the fact that the description would now involve both time and temper- 
ature (since we have not traded time for temperature) and all the diagrammatic methods of zero 
temperature field theory can now be taken over directly. 
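
The construction in eqs. (43)-(45) can be tried out on a finite dimensional example. The sketch below is illustrative only: the three-level spectrum and the operator matrix `op` are hypothetical choices. The thermal vacuum is stored as a table of coefficients on the doubled basis $|n, \tilde{m}\rangle$, and an operator of the original system acts on the first index only.

```python
import math


def ensemble_average(op, energies, beta):
    """(1/Z) sum_n e^{-beta E_n} <n|O|n>, with O given in the energy eigenbasis."""
    z = sum(math.exp(-beta * e) for e in energies)
    return sum(math.exp(-beta * e) * op[n][n] for n, e in enumerate(energies)) / z


def thermal_vacuum(energies, beta):
    """Coefficients of |0,beta> on |n, m~>; only the n == m entries are nonzero."""
    z = sum(math.exp(-beta * e) for e in energies)
    d = len(energies)
    return [[math.exp(-beta * energies[n] / 2.0) / math.sqrt(z) if n == m else 0.0
             for m in range(d)] for n in range(d)]


def thermal_expectation(op, vac):
    """<0,beta| O (x) 1 |0,beta>: the tilde index is simply contracted."""
    d = len(vac)
    return sum(vac[n][k] * op[n][m] * vac[m][k]
               for n in range(d) for m in range(d) for k in range(d))


# Illustrative (hypothetical) spectrum and observable with off-diagonal entries.
energies = [0.0, 0.6, 1.5]
op = [[1.0, 0.3, -0.2], [0.3, -2.0, 0.7], [-0.2, 0.7, 0.5]]
beta = 1.1
vac = thermal_vacuum(energies, beta)
```

The off-diagonal entries of `op` drop out of `thermal_expectation` because the tilde states are orthogonal, which is exactly the role played by $\delta_{n,m}$ in eq. (44).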

Fermionic Oscillator 

To get a flavor for things in this formalism, let us analyze in some detail the simple quantum 
mechanical system of the fermionic oscillator. The Hamiltonian for the system is given by ($\hbar = 1$)

$$ H = \omega a^\dagger a \tag{46} $$

Here, the fermionic creation and annihilation operators satisfy the canonical anti-commutation
relations

$$ [a, a^\dagger]_+ = 1, \qquad [a, a]_+ = [a^\dagger, a^\dagger]_+ = 0 \tag{47} $$

In this case, the spectrum of the Hamiltonian is quite simple and the Hilbert space is two dimensional
with the basis states given by $|0\rangle$ and $|1\rangle = a^\dagger |0\rangle$.





According to the general philosophy of thermo field dynamics, we are supposed to introduce a 
fictitious tilde system which is identical to our original system. Thus, we define 

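$$ \tilde{H} = \omega \tilde{a}^\dagger \tilde{a} \tag{48} $$

with the anti-commutation relations

$$ [\tilde{a}, \tilde{a}^\dagger]_+ = 1, \qquad [\tilde{a}, \tilde{a}]_+ = [\tilde{a}^\dagger, \tilde{a}^\dagger]_+ = 0 \tag{49} $$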

Furthermore, we assume that the creation and the annihilation operators for the tilde and the 
non-tilde systems anti-commute. 

The Hilbert space for the combined space is now four dimensional and, following our earlier 
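discussion, we choose the thermal vacuum to be

$$ | 0, \beta \rangle = f_0(\beta) | 0 \rangle \otimes | \tilde{0} \rangle + f_1(\beta) | 1 \rangle \otimes | \tilde{1} \rangle \tag{50} $$

The normalization of the thermal vacuum gives

$$ \langle 0, \beta | 0, \beta \rangle = |f_0(\beta)|^2 + |f_1(\beta)|^2 = 1 \tag{51} $$

while the expectation value of the number operator gives

$$ \langle 0, \beta | N | 0, \beta \rangle = \langle 0, \beta | a^\dagger a | 0, \beta \rangle = |f_1(\beta)|^2 = \frac{1}{e^{\beta\omega} + 1} \tag{52} $$

From these, we can obtain

$$ f_0(\beta) = \frac{1}{\sqrt{1 + e^{-\beta\omega}}}, \qquad f_1(\beta) = \frac{e^{-\beta\omega/2}}{\sqrt{1 + e^{-\beta\omega}}} \tag{53} $$

so that we can write

$$ | 0, \beta \rangle = \frac{1}{\sqrt{1 + e^{-\beta\omega}}} \left( | 0 \rangle \otimes | \tilde{0} \rangle + e^{-\beta\omega/2} | 1 \rangle \otimes | \tilde{1} \rangle \right) \tag{54} $$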
To further understand the properties of this system, let us note that we can define a Hermitian 
operator in this doubled space 

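$$ G(\theta) = -i\theta(\beta) \left( \tilde{a} a - a^\dagger \tilde{a}^\dagger \right) \tag{55} $$

This would, in turn, lead to a formally unitary operator

$$ U(\beta) = e^{-iG(\theta)} \tag{56} $$

which would connect the thermal vacuum to the vacuum of the doubled space, namely,

$$ U(\beta) | 0, 0 \rangle = \cos\theta(\beta) | 0, 0 \rangle + \sin\theta(\beta) | 1, \tilde{1} \rangle = | 0, \beta \rangle \tag{57} $$

provided

$$ \cos\theta(\beta) = f_0(\beta) = \frac{1}{\sqrt{1 + e^{-\beta\omega}}}, \qquad \sin\theta(\beta) = f_1(\beta) = \frac{e^{-\beta\omega/2}}{\sqrt{1 + e^{-\beta\omega}}} \tag{58} $$

The unitary operator would also induce a transformation on the operators of the form

$$ O(\beta) = U(\beta) \, O \, U^\dagger(\beta) \tag{59} $$

In particular, this would give

$$ \begin{aligned} a(\beta) &= \cos\theta(\beta) \, a - \sin\theta(\beta) \, \tilde{a}^\dagger \\ \tilde{a}(\beta) &= \cos\theta(\beta) \, \tilde{a} + \sin\theta(\beta) \, a^\dagger \end{aligned} \tag{60} $$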




as well as their Hermitian conjugates. These operators would satisfy the same anti-commutation 
relations as the original ones and we can think of them as the thermal creation and annihilation 
operators. Consequently, we can build up the thermal Hilbert space starting from |0, /?) and the 
thermal creation operators. 

In particular, it is trivial to check, using (57) that the thermal vacuum satisfies 

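$$ \begin{aligned} a(\beta) | 0, \beta \rangle &= \left( \cos\theta(\beta) \, a - \sin\theta(\beta) \, \tilde{a}^\dagger \right) | 0, \beta \rangle = 0 \\ \tilde{a}(\beta) | 0, \beta \rangle &= \left( \cos\theta(\beta) \, \tilde{a} + \sin\theta(\beta) \, a^\dagger \right) | 0, \beta \rangle = 0 \end{aligned} \tag{61} $$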

This is quite interesting for it says that annihilating a particle in the thermal vacuum is equivalent 
to creating a tilde particle and vice versa. Consequently, we can intuitively think of the tilde 
particles as kind of hole states of the particles or particle states of the heat bath. This gives a nice 
intuitive meaning to the doubling of the degrees of freedom in thermo field dynamics. Namely, an 
isolated system in thermal equilibrium really consists of two components - the original system and 
the heat bath. 
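
The statements in eqs. (52) and (61) can be verified explicitly on the four dimensional doubled space. The sketch below is one possible representation, not the only one: states are dictionaries keyed by the occupation numbers `(n, nt)` of the original and tilde oscillators, and the sign factor `(-1) ** n` on the tilde operators is an assumed (Jordan-Wigner type) convention that makes tilde and non-tilde operators anti-commute, with $|1, \tilde{1}\rangle = a^\dagger \tilde{a}^\dagger |0, 0\rangle$.

```python
import math


def apply(op, state):
    """Apply 'a', 'ad' (a-dagger), 'at' (a-tilde) or 'atd' (a-tilde-dagger)."""
    out = {}
    for (n, nt), amp in state.items():
        if op == "a" and n == 1:
            key, sign = (0, nt), 1
        elif op == "ad" and n == 0:
            key, sign = (1, nt), 1
        elif op == "at" and nt == 1:
            key, sign = (n, 0), (-1) ** n
        elif op == "atd" and nt == 0:
            key, sign = (n, 1), (-1) ** n
        else:
            continue                      # the operator annihilates this basis state
        out[key] = out.get(key, 0.0) + sign * amp
    return out


def combine(c1, s1, c2, s2):
    """Linear combination c1 * s1 + c2 * s2 of two states."""
    return {k: c1 * s1.get(k, 0.0) + c2 * s2.get(k, 0.0) for k in set(s1) | set(s2)}


def inner(s1, s2):
    """Inner product of two states with real amplitudes."""
    return sum(amp * s2.get(k, 0.0) for k, amp in s1.items())


def thermal_vacuum(beta, omega):
    """Eq. (54); returns the state together with cos(theta) and sin(theta)."""
    norm = math.sqrt(1.0 + math.exp(-beta * omega))
    c, s = 1.0 / norm, math.exp(-beta * omega / 2.0) / norm
    return {(0, 0): c, (1, 1): s}, c, s


beta, omega = 0.9, 1.4
vac, c, s = thermal_vacuum(beta, omega)
a_beta_vac = combine(c, apply("a", vac), -s, apply("atd", vac))    # a(beta)|0,beta>
at_beta_vac = combine(c, apply("at", vac), s, apply("ad", vac))    # a~(beta)|0,beta>
number = inner(vac, apply("ad", apply("a", vac)))                  # <a^dagger a>
```

Both `a_beta_vac` and `at_beta_vac` have vanishing amplitudes, and `number` reproduces the Fermi-Dirac distribution of eq. (52).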

We also note here that although the operator connecting the thermal vacuum to the vacuum 
in the doubled space is formally unitary, it is more like a Bogoliubov transformation. In more 
complicated models with an infinite number of degrees of freedom (namely, in field theories) such 
an operator takes us to a unitarily inequivalent Hilbert space. Let us also note here, for future use, 
the simple formula following from eq. (60) that 

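$$ \begin{pmatrix} a(\beta) \\ \tilde{a}^\dagger(\beta) \end{pmatrix} = \overline{U}(\beta) \begin{pmatrix} a \\ \tilde{a}^\dagger \end{pmatrix} \tag{62} $$

where

$$ \overline{U}(\beta) = \begin{pmatrix} \cos\theta(\beta) & -\sin\theta(\beta) \\ \sin\theta(\beta) & \cos\theta(\beta) \end{pmatrix} \tag{63} $$

Finally, let us conclude the discussion of this example by noting that the states in the thermal
Hilbert space are eigenstates of neither $H$ nor $\tilde{H}$. Rather, they are the eigenstates of the operator

$$ \hat{H} = H - \tilde{H} \tag{64} $$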

Furthermore, this combination of the Hamiltonians is also invariant under the unitary transfor- 
mation of (57). This is, indeed, the Hamiltonian that governs the dynamics of the combined 
system. 

Bosonic Oscillator 

The analysis for the case of the bosonic oscillator is quite analogous to the discussion of the 
fermionic oscillator. Therefore, without going into too much detail, let us summarize the results. 
First, the Hamiltonian for the system is given by 

$$ H = \omega a^\dagger a \tag{65} $$

much like the fermionic oscillator. However, the creation and annihilation operators satisfy canonical
commutation relations of the form

$$ [a, a^\dagger] = 1, \qquad [a, a] = [a^\dagger, a^\dagger] = 0 \tag{66} $$

The Hilbert space for the bosonic oscillator is infinite dimensional with the energy eigenstates given
by

$$ H | n \rangle = n\omega | n \rangle, \qquad n = 0, 1, 2, \cdots \tag{67} $$

According to the general discussions of thermo field dynamics, we introduce an identical, but 
fictitious tilde system with the Hamiltonian 


$$ \tilde{H} = \omega \tilde{a}^\dagger \tilde{a} \tag{68} $$





The tilde creation and annihilation operators are expected to satisfy commutation relations analo- 
gous to (66). Furthermore, the tilde operators are supposed to commute with the original operators 
of the theory. 

Following the discussion of the earlier section, we can determine the thermal vacuum state in 
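this case to be

$$ | 0, \beta \rangle = \left( 1 - e^{-\beta\omega} \right)^{1/2} \sum_{n=0}^{\infty} e^{-n\beta\omega/2} \, | n, \tilde{n} \rangle \tag{69} $$

As in the fermionic oscillator, we can introduce the Hermitian operator

$$ G(\theta) = -i\theta(\beta) \left( \tilde{a} a - a^\dagger \tilde{a}^\dagger \right) \tag{70} $$

and the unitary operator

$$ U(\beta) = e^{-iG(\theta)} \tag{71} $$

Then, it is straightforward to check that the unitary operator connects the thermal vacuum
to the vacuum of the doubled space provided

$$ \cosh\theta(\beta) = \frac{1}{\sqrt{1 - e^{-\beta\omega}}}, \qquad \sinh\theta(\beta) = \frac{e^{-\beta\omega/2}}{\sqrt{1 - e^{-\beta\omega}}} \tag{72} $$

The unitary operator induces a transformation of the operators of the form

$$ O(\beta) = U(\beta) \, O \, U^\dagger(\beta) \tag{73} $$

leading to

$$ \begin{aligned} a(\beta) &= \cosh\theta(\beta) \, a - \sinh\theta(\beta) \, \tilde{a}^\dagger \\ \tilde{a}(\beta) &= \cosh\theta(\beta) \, \tilde{a} - \sinh\theta(\beta) \, a^\dagger \end{aligned} $$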

and similarly for the Hermitian conjugates. As we have seen in the last section, these can be 
thought of as the creation and annihilation operators for the thermal Hilbert space. In particular, 
the thermal vacuum is easily seen to satisfy 

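$$ \begin{aligned} a(\beta) | 0, \beta \rangle &= \left( \cosh\theta(\beta) \, a - \sinh\theta(\beta) \, \tilde{a}^\dagger \right) | 0, \beta \rangle = 0 \\ \tilde{a}(\beta) | 0, \beta \rangle &= \left( \cosh\theta(\beta) \, \tilde{a} - \sinh\theta(\beta) \, a^\dagger \right) | 0, \beta \rangle = 0 \end{aligned} \tag{74} $$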

This, again, reinforces the intuitive picture of doubling in thermo field dynamics. Let us also note 
here, for future use, the simple formula following from (73) 

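$$ \begin{pmatrix} a(\beta) \\ \tilde{a}^\dagger(\beta) \end{pmatrix} = \overline{U}(\beta) \begin{pmatrix} a \\ \tilde{a}^\dagger \end{pmatrix} \tag{75} $$

where

$$ \overline{U}(\beta) = \begin{pmatrix} \cosh\theta(\beta) & -\sinh\theta(\beta) \\ -\sinh\theta(\beta) & \cosh\theta(\beta) \end{pmatrix} \tag{76} $$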
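
For the bosonic oscillator, the thermal vacuum of eq. (69) can be examined with a truncated number basis. This is a minimal sketch: the cutoff `nmax` is an assumption that is harmless as long as $e^{-n\beta\omega}$ is negligible at the cutoff, and the function names are illustrative. It checks the normalization, the Bose-Einstein occupation number, and the first line of eq. (74) component by component.

```python
import math


def thermal_coefficients(beta, omega, nmax=200):
    """Coefficients f_n of |n, n~> in eq. (69), truncated at nmax states."""
    x = math.exp(-beta * omega)
    return [math.sqrt(1.0 - x) * math.exp(-n * beta * omega / 2.0) for n in range(nmax)]


def norm_and_number(beta, omega, nmax=200):
    """<0,beta|0,beta> and <0,beta|a^dagger a|0,beta> from the truncated sum."""
    f = thermal_coefficients(beta, omega, nmax)
    return sum(c * c for c in f), sum(n * c * c for n, c in enumerate(f))


def annihilation_residual(beta, omega, nmax=200):
    """Largest coefficient of (cosh a - sinh a~^dagger)|0,beta> on |n, (n+1)~>."""
    f = thermal_coefficients(beta, omega, nmax)
    x = math.exp(-beta * omega)
    cosh_t = 1.0 / math.sqrt(1.0 - x)             # eq. (72)
    sinh_t = math.sqrt(x) / math.sqrt(1.0 - x)    # eq. (72)
    # a lowers |n+1, (n+1)~> and a~^dagger raises |n, n~>; both land on |n, (n+1)~>.
    return max(abs(math.sqrt(n + 1) * (cosh_t * f[n + 1] - sinh_t * f[n]))
               for n in range(nmax - 1))
```

The cancellation works because successive coefficients satisfy $f_{n+1}/f_n = e^{-\beta\omega/2} = \tanh\theta(\beta)$.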

Field Theory 

The extension of these results to a field theory is quite straightforward once we keep in mind 
that, at the free level, a quantum field theory is simply an infinite collection of oscillators with 
frequencies dependent on the momentum of the mode. Consequently, the thermal vacuum, in this 
case, would be connected to the vacuum of the doubled space as 

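$$ | 0, \beta \rangle = U(\beta) | 0, 0 \rangle = e^{-iG(\theta)} | 0, 0 \rangle \tag{77} $$

where

$$ G(\theta) = -i \sum_{\vec{k}} \theta_{\vec{k}}(\beta) \left( \tilde{a}_{\vec{k}} a_{\vec{k}} - a_{\vec{k}}^\dagger \tilde{a}_{\vec{k}}^\dagger \right) \tag{78} $$

with, say, for bosons,

$$ \cosh\theta_{\vec{k}}(\beta) = \frac{1}{\sqrt{1 - e^{-\beta\omega_k}}}, \qquad \sinh\theta_{\vec{k}}(\beta) = \frac{e^{-\beta\omega_k/2}}{\sqrt{1 - e^{-\beta\omega_k}}} \tag{79} $$

Here, for a relativistic theory, we have

$$ \omega_k = \sqrt{\vec{k}^2 + m^2} $$

Let us next note that, at zero temperature, the original fields are decoupled from the tilde
fields. Thus, if we were to define a doublet of fields (real scalar field) as in eq. (75)

$$ \Phi = \begin{pmatrix} \phi \\ \tilde{\phi} \end{pmatrix} \tag{80} $$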


then, at zero temperature the propagator is defined to be (This is not to be confused with the 
generator of the Bogoliubov transformations in eqs. (55), (70) and (78)) 

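$$ iG(x - y) = \langle 0, 0 | T \left( \Phi(x) \Phi^T(y) \right) | 0, 0 \rangle $$

which has the momentum space representation

$$ G(k) = \begin{pmatrix} \dfrac{1}{k^2 - m^2 + i\epsilon} & 0 \\ 0 & -\dfrac{1}{k^2 - m^2 - i\epsilon} \end{pmatrix} \tag{81} $$

Given this, the finite temperature propagator can be determined to be

$$ iG_\beta(x - y) = \langle 0, \beta | T \left( \Phi(x) \Phi^T(y) \right) | 0, \beta \rangle \tag{82} $$

Using now the generalization of eqs. (73), (75) and (76), the momentum representation for the propagator
can be determined to be

$$ \begin{aligned} G_\beta(k) &= \overline{U}(-\theta_k) \, G(k) \, \overline{U}^T(-\theta_k) \\ &= \begin{pmatrix} \dfrac{1}{k^2 - m^2 + i\epsilon} & 0 \\ 0 & -\dfrac{1}{k^2 - m^2 - i\epsilon} \end{pmatrix} - 2i\pi \, n_B(|k^0|) \, \delta(k^2 - m^2) \begin{pmatrix} 1 & e^{\beta |k^0| / 2} \\ e^{\beta |k^0| / 2} & 1 \end{pmatrix} \end{aligned} \tag{83} $$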


There are several things to note from the structure of the propagator which are quite general 
for a real time formalism. First, the propagator is a 2 x 2 matrix, a consequence of the doubling 
of the degrees of freedom. Second, the propagator is a sum of two parts - one representing the 
zero temperature part and the other representing the true temperature dependent corrections. 
The propagator is still the Greens function for the free operator of the theory, but corresponding 
to different boundary conditions (remember the KMS condition in eq. (12)). While the zero 
temperature part of the propagator corresponds, as usual, to the exchange of a virtual particle, the 
temperature dependent part represents an on-shell contribution (because of the delta function). In 
fact, the intuitive meaning of the temperature dependent correction is quite clear. In a hot medium, 
there is a distribution of real particles and the temperature dependent part merely represents the 
possibility that a particle, in addition to having virtual exchanges, can also emit or absorb a real 
particle of the medium. 
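
The matrix structure of eq. (83) can be checked with a few lines of arithmetic. This sketch keeps $\epsilon$ finite, so that the difference between the two zero temperature entries, $D - D^*$ with $D = 1/(k^2 - m^2 + i\epsilon)$, stands in for $-2i\pi\,\delta(k^2 - m^2)$. That regularization, the use of $|k^0|$ as the mode frequency inside the rotation angle, and the sample momenta are assumptions made only for the illustration.

```python
import math


def matmul2(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def zero_temperature_propagator(k0, kvec_sq, m, eps):
    """Eq. (81) with a finite epsilon."""
    d = 1.0 / complex(k0 * k0 - kvec_sq - m * m, eps)
    return [[d, 0j], [0j, -d.conjugate()]]


def thermal_propagator(k0, kvec_sq, m, beta, eps):
    """First line of eq. (83): rotate G(k) with U-bar(-theta) from eq. (76)."""
    x = math.exp(-beta * abs(k0))
    cosh_t = 1.0 / math.sqrt(1.0 - x)
    sinh_t = math.sqrt(x) / math.sqrt(1.0 - x)
    u = [[cosh_t, sinh_t], [sinh_t, cosh_t]]   # symmetric, so it equals its transpose
    g = zero_temperature_propagator(k0, kvec_sq, m, eps)
    return matmul2(matmul2(u, g), u)


def expected_form(k0, kvec_sq, m, beta, eps):
    """Second line of eq. (83), with D - D* in place of -2 i pi delta(k^2 - m^2)."""
    g = zero_temperature_propagator(k0, kvec_sq, m, eps)
    jump = g[0][0] - g[0][0].conjugate()
    n_b = 1.0 / math.expm1(beta * abs(k0))             # Bose distribution
    off = math.exp(beta * abs(k0) / 2.0) * n_b
    return [[g[0][0] + n_b * jump, off * jump],
            [off * jump, g[1][1] + n_b * jump]]
```

The agreement rests on the identities $\sinh^2\theta = n_B$ and $\sinh\theta\cosh\theta = e^{\beta|k^0|/2}\, n_B$, which follow from eq. (79).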

Since the temperature dependent part of the propagator is on-shell, it is clear that there can 
be no new ultraviolet divergence generated at finite temperature. All the counter terms needed 
to renormalize the theory at zero temperature would be sufficient for studies at finite temperature 
as well. (Of course, the infrared behavior is another story. Infrared divergence, in a field theory, 
becomes much more severe at finite temperature, a topic that I will not get into.) There is an 
alternate way to visualize this. At finite temperature the distribution of the real particles is 





Boltzmann suppressed as we go up in energy and, consequently, thermal corrections corresponding 
to infinite energy cannot arise. 

Once we have the propagator, we can venture to do a diagrammatic calculation in this formalism.
The only things missing are the interaction vertices of the theory. There is a well defined
procedure [26] (called the tilde conjugation rule) to construct the complete Lagrangian from which 
to construct the vertices. Very simply, it corresponds to what we have noted earlier, namely, the 
dynamical Hamiltonian (and, similarly, the Lagrangian) is as given in eq. (64). It is simply the 
difference between the original and the tilde Hamiltonians. Thus, we see that the complete theory 
would contain two kinds of vertices - one for the original fields and the other for the tilde fields.
The vertices for the tilde fields will have a relative negative sign corresponding to the original 
vertices. Given the vertices and the propagator, it is now straightforward to carry out any dia- 
grammatic calculation to any order. Let me emphasize here that although, at the tree level, there 
is no vertex containing both the original and the tilde fields, such vertices would be generated at 
higher loops because of the nontrivial matrix structure of the propagator. 

Thermo Field dynamics is a real time formalism. But, more than that, it is really an operator 
formalism and hence very well suited to study various operator questions such as the structure of 
the thermal vacuum, the theorems on symmetry breaking etc. It can also be given a path integral 
representation and corresponds to choosing a specific time contour in the complex t plane [27, 
10] (remember that the imaginary time formalism also corresponds to choosing a specific time 
contour, namely, along the imaginary time axis) and I will come back to this question in the next 
section. However, once again from the philosophy of thermo field dynamics, it is clear that, it 
is a natural formalism to describe equilibrium phenomena where quantities depend on both time 
and temperature. While there are several attempts to generalize this to include non-equilibrium 
phenomena, there does not yet exist a complete description. 


5 Closed Time Path Formalism 

The closed time path formalism is also a real time formalism which was formulated much earlier 
than thermo field dynamics within the context of non-equilibrium phenomena [2]. The two for- 
malisms are, in some sense, complementary to each other although the closed time path formalism 
can describe both equilibrium and non-equilibrium phenomena with equal ease. 

The basic idea behind the closed time path formalism [10] is the fact that when a quantum 
mechanical system is in a mixed state, as is the case in the presence of a heat bath, the system 
can be naturally described in terms of a density matrix defined, in the Schrödinger picture, as

$$\rho(t) = \sum_n p_n\,|\psi_n(t)\rangle\langle\psi_n(t)| \tag{84}$$

Here, $p_n$ represents the probability for finding the quantum mechanical system in the state $|\psi_n(t)\rangle$
and, for simplicity, we have assumed the quantum mechanical states to form a discrete set. It is
$p_n$ which contains information regarding the surroundings which is hard to determine, but, being a
probability, it satisfies

$$\sum_n p_n = 1$$

Given the density matrix, the ensemble average of any operator can be calculated in the 
Schrödinger picture as

$$\langle O\rangle(t) = \sum_n p_n\,\langle\psi_n(t)|O|\psi_n(t)\rangle = \mathrm{Tr}\,\rho(t)\,O \tag{85}$$

The ensemble average, in this case, naturally develops a time dependence from the time dependence
of the density matrix. In this formalism, we can naturally define an entropy as

$$S = -\sum_n p_n \ln p_n$$

which is by definition positive semi-definite and measures the order (or lack of it) in an ensemble.
The state vectors satisfy the Schrödinger equation ($\hbar = 1$)

$$i\,\frac{\partial|\psi_n(t)\rangle}{\partial t} = H\,|\psi_n(t)\rangle$$

From this, we can determine the time evolution of the density matrix which turns out to be the
Liouville equation

$$i\,\frac{\partial\rho(t)}{\partial t} = [H,\rho(t)] \tag{86}$$

In deriving this, we have assumed that the probabilities do not change with time (appreciably) 
implying that entropy remains constant during such an evolution. The reason for this assumption 
is our lack of knowledge about the time evolution of the surrounding such as the heat bath. On 
the other hand, adiabatic evolutions do arise frequently in physical systems and, consequently, we 
would continue with this assumption. 

Let us note that eq. (86) has a simple solution of the form 


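$$\rho(t) = U(t,0)\,\rho(0)\,U^\dagger(t,0) = U(t,0)\,\rho(0)\,U(0,t) \tag{87}$$

where the time evolution operator has the general form (a time ordered exponential)

$$U(t,t') = T\left(e^{-i\int_{t'}^{t}dt''\,H(t'')}\right) \tag{88}$$

Furthermore, it satisfies the semi-group properties

$$U(t_1,t_2)\,U(t_2,t_1) = 1, \qquad U(t_1,t_2)\,U(t_2,t_3) = U(t_1,t_3)\quad \text{for } t_1>t_2>t_3 \tag{89}$$

In particular, let us note that if the Hamiltonian is time independent, eq. (87) takes the simple
form

$$\rho(t) = e^{-iHt}\,\rho(0)\,e^{iHt}$$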

and, furthermore, if the Hamiltonian commutes with p(0), the density matrix would be time 
independent, describing a system in equilibrium. This would be true, for example, if the states in 
eq. (84) are stationary states. This is also true if the probabilities have a Boltzmann distribution 
in which case, we refer to the system as being in thermal equilibrium. However, we will not 
restrict to any such special case allowing for the formalism to accommodate both equilibrium and 
non-equilibrium phenomena. 
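
As a small numerical illustration of these statements (a sketch, not part of the original derivation), the Python snippet below builds a toy 4 x 4 Hermitian Hamiltonian, evolves two density matrices with $\rho(t) = e^{-iHt}\rho(0)e^{iHt}$, and compares them with their initial values. The random Hamiltonian, the value of $\beta$ and the helper names are assumptions chosen purely for illustration.

```python
import numpy as np

def thermal_density_matrix(H, beta):
    """rho = exp(-beta H) / Tr exp(-beta H) for a Hermitian matrix H."""
    evals, evecs = np.linalg.eigh(H)
    weights = np.exp(-beta * (evals - evals.min()))  # shift for numerical stability
    weights /= weights.sum()
    return (evecs * weights) @ evecs.conj().T

def evolve(rho0, H, t):
    """rho(t) = exp(-iHt) rho(0) exp(+iHt) for a time-independent H."""
    evals, evecs = np.linalg.eigh(H)
    U = (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T
    return U @ rho0 @ U.conj().T

def entropy(rho):
    """S = -sum_n p_n ln p_n, computed from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2                      # toy Hermitian Hamiltonian (assumption)

rho_thermal = thermal_density_matrix(H, beta=0.7)           # commutes with H
rho_mixed = np.diag([0.4, 0.3, 0.2, 0.1]).astype(complex)   # generic mixed state

# Distance between rho(t) and rho(0) at t = 2
drift_thermal = np.linalg.norm(evolve(rho_thermal, H, 2.0) - rho_thermal)
drift_mixed = np.linalg.norm(evolve(rho_mixed, H, 2.0) - rho_mixed)
```

In this sketch `drift_thermal` vanishes up to rounding, while `drift_mixed` does not; the entropy of either state is unchanged by the unitary evolution, in line with the constant-probability assumption made above.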

Keeping in mind the fact that we are ultimately interested in a thermal ensemble, let us choose 

$$\rho(0) = \frac{e^{-\beta H_i}}{\mathrm{Tr}\,e^{-\beta H_i}} \tag{90}$$

for some $H_i$. Since the density matrix is a positive Hermitian matrix with unit trace, mathematically,
this is allowed. But, more important is the physical reason behind such a choice. Namely,
we can think of the dynamical Hamiltonian of our system as

$$H = \begin{cases} H_i & \text{for } \mathrm{Re}\,t < 0\\ \mathcal{H}(t) & \text{for } \mathrm{Re}\,t > 0 \end{cases} \tag{91}$$

This would correspond to the fact that we prepare our system in an equilibrium state at temperature
$1/\beta$ for negative times and let the system evolve, for positive times, with the true Hamiltonian $\mathcal{H}$
which may be time dependent. If $\mathcal{H}(t) = H_i$, then the system will evolve in equilibrium and not
otherwise.

With eq. (91) in mind, we note that we can write 


$$\rho(0) = \frac{U(T-i\beta,T)}{\mathrm{Tr}\,U(T-i\beta,T)} \tag{92}$$

where $T$ is assumed to be a large negative time (and not the temperature) and $T \to -\infty$ at the
end. Using the semi-group properties of the time evolution operator, it is easy to see that the
ensemble average of any operator can now be represented as

$$\begin{aligned}
\langle O\rangle(t) &= \mathrm{Tr}\,\rho(t)\,O\\
&= \frac{\mathrm{Tr}\,U(t,0)\,U(T-i\beta,T)\,U(0,t)\,O}{\mathrm{Tr}\,U(T-i\beta,T)}\\
&= \frac{\mathrm{Tr}\,U(T-i\beta,T)\,U(T,T')\,U(T',t)\,O\,U(t,T)}{\mathrm{Tr}\,U(T-i\beta,T)\,U(T,T')\,U(T',T)}
\end{aligned} \tag{93}$$


where we have introduced a large positive time $T'$ and assume that $T' \to \infty$ at the end. This gives
a nice representation to the ensemble average of any operator. Namely, we let the system evolve
from a large negative time $T$ to $t$ where the appropriate operator $O$ is inserted. The system then
evolves from $t$ to a large positive time $T'$ and back from $T'$ to $T$ and then continues evolving
along the imaginary branch from $T$ to $T - i\beta$. Since the matrix elements of the time evolution
operator can be given a path integral representation, it is clear that the ensemble average of any
operator can also be given a path integral representation in this formalism corresponding to the
specific contour in the complex time plane as described above. Although the specific contour has
three branches - one along the real axis increasing with time, the second also along the real axis
decreasing with time and the third along the negative imaginary axis - in the limit $T \to -\infty$ and
$T' \to \infty$, it can be shown that the third branch gets decoupled from the other two (the factors in
the propagators connecting such branches are asymptotically damped). Consequently, in this limit,
we are effectively dealing with two branches leading to the name “closed time path formalism” [12].
In this contour, then, the time integration has to be thought of as 

$$\int_c dt = \int_{-\infty}^{\infty} dt_+ - \int_{-\infty}^{\infty} dt_- \tag{94}$$

where the relative negative sign arises because time is decreasing in the second branch of the time 
contour. 

The doubling of the degrees of freedom, in this formalism, is now clear. To have a path integral 
description, we must specify the fields on both the branches of the contour. Or, equivalently, we 
can use just the positive branch and double the field degrees of freedom. Namely, corresponding 
to every original field, say $\phi_+$, we must introduce a second field $\phi_-$ and remember that the action
for the $\phi_-$ fields must have a relative negative sign arising from eq. (94), namely, that time is
decreasing along the second branch. 


Scalar Field Theory 

Just as an example, let us study next the self-interacting scalar field theory in some detail. The 
Lagrangian density is the same as in eq. (30), but following the earlier discussion, we should take 
the complete Lagrangian density for the system to be 

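$$\mathcal{L} = \mathcal{L}(\phi_+) - \mathcal{L}(\phi_-) \tag{95}$$

where

$$\mathcal{L}(\phi) = \frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\,\phi^2 - \frac{\lambda}{4!}\,\phi^4, \qquad \lambda > 0 \tag{96}$$

The Feynman propagator can again be determined for this theory and would have a 2 x 2 matrix
structure because of the doubling of the field degrees of freedom. It can be determined subject to
compatibility with the KMS conditions and has the form in the momentum space

$$G(k) = \begin{pmatrix} G_{++}(k) & G_{+-}(k)\\ G_{-+}(k) & G_{--}(k) \end{pmatrix} \tag{97}$$

with

$$\begin{aligned}
G_{++}(k) &= \frac{1}{k^2-m^2+i\epsilon} - 2i\pi\,n_B(|k^0|)\,\delta(k^2-m^2)\\
G_{+-}(k) &= -2i\pi\left(\theta(-k^0)+n_B(|k^0|)\right)\delta(k^2-m^2)\\
G_{-+}(k) &= -2i\pi\left(\theta(k^0)+n_B(|k^0|)\right)\delta(k^2-m^2)\\
G_{--}(k) &= -\frac{1}{k^2-m^2-i\epsilon} - 2i\pi\,n_B(|k^0|)\,\delta(k^2-m^2)
\end{aligned} \tag{98}$$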

There are several things to note from the structure of this propagator. First, as in the case of 
the propagator in thermo field dynamics, here, too, we see that the propagator naturally is a sum 
of two parts - the temperature independent part and the temperature dependent part. But, more 
interestingly, here the propagator has the simplification that the temperature dependent part of 
every component is the same which leads to various simplifications in actual studies of thermal 
quantities. Furthermore, not all the components of the propagator are independent. In fact, it is 
easily seen that (this can be traced back to their definition) 

$$G_{++}(k) + G_{--}(k) = G_{+-}(k) + G_{-+}(k)$$


These are known as the causal propagators of the theory and are useful in diagrammatic evaluation.
There is, of course, another kind of propagator, conventionally known as the physical propagator,
which is defined as

$$\hat G(k) = \begin{pmatrix} 0 & G_A(k)\\ G_R(k) & G_c(k) \end{pmatrix} \tag{99}$$

where $G_A$, $G_R$ and $G_c$ are known as the advanced, retarded and the correlated Green's functions.
These are quite useful in the study of various phenomena such as the linear response theory. The
important thing to observe is that the causal and the physical propagators are connected through
a unitary transformation

$$\hat G(k) = Q\,G(k)\,Q^\dagger \tag{100}$$

where

$$Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1\\ 1 & 1 \end{pmatrix} \tag{101}$$

It can be determined from this that, at the tree level,

$$\begin{aligned}
G_A(k) &= \frac{1}{k^2-m^2-i\epsilon k^0}\\
G_R(k) &= \frac{1}{k^2-m^2+i\epsilon k^0}\\
G_c(k) &= -2i\pi\left(1+2n_B(|k^0|)\right)\delta(k^2-m^2)
\end{aligned} \tag{102}$$

as they should be.
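
The rotation from the causal to the physical propagators can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes $Q = \frac{1}{\sqrt 2}\begin{pmatrix}1 & -1\\ 1 & 1\end{pmatrix}$, replaces $\delta(k^2-m^2)$ by its finite-$\epsilon$ representation $\epsilon/[\pi((k^2-m^2)^2+\epsilon^2)]$, and uses $\epsilon\,\mathrm{sgn}(k^0)$ in place of $\epsilon k^0$ (only the sign matters as $\epsilon \to 0$). The function names and sample momenta are hypothetical.

```python
import numpy as np

Q = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2.0)   # assumed form of Q

def _pieces(k0, kvec2, m, beta, eps):
    x = k0**2 - kvec2 - m**2                      # k^2 - m^2
    delta = eps / (np.pi * (x**2 + eps**2))       # regulated delta(k^2 - m^2)
    nB = 1.0 / np.expm1(beta * abs(k0))           # Bose distribution n_B(|k0|)
    return x, delta, nB

def causal_propagator(k0, kvec2, m, beta, eps):
    """2x2 causal propagator with the entries of eq. (98) at finite eps."""
    x, delta, nB = _pieces(k0, kvec2, m, beta, eps)
    theta = lambda s: 1.0 if s > 0 else 0.0
    Gpp = 1.0 / (x + 1j * eps) - 2j * np.pi * nB * delta
    Gpm = -2j * np.pi * (theta(-k0) + nB) * delta
    Gmp = -2j * np.pi * (theta(k0) + nB) * delta
    Gmm = -1.0 / (x - 1j * eps) - 2j * np.pi * nB * delta
    return np.array([[Gpp, Gpm], [Gmp, Gmm]])

def physical_propagator(k0, kvec2, m, beta, eps):
    """Expected matrix (0, G_A; G_R, G_c) with the same regulator."""
    x, delta, nB = _pieces(k0, kvec2, m, beta, eps)
    s = np.sign(k0)
    GA = 1.0 / (x - 1j * eps * s)
    GR = 1.0 / (x + 1j * eps * s)
    Gc = -2j * np.pi * (1.0 + 2.0 * nB) * delta
    return np.array([[0.0, GA], [GR, Gc]])

def max_deviation(k0, kvec2, m, beta, eps):
    """Largest entry of | Q G Q^dagger - G_hat |."""
    G = causal_propagator(k0, kvec2, m, beta, eps)
    G_hat = physical_propagator(k0, kvec2, m, beta, eps)
    return np.abs(Q @ G @ Q.conj().T - G_hat).max()

dev_pos = max_deviation(1.3, 0.5, 1.0, 2.0, 0.05)    # k0 > 0
dev_neg = max_deviation(-0.8, 0.1, 0.5, 2.0, 0.05)   # k0 < 0
```

With this regulator the identity holds exactly at finite $\epsilon$, so both deviations are at the level of rounding error; the vanishing upper-left entry is just the constraint $G_{++} + G_{--} = G_{+-} + G_{-+}$.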

The diagrammatic calculations can now be easily understood in this formalism. The vertices 
can be read out from the Lagrangian density in eq. (95). There are two kinds of vertices, one for
the original fields, $\phi_+$, and the other for the doubled fields, $\phi_-$. The vertices for the $\phi_-$ fields are
the same as those for the $\phi_+$ fields except for a relative sign. With the vertices and the causal
propagators, one can now carry out the calculation of any observable to any order in perturbation
theory. As before, we note that, although there is no coupling between the $\phi_+$ and $\phi_-$ fields at
the tree level, higher order corrections would, in general, couple them.

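As an example, let us calculate the one loop mass correction in this theory. There will be
two such diagrams to calculate - one for the $\phi_+$ field and the other for the $\phi_-$ field. The mass
correction for the $\phi_+$ field is readily seen to be

$$\begin{aligned}
-i\Delta m_+^2 &= \frac{(-i\lambda)}{2}\int\frac{d^4k}{(2\pi)^4}\,iG_{++}(k)\\
&= \frac{\lambda}{2}\int\frac{d^4k}{(2\pi)^4}\left(\frac{1}{k^2-m^2+i\epsilon} - 2i\pi\,n_B(|k^0|)\,\delta(k^2-m^2)\right)\\
&= -i\left(\Delta m_0^2 + \Delta m_\beta^2\right)
\end{aligned} \tag{103}$$

where it is easily seen that the temperature independent part has the form

$$\Delta m_0^2 = \frac{i\lambda}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2-m^2+i\epsilon} = \frac{\lambda}{2}\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{2\omega_k} \tag{104}$$

while the temperature dependent part is given by

$$\Delta m_\beta^2 = \frac{\lambda}{2}\int\frac{d^3k}{(2\pi)^3}\,\frac{n_B(\omega_k)}{\omega_k} = \frac{\lambda}{2}\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{\omega_k}\,\frac{1}{e^{\beta\omega_k}-1} \tag{105}$$

These can be compared with the corresponding terms in eq. (35). We can also calculate the mass
correction for the $\phi_-$ field. With a little bit of analysis, it is seen that

$$\Delta m_-^2 = \Delta m_+^2$$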
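
The temperature dependent part of eq. (105) is easy to evaluate numerically. The following Python sketch (an illustration only; the function name, cutoff and grid size are assumptions) performs the radial momentum integral with a midpoint rule, taking $\omega_k = \sqrt{\vec k^2 + m^2}$. In the massless limit it can be compared with the well-known closed form $\lambda T^2/24$.

```python
import numpy as np

def thermal_mass_shift(lam, m, T, kmax_over_T=60.0, n=200_000):
    """Delta m_beta^2 = (lam/2) * int d^3k/(2 pi)^3 n_B(omega_k)/omega_k.

    The angular integral gives 4*pi, so the prefactor of the radial
    integral int dk k^2 n_B/omega is lam / (4 pi^2).
    """
    dk = kmax_over_T * T / n
    k = (np.arange(n) + 0.5) * dk           # midpoints, so k = 0 is never hit
    omega = np.sqrt(k**2 + m**2)
    nB = 1.0 / np.expm1(omega / T)          # Bose distribution with beta = 1/T
    return lam / (4.0 * np.pi**2) * np.sum(k**2 * nB / omega) * dk

massless = thermal_mass_shift(lam=1.0, m=0.0, T=2.0)
massive = thermal_mass_shift(lam=1.0, m=1.0, T=2.0)
closed_form_massless = 1.0 * 2.0**2 / 24.0  # lam * T^2 / 24
```

A nonzero mass suppresses the Bose factor, so `massive` comes out smaller than `massless`.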


6 Feynman Parameterization 

So far, we have described the various formalisms that can be used to do calculations at finite 
temperature. However, actual calculations lead to many subtle, but interesting features of theories 
at finite temperature. One immediate and obvious feature, of course, is that finite temperature 
effects break Lorentz invariance. Namely, in studying a system at finite temperature, one has to 
go to a specific frame where the heat bath is at rest and, consequently, Lorentz invariance will no 
longer be manifest. This is, of course, already manifest at the level of propagators. For example, 
the structure of the propagators in eqs. (83) or (98) clearly displays a Lorentz non-invariant 
structure. The consequence of this is that an amplitude calculated at finite temperature, say for
example, the self-energy $\Pi(p^0,\vec{p}\,)$, depends on the external energy and momentum independently.
In fact, the self-energy becomes a non-analytic function of these two variables at the origin and
two different ways of approaching the origin in this space lead to distinct plasmon and screening
masses [23]. Thus, such non-analyticities are quite physical and their origin can be traced back to
the fact that, at finite temperature, there are new channels of reactions possible leading to new
branch cuts which give rise to such discontinuities [23, 10]. (To be absolutely fair, it is worth noting
that statistical mechanics can be formulated in a covariant way. In such a case, one finds that
there is a larger number of Lorentz invariant variables that can be defined on which amplitudes
can depend. The non-analyticity in $p^0$ and $\vec{p}$ can then be translated to a non-analyticity in these
new, Lorentz invariant variables [23].)

There are, of course, some other kinds of subtlety that arise which influence the calculations 
directly at finite temperature. We will discuss one such subtlety in this section. Let us note that 
a particularly useful formula in the evaluation of amplitudes at zero temperature is the Feynman 
combination formula given by 

$$\frac{1}{A+i\epsilon}\,\frac{1}{B+i\epsilon} = \int_0^1\frac{dx}{\left[x(A+i\epsilon)+(1-x)(B+i\epsilon)\right]^2} \tag{106}$$

This can be directly checked by evaluating the $x$ integral on the right hand side.

This formula is extremely useful and works at zero temperature mainly because the Feynman
propagators have the same analytic structure, namely, they have the same “$i\epsilon$” dependence. In
contrast, we note that the finite temperature propagators contain delta functions (see eqs. (83)
and (98)) and recalling that

$$\delta(x) = \lim_{\epsilon\to 0^+}\frac{1}{2i\pi}\left(\frac{1}{x-i\epsilon}-\frac{1}{x+i\epsilon}\right)$$

we recognize that, at finite temperature, the propagators no longer have the same “$i\epsilon$” dependence.
Consequently, in evaluating Feynman amplitudes at finite temperature, we have to combine
denominators which do not necessarily have the same “$i\epsilon$” dependence. Keeping this in mind, let us
examine the combination of two different denominators with arbitrary analytic dependence.
Without loss of generality, let us choose $\alpha,\beta = \pm 1$ and note that



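$$\begin{aligned}
\int_0^1\frac{dx}{\left[x(A+i\alpha\epsilon)+(1-x)(B+i\beta\epsilon)\right]^2}
&= -\frac{1}{A-B+i(\alpha-\beta)\epsilon}\left[\frac{1}{x(A+i\alpha\epsilon)+(1-x)(B+i\beta\epsilon)}\right]_0^1\\
&= \frac{1}{A+i\alpha\epsilon}\,\frac{1}{B+i\beta\epsilon}
\end{aligned} \tag{107}$$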


This is, of course, the standard Feynman combination formula. However, let us note that this will
not hold if there is an $x_0$ with $0 < x_0 < 1$ such that

$$x_0 = \frac{\beta}{\beta-\alpha}, \qquad \beta A = \alpha B \tag{108}$$

because, in such a case, the integrand will have a singularity on the real $x$-axis inside the interval
of integration. In this case, we have



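$$\begin{aligned}
\int_0^1\frac{dx}{\left[x(A+i\alpha\epsilon)+(1-x)(B+i\beta\epsilon)\right]^2}
&= \int_0^1\frac{dx}{\left[x(A-B+i(\alpha-\beta)\epsilon)+B+i\beta\epsilon\right]^2}\\
&= \left[\int_0^{x_0-\epsilon}+\int_{x_0+\epsilon}^1\right]\frac{dx}{\left[x(A-B+i(\alpha-\beta)\epsilon)+B+i\beta\epsilon\right]^2}\\
&= \frac{1}{A+i\alpha\epsilon}\,\frac{1}{B+i\beta\epsilon} - \frac{2i\pi(\alpha-\beta)\,\delta(\beta A-\alpha B)}{A-B+i(\alpha-\beta)\epsilon}
\end{aligned} \tag{109}$$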


In other words, when the parameters of the integrand satisfy eq. (108), the Feynman combination
formula of eq. (106) is modified and the general formula follows from eq. (109) to be [13]

$$\frac{1}{A+i\alpha\epsilon}\,\frac{1}{B+i\beta\epsilon} = \int_0^1\frac{dx}{\left[x(A+i\alpha\epsilon)+(1-x)(B+i\beta\epsilon)\right]^2} + \frac{2i\pi(\alpha-\beta)\,\delta(\beta A-\alpha B)}{A-B+i(\alpha-\beta)\epsilon} \tag{110}$$

Note that condition (108) can only be satisfied (with $0 < x_0 < 1$) if $\alpha$ and $\beta$ are of opposite sign.
Indeed, let us note from eq. (110) that the second term vanishes when $\alpha = \beta = 1$ as is the case
at zero temperature. Namely, when propagators with identical “$i\epsilon$” dependence are combined, the
standard combination formula of eq. (106) holds. However, if denominators with opposite “$i\epsilon$”
dependence are combined, the correct combination formula involves a second term. This is quite
crucial at finite temperature and without this second term, one ends up with a wrong result as
was discovered in finite temperature calculations the hard way [28].
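
The two regimes can be illustrated with a short numerical experiment. The Python sketch below (helper names and sample values are assumptions) checks the standard combination formula at finite $\epsilon$ for $\alpha = \beta = 1$, and then exhibits the point $x_0 = \beta/(\beta-\alpha)$ at which the denominator vanishes when $\alpha = -\beta$ and $\beta A = \alpha B$.

```python
import numpy as np

def denominator(x, A, B, alpha, beta, eps):
    """x (A + i alpha eps) + (1 - x) (B + i beta eps)."""
    return x * (A + 1j * alpha * eps) + (1 - x) * (B + 1j * beta * eps)

def feynman_integral(A, B, alpha, beta, eps, n=400_000):
    """Midpoint-rule estimate of int_0^1 dx / denominator(x)^2."""
    x = (np.arange(n) + 0.5) / n
    return np.sum(1.0 / denominator(x, A, B, alpha, beta, eps) ** 2) / n

def product(A, B, alpha, beta, eps):
    """1 / [(A + i alpha eps)(B + i beta eps)]."""
    return 1.0 / ((A + 1j * alpha * eps) * (B + 1j * beta * eps))

# Same "i eps" prescription: the imaginary part of the denominator never
# vanishes on [0, 1], so the standard formula holds.
gap_same_sign = abs(feynman_integral(1.3, -0.7, 1, 1, 0.1)
                    - product(1.3, -0.7, 1, 1, 0.1))

# Opposite prescriptions with beta*A = alpha*B (here B = -A): the denominator
# vanishes at x0 = beta / (beta - alpha) = 1/2, inside the integration range.
alpha, beta = 1, -1
x0 = beta / (beta - alpha)
pinch = denominator(x0, 1.3, -1.3, alpha, beta, 0.1)
```

Here `gap_same_sign` is at the level of the quadrature error, while `pinch` is exactly zero, which is the singular configuration that produces the extra delta function term.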


7 Large Gauge Invariance 

Gauge theories are beautiful theories which describe physical forces in a natural manner and
because of their rich structure, the study of gauge theories at finite temperature is quite interesting in
itself. However, to avoid getting into technicalities, we will not discuss the intricacies of such
theories either at zero temperature or at finite temperature. Rather, we will discuss a simple quantum
mechanical model, in this section, to bring out some of the new features that temperature brings
into such theories - features which are very different from what we expect at zero temperature.

To motivate, let us note that gauge invariance is realized as an internal symmetry in quantum 
mechanical systems. Consequently, we do not expect a macroscopic external surrounding such 
as a heat bath to modify gauge invariance. This is more or less what is also found by explicit 
computations at finite temperature, namely, that gauge invariance and Ward identities continue 
to hold even at finite temperature [29]. This is certainly the case when one is talking about small 
gauge transformations for which the parameters of transformation vanish at infinity. 

However, there is a second class of gauge invariance, commonly known as large gauge invariance 
where the parameters do not vanish at infinity and this brings in some new topological character 
to physical theories. For example, let us consider a 2 + 1 dimensional Chern-Simons theory of the 
form 


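$$\begin{aligned}
\mathcal{L} &= \mathcal{L}_{CS} + \mathcal{L}_{\rm fermion}\\
&= M\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\,A_\mu\left(\partial_\nu A_\lambda + \frac{2g}{3}A_\nu A_\lambda\right) + \bar\psi\left(\gamma^\mu(i\partial_\mu - gA_\mu) - m\right)\psi
\end{aligned} \tag{111}$$

where $M$ is a mass parameter, $A_\mu$ a matrix valued non-Abelian gauge field and “tr” stands for the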
matrix trace. The first term, on the right hand side, is known as the Chern-Simons term which 
exists only in odd space-time dimensions. We can, of course, also add a Maxwell like term to the 
Lagrangian and, in that case, the Chern-Simons term behaves like a mass term for the gauge field. 
Consequently, such a term is also known as a topological mass term [30] (topological because it 
does not involve the metric). For simplicity of discussion, however, we will not include a Maxwell 
like term to the Lagrangian. 

Under a gauge transformation of the form 

$$A_\mu \to U^{-1}A_\mu U - \frac{i}{g}\,U^{-1}\partial_\mu U \tag{112}$$

it is straightforward to check that the action in eq. (111) is not invariant, rather it changes as

$$S = \int d^3x\,\mathcal{L} \to S + \frac{4\pi M}{g^2}\,2i\pi W \tag{113}$$

where

$$W = \frac{1}{24\pi^2}\int d^3x\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\,\partial_\mu U U^{-1}\,\partial_\nu U U^{-1}\,\partial_\lambda U U^{-1} \tag{114}$$

is known as the winding number. It is a topological quantity which is an integer. (Basically, the
fermion Lagrangian density is invariant under the gauge transformations, but the Chern-Simons
term changes by a total divergence which does not vanish if the gauge transformations do not
vanish at infinity. Consequently, the winding number counts the number of times the gauge
transformations wrap around the sphere.) For small gauge transformations, the winding number
vanishes since the gauge transformations vanish at infinity.

Let us note from eq. (113) that even though the action is not invariant under a large gauge
transformation, if $M$ is quantized in units of $\frac{g^2}{4\pi}$, the change in the action would be a multiple of
$2i\pi$ and, consequently, the path integral would be invariant under a large gauge transformation.
Thus, we have the constraint coming from the consistency of the theory that the coefficient of 
the Chern-Simons term must be quantized. We have derived this conclusion from an analysis of 
the tree level behavior of the theory and we have to worry if the quantum corrections can change 
the behavior of the theory. At zero temperature, an analysis of the quantum corrections shows 
that the theory continues to be well defined with the tree level quantization of the Chern-Simons 
coefficient provided the number of fermion flavors is even. The even number of fermion flavors is 
also necessary for a global anomaly of the theory to vanish and so, everything is well understood 
at zero temperature. 





At finite temperature, however, the situation appears to change drastically. Namely, the
fermions induce a temperature dependent Chern-Simons term effectively making [31]

$$M \to M - \frac{g^2}{4\pi}\,\frac{mN_f}{2|m|}\,\tanh\frac{\beta|m|}{2} \tag{115}$$

Here, $N_f$ is the number of fermion flavors and this shows that, at zero temperature ($\beta \to \infty$),
$M$ changes by an integer (in units of $\frac{g^2}{4\pi}$) for an even number of flavors. However, at finite
temperature, this becomes a continuous function of temperature and, consequently, it is clear that
it can no longer be an integer for arbitrary values of the temperature. It seems, therefore, that
temperature would lead to a breaking of large gauge invariance in such a system. This is, on
the other hand, completely counter intuitive considering that temperature should have no direct
influence on gauge invariance of the theory.


C-S Theory in 0 + 1 Dimension 

As we have noted, Chern-Simons terms can exist in odd space-time dimensions. Consequently, let 
us try to understand this puzzle of large gauge invariance in a simple quantum mechanical theory. 
Let us consider a simple theory of an interacting massive fermion with a Chern-Simons term in 
0 + 1 dimension described by [14, 32]

$$L = \bar\psi_j\left(i\partial_t - A - m\right)\psi_j - \kappa A \tag{116}$$

Here, $j = 1, 2, \cdots, N_f$ labels the fermion flavors. There are several things to note from this. First,
we are considering an Abelian gauge field for simplicity. Second, in this simple model, the gauge
field has no dynamics (in 0 + 1 dimension the field strength is zero) and, therefore, we do not have
to get into the intricacies of gauge theories. There is also no Dirac matrix in 0 + 1 dimension,
making the fermion part of the theory quite simple as well. And, finally, the Chern-Simons term,
in this case, is linear in the gauge field so that we can, in fact, think of the gauge field as an auxiliary field.

In spite of the simplicity of this theory, it displays a rich structure including all the properties 
of the 2 + 1 dimensional theory that we have discussed earlier. For example, let us note that under
a gauge transformation

$$\psi_j \to e^{-i\lambda(t)}\psi_j, \qquad A \to A + \partial_t\lambda(t) \tag{117}$$

the fermion part of the Lagrangian is invariant, but the Chern-Simons term changes by a total
derivative giving

$$S = \int dt\,L \to S - 2\pi\kappa N \tag{118}$$

where

$$N = \frac{1}{2\pi}\int dt\,\partial_t\lambda(t) \tag{119}$$

is the winding number and is an integer which vanishes for small gauge transformations. Let us
note that a large gauge transformation can have a parametric form of the form, say,

$$\lambda(t) = -iN\log\left(\frac{1+it}{1-it}\right) \tag{120}$$

The fact that $N$ has to be an integer can be easily seen to arise from the requirement of single-valuedness
for the fermion field. Once again, in light of our earlier discussion, it is clear from
eq. (118) that the theory is meaningful only if $\kappa$, the coefficient of the Chern-Simons term, is an
integer.
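
The winding number of eq. (119) depends only on the boundary values of $\lambda(t)$, which makes it simple to evaluate. The Python sketch below is an illustration under assumptions: it uses the profile $\lambda(t) = 2N\arctan t$ as a representative large gauge transformation and a Gaussian as a representative small one; both choices, and the function names, are hypothetical.

```python
import numpy as np

def winding_number(lam, t_max=1.0e6):
    """N = (1/2pi) int dt d(lambda)/dt = [lambda(t_max) - lambda(-t_max)] / (2 pi)."""
    return (lam(t_max) - lam(-t_max)) / (2.0 * np.pi)

def large_gauge(N):
    # Assumed illustrative profile: lambda(t) = 2 N arctan(t) -> +/- N*pi at infinity.
    return lambda t: 2.0 * N * np.arctan(t)

def small_gauge(t):
    # A profile that vanishes as t -> +/- infinity.
    return np.exp(-t**2)

def path_integral_phase(kappa, N):
    """Change of exp(iS) under S -> S - 2 pi kappa N."""
    return np.exp(-2j * np.pi * kappa * N)

N_large = winding_number(large_gauge(3))
N_small = winding_number(small_gauge)
phase_integer_kappa = path_integral_phase(kappa=2, N=3)
phase_half_integer_kappa = path_integral_phase(kappa=0.5, N=1)
```

For integer $\kappa$ the phase is unity, whereas a half-integer $\kappa$ flips the sign of $e^{iS}$ under a transformation with $N = 1$.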

Let us assume, for simplicity, that $m > 0$ and compute the correction to the photon one-point
function arising from the fermion loop at zero temperature.

$$iI_1 = -(-i)N_f\int\frac{dk}{2\pi}\,\frac{i(k+m)}{k^2-m^2+i\epsilon} = \frac{iN_f}{2} \tag{121}$$

This shows that, as a result of the quantum correction, the coefficient of the Chern-Simons term
would change as

$$\kappa \to \kappa - \frac{N_f}{2}$$

As in 2 + 1 dimensions, it is clear that the coefficient of the Chern-Simons term would continue to
be quantized and large gauge invariance would hold if the number of fermion flavors is even. At
zero temperature, we can also calculate the higher point functions due to the fermions in the theory
and they all vanish. This has a simple explanation following from the small gauge invariance of the
theory. Namely, suppose we had a nonzero two point function, then, it would imply a quadratic
term in the effective action of the form

$$\Gamma_2 = \frac{1}{2}\int dt_1\,dt_2\,A(t_1)\,F(t_1-t_2)\,A(t_2) \tag{122}$$

Furthermore, invariance under a small gauge transformation would imply

$$\delta\Gamma_2 = -\int dt_1\,dt_2\,\lambda(t_1)\,\partial_{t_1}F(t_1-t_2)\,A(t_2) = 0 \tag{123}$$

The solution to this equation is that $F = 0$ so that there cannot be a quadratic term in the effective
action which would be local and yet be invariant under small gauge transformations. A similar
analysis would show that small gauge invariance does not allow any higher point function to exist
at zero temperature.

Let us also note that eq. (123) has another solution, namely,

$$F(t_1-t_2) = \text{constant}$$

In such a case, however, the quadratic action becomes non-extensive, namely, it is the square of
an action. We do not expect such terms to arise at zero temperature and hence the constant has
to vanish for vanishing temperature. As we will see next, the constant does not have to vanish at
finite temperature and we can have non-vanishing higher point functions implying a non-extensive
structure of the effective action.

The fermion propagator at finite temperature (in the real time formalism) has the form [10]

$$\begin{aligned}
S(p) &= (p+m)\left(\frac{i}{p^2-m^2+i\epsilon} - 2\pi\,n_F(|p|)\,\delta(p^2-m^2)\right)\\
&= \frac{i}{p-m+i\epsilon} - 2\pi\,n_F(m)\,\delta(p-m)
\end{aligned} \tag{124}$$

and the structure of the effective action can be studied in the momentum space in a straightforward
manner. However, in this simple model, it is much easier to analyze the amplitudes in the
coordinate space. Let us note that the coordinate space structure of the fermion propagator is
quite simple, namely,

$$S(t) = \int\frac{dp}{2\pi}\,e^{-ipt}\left(\frac{i}{p-m+i\epsilon} - 2\pi\,n_F(m)\,\delta(p-m)\right) = \left(\theta(t) - n_F(m)\right)e^{-imt} \tag{125}$$

In fact, the calculation of the one point function is trivial now

$$iI_1 = \frac{iN_f}{2}\tanh\frac{\beta m}{2} \tag{126}$$

This shows that the behavior of this theory is completely parallel to the 2 + 1 dimensional theory
in that, it would suggest

$$\kappa \to \kappa - \frac{N_f}{2}\tanh\frac{\beta m}{2}$$

and it would appear that large gauge invariance would not hold at finite temperature.





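Let us next calculate the two point function at finite temperature.

$$iI_2 = -\frac{N_f}{2}\,n_F(m)\left(1-n_F(m)\right) = -\frac{N_f}{8}\,\mathrm{sech}^2\frac{\beta m}{2} = \frac{1}{2}\,\frac{i}{\beta}\,\frac{\partial(iI_1)}{\partial m} \tag{127}$$

This shows that the two point function is a constant as we had noted earlier implying that the
quadratic term in the effective action would be non-extensive.

Similarly, we can also calculate the three point function trivially and it has the form

$$iI_3 = \frac{iN_f}{24}\,\tanh\frac{\beta m}{2}\,\mathrm{sech}^2\frac{\beta m}{2} = \frac{1}{3!}\left(\frac{i}{\beta}\right)^2\frac{\partial^2(iI_1)}{\partial m^2} \tag{128}$$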

In fact, all the higher point functions can be worked out in a systematic manner. But, let us 
observe a simple method of computation for these. We note that because of the gauge invariance 
(Ward identity), the amplitudes cannot depend on the external time coordinates as is clear from 
the calculations of the lower point functions. Therefore, we can always simplify the calculation 
by choosing a particular time ordering convenient to us. Second, since we are evaluating a loop 
diagram (a fermion loop) the initial and the final time coordinates are the same and, consequently, 
the phase factors in the propagator (125) drop out. Therefore, let us define a simplified propagator 
without the phase factor as 

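$$\tilde S(t) = \theta(t) - n_F(m) \tag{129}$$

so that we have

$$\tilde S(t>0) = 1 - n_F(m), \qquad \tilde S(t<0) = -n_F(m) \tag{130}$$

Then, it is clear that with the choice of the time ordering, $t_1 > t_2$, we can write

$$\begin{aligned}
\frac{\partial\tilde S(t_1-t_2)}{\partial m} &= -\beta\,\tilde S(t_1-t_3)\,\tilde S(t_3-t_2), \qquad t_1 > t_2 > t_3\\
\frac{\partial\tilde S(t_2-t_1)}{\partial m} &= -\beta\,\tilde S(t_2-t_3)\,\tilde S(t_3-t_1), \qquad t_1 > t_2 > t_3
\end{aligned} \tag{131}$$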


In other words, this shows that differentiation of a fermionic propagator with respect to the
mass of the fermion is equivalent to introducing an external photon vertex (and, therefore, another
fermion propagator as well) up to constants. This is the analogue of the Ward identity in QED
in four dimensions except that it is much simpler. From this relation, it is clear that if we take an
n-point function and differentiate this with respect to the fermion mass, then, that is equivalent
to adding another external photon vertex in all possible positions. Namely, it should give us the
(n + 1)-point function up to constants. Working out the details, we have,

$$\frac{\partial I_n}{\partial m} = -i\beta\,(n+1)\,I_{n+1} \tag{132}$$


Therefore, the (n + l)-point function is related to the n-point function recursively and, conse- 
quently, all the amplitudes are related to the one point function which we have already calculated. 
(Incidentally, this is already reflected in eqs. (127,128)). 
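
These relations can be verified numerically. The Python sketch below is a check under stated assumptions: it takes the recursion in the form $\partial I_n/\partial m = -i\beta(n+1)I_{n+1}$, uses the closed forms of $iI_1$, $iI_2$, $iI_3$ quoted in eqs. (126)-(128), and differentiates with respect to $m$ by central finite differences. The function names and parameter values are hypothetical.

```python
import numpy as np

def n_F(m, beta):
    """Fermi distribution 1 / (exp(beta m) + 1)."""
    return 1.0 / (np.exp(beta * m) + 1.0)

def S_tilde(t, m, beta):
    """Simplified propagator of eq. (129): theta(t) - n_F(m)."""
    return (1.0 if t > 0 else 0.0) - n_F(m, beta)

def I(n, m, beta, Nf=1):
    """I_1, I_2, I_3 from the closed forms of i I_n in eqs. (126)-(128)."""
    th = np.tanh(beta * m / 2)
    sech2 = 1.0 / np.cosh(beta * m / 2) ** 2
    iI = {1: 0.5j * Nf * th,
          2: -Nf * sech2 / 8.0,
          3: 1j * Nf * th * sech2 / 24.0}[n]
    return iI / 1j

def d_dm(f, m, h=1e-5):
    """Central finite-difference derivative with respect to the mass."""
    return (f(m + h) - f(m - h)) / (2 * h)

beta, m = 1.3, 0.7

# Propagator identity, first line of eq. (131), with t1 - t2 = 1 and t3 < t2.
lhs = d_dm(lambda mm: S_tilde(1.0, mm, beta), m)
rhs = -beta * S_tilde(2.0, m, beta) * S_tilde(-1.0, m, beta)

def recursion_gap(n):
    """| dI_n/dm + i beta (n+1) I_{n+1} | for n = 1, 2."""
    return abs(d_dm(lambda mm: I(n, mm, beta), m)
               + 1j * beta * (n + 1) * I(n + 1, m, beta))
```

Both `abs(lhs - rhs)` and `recursion_gap(n)` for n = 1, 2 come out at the level of the finite-difference error.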

With this, we can now determine the full effective action of the theory at finite temperature to
be

$$\Gamma = -i\sum_n a^n\,(iI_n) = -iN_f\log\left(\cos\frac{a}{2} + i\tanh\frac{\beta m}{2}\,\sin\frac{a}{2}\right) \tag{133}$$

where we have defined

$$a = \int dt\,A(t) \tag{134}$$

There are several things to note from this result. First of all, the higher point functions are
no longer vanishing at finite temperature and give rise to a non-extensive structure of the effective
action. More importantly, when we include all the higher point functions, the complete effective
action is invariant under large gauge transformations, namely, under

$$a \to a + 2\pi N \tag{135}$$

the effective action changes as

$$\Gamma \to \Gamma + NN_f\pi \tag{136}$$

which leaves the path integral invariant for an even number of fermion flavors. This clarifies the
puzzle of large gauge invariance at finite temperature in this model. Namely, when we are talking
about large changes (large gauge transformations), we cannot ignore higher order terms if they
exist. This may provide a resolution to the large gauge invariance puzzle in the 2 + 1 dimensional
theory as well. However, in spite of several nice analyses [33], this puzzle has not yet been settled
in all its generality in the 2 + 1 dimensional case.
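
The contrast between the full effective action and its lowest-order truncation can be made concrete. The Python sketch below (an illustration; function names and sample values are assumptions) evaluates the path integral weight $e^{i\Gamma}$ from the closed form $\Gamma = -iN_f\log\left(\cos\frac{a}{2} + i\tanh\frac{\beta m}{2}\sin\frac{a}{2}\right)$ and from the linear term $\frac{N_f}{2}\tanh\frac{\beta m}{2}\,a$ alone, and compares each before and after the shift $a \to a + 2\pi$.

```python
import numpy as np

def exp_i_gamma(a, beta, m, Nf):
    """exp(i Gamma) for the full effective action; an integer power avoids log branches."""
    return (np.cos(a / 2) + 1j * np.tanh(beta * m / 2) * np.sin(a / 2)) ** Nf

def exp_i_gamma_linear(a, beta, m, Nf):
    """exp(i Gamma) if only the one-point (linear in a) term is kept."""
    return np.exp(1j * 0.5 * Nf * np.tanh(beta * m / 2) * a)

a, beta, m = 0.37, 1.2, 0.8
shift = 2.0 * np.pi                     # large gauge transformation with N = 1

ratio_full_even = exp_i_gamma(a + shift, beta, m, 2) / exp_i_gamma(a, beta, m, 2)
ratio_full_odd = exp_i_gamma(a + shift, beta, m, 1) / exp_i_gamma(a, beta, m, 1)
ratio_linear = (exp_i_gamma_linear(a + shift, beta, m, 2)
                / exp_i_gamma_linear(a, beta, m, 2))
```

With the full action the weight is unchanged for $N_f = 2$ and changes sign for $N_f = 1$, while the truncated weight picks up a temperature dependent phase that is not unity.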

Exact Result 

In the earlier section, we discussed a perturbative method of calculating the effective action at 
finite temperature which clarified the puzzle of large gauge invariance. However, this quantum 
mechanical model is simple enough that we can also evaluate the effective action directly and, 
therefore, it is worth asking how the perturbative calculations compare with the exact result. 

The exact evaluation of the effective action can be done easily using the imaginary time for- 
malism. But, first, let us note that the fermionic part of the Lagrangian in eq. (116) has the 
form 

$$L_f = \bar\psi\left(i\partial_t - A - m\right)\psi \tag{137}$$

where we have suppressed the fermion flavor index for simplicity. Let us note that if we make a
field redefinition of the form

$$\psi(t) = e^{-i\int_0^t dt'\,A(t')}\,\tilde\psi(t) \tag{138}$$

then, the fermionic part of the Lagrangian becomes free, namely,

$$L_f = \bar{\tilde\psi}\left(i\partial_t - m\right)\tilde\psi \tag{139}$$

This is a free theory and, therefore, the path integral can be easily evaluated. However, we have
to remember that the field redefinition in (138) changes the periodicity condition for the fermion
fields. Since the original fermion field was expected to satisfy anti-periodicity

$$\psi(\beta) = -\psi(0)$$

it follows now that the new fields must satisfy

$$\tilde\psi(\beta) = -e^{ia}\,\tilde\psi(0) \tag{140}$$

Consequently, the path integral for the free theory (139) has to be evaluated subject to the
periodicity condition of (140).

Although the periodicity condition (140) appears to be complicated, it is well known that the
effect can be absorbed by introducing a chemical potential [10], in the present case, of the form

$$\mu = -\frac{ia}{\beta} \tag{141}$$

With the addition of this chemical potential, the path integral can be evaluated subject to the
usual anti-periodicity condition. The effective action can now be easily determined

$$\begin{aligned}
\Gamma &= -iN_f\log\left[\frac{\det\left(i\partial_t - m - \frac{ia}{\beta}\right)}{\det\left(i\partial_t - m\right)}\right]\\
&= -iN_f\log\left[\frac{\cosh\frac{\beta}{2}\left(m + \frac{ia}{\beta}\right)}{\cosh\frac{\beta m}{2}}\right]\\
&= -iN_f\log\left[\cos\frac{a}{2} + i\tanh\frac{\beta m}{2}\,\sin\frac{a}{2}\right]
\end{aligned} \tag{142}$$

which coincides with the perturbative result of eq. (133).
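
The ratio of determinants can be illustrated with a truncated product over fermionic Matsubara frequencies. The Python sketch below rests on assumptions that are not spelled out in the text: it uses the standard product representation $\cosh x = \prod_{n\ge 0}\left(1 + \frac{4x^2}{(2n+1)^2\pi^2}\right)$, pairs the modes $\pm\omega_n$ with $\omega_n = (2n+1)\pi/\beta$, and takes the gauge field to enter as the shift $m \to m + ia/\beta$. Function names and sample values are hypothetical.

```python
import numpy as np

def matsubara_ratio(beta, m, a, n_modes=1_000_000):
    """prod_{n>=0} (w_n^2 + M^2) / (w_n^2 + m^2), with M = m + i a / beta."""
    w = (2 * np.arange(n_modes) + 1) * np.pi / beta   # fermionic frequencies
    M = m + 1j * a / beta                             # assumed shifted "mass"
    return np.prod((w**2 + M**2) / (w**2 + m**2))

def closed_form(beta, m, a):
    """cos(a/2) + i tanh(beta m / 2) sin(a/2)."""
    return np.cos(a / 2) + 1j * np.tanh(beta * m / 2) * np.sin(a / 2)

beta, m, a = 1.2, 0.8, 0.9
product_value = matsubara_ratio(beta, m, a)
exact_value = closed_form(beta, m, a)
cosh_ratio = np.cosh(beta * m / 2 + 1j * a / 2) / np.cosh(beta * m / 2)
```

The truncated product converges to the closed form with an error that falls off like the inverse of the number of modes retained.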


8 Supersymmetry Breaking 

One of the reasons for studying finite temperature field theory is to understand questions such 
as phase transitions in such systems. It is by now well understood that most field theoretic 
models of spontaneous symmetry breaking display a phase structure much like what one sees in 
a magnet, namely, above a certain critical temperature, the system is in a symmetric phase while 
below the critical temperature, the system is in a broken symmetry phase. Thus, temperature 
has the almost universal effect that if a symmetry is spontaneously broken at low temperature, it 
is restored at temperatures above a certain critical value. Qualitatively, it can be understood as 
follows. Temperature, particularly high temperature, provides a lot of thermal energy to a physical 
system to wash out any structure in the zero temperature potential which may be responsible for 
symmetry breaking. There is, however, one class of symmetries where temperature has the inverse 
effect, namely, in a supersymmetric theory, a symmetric phase at low temperature goes to a broken 
phase at high temperature. (Of course, if supersymmetry is broken at low temperature, it continues 
to be broken even at high temperature.) We will discuss this phenomenon with a simple quantum 
mechanical model in this section. 

Supersymmetric Oscillator at T = 0 

Let us note that supersymmetry is an ultimate form of symmetry that one can dream of, namely, 
it transforms bosons into fermions and vice versa [34-35]. To introduce supersymmetry, let us 
consider a simple quantum mechanical model, commonly known as the supersymmetric oscillator 
[16]. It consists of a bosonic and a fermionic oscillator of the same frequency. .Therefore, we can 
write the Hamiltonian, for the system as 

H = Hb + Hp = oj -f Opap^ (143) 

where ap and ap describe, respectively, the bosonic and the fermionic annihilation operators. 

The immediate thing to note from the structure of the Hamiltonian in eq. (143) is that there 
is no zero point energy. We will see this shortly as a general feature of supersymmetric theories. 
Let us also define two fermionic operators of the form 

$$ Q = a_B^\dagger a_F, \qquad \bar{Q} = a_F^\dagger a_B \tag{144} $$

With the usual canonical commutation relations for the bosonic operators (see eq. (66)) and 
anti-commutation relations for the fermionic operators (see eq. (47)), it is easy to check that 

$$ [Q, H] = 0 = [\bar{Q}, H] $$

Namely, these fermionic operators are conserved. In fact, together with the Hamiltonian, they
satisfy the algebra (it is straightforward to check this)

$$ [Q, H] = 0 = [\bar{Q}, H], \qquad [Q, Q]_+ = 0 = [\bar{Q}, \bar{Q}]_+, \qquad [Q, \bar{Q}]_+ = \frac{1}{\omega} H \tag{145} $$

Such an algebra, where both commutators and anti-commutators are involved (or alternately, 
where there is a grading of the multiplication rule of the algebra), is known as a graded Lie algebra 
and supersymmetric theories are realizations of graded Lie algebras. 

As we know from the study of symmetries, conserved quantities generate infinitesimal symmetries
of the theory. Since both $Q$ and $\bar{Q}$ are conserved, it is worth asking what kind of symmetry
transformations of the theory they generate. In fact, let us keep in mind that they are fermionic
operators and hence the symmetry they will generate cannot be conventional. Explicitly, we can
check that

$$ \begin{aligned}
[Q, a_B^\dagger] &= 0 = [Q, a_F]_+ \\
[Q, a_B] &= -a_F, \qquad [Q, a_F^\dagger]_+ = a_B^\dagger \\
[\bar{Q}, a_B] &= 0 = [\bar{Q}, a_F^\dagger]_+ \\
[\bar{Q}, a_B^\dagger] &= a_F^\dagger, \qquad [\bar{Q}, a_F]_+ = a_B
\end{aligned} \tag{146} $$

Namely, $Q$ and $\bar{Q}$ take bosonic operators to fermionic ones and vice versa, which is the hallmark
of supersymmetry. Thus, our Hamiltonian in eq. (143) is invariant under supersymmetric
transformations of the form (146).

There are several things to note from the structure of the supersymmetry algebra in eq. (145). 
First, the energy eigenvalues of our supersymmetric theory have to be positive semi-definite since 
the operator on the left hand side of the last relation in (145) is. Furthermore, if the ground state 
is supersymmetric satisfying 

$$ Q|0\rangle = 0 = \bar{Q}|0\rangle \tag{147} $$

then, the ground state will have vanishing energy, as we had pointed out earlier as the case for our 
system. Both these results are, in fact, quite general for any supersymmetric theory. We also note 
from the structure of the algebra that the spectrum of the Hamiltonian will be doubly degenerate 
except for the ground state. Namely, if $|\psi\rangle$ is an eigenstate of the Hamiltonian, then $Q|\psi\rangle$ (or
$\bar{Q}|\psi\rangle$; only one of them would be nontrivial, depending on the form of $|\psi\rangle$) would be degenerate
in energy.
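The positivity of the spectrum can be made explicit in one line. The following is a short sketch, assuming the oscillator realization $Q = a_B^\dagger a_F$, $\bar{Q} = a_F^\dagger a_B$ (so that $\bar{Q} = Q^\dagger$) and the anti-commutator $[Q, \bar{Q}]_+ = H/\omega$ with $\omega > 0$:

```latex
% Assumption: \bar{Q} = Q^\dagger and H = \omega (Q\bar{Q} + \bar{Q}Q), \omega > 0.
\langle\psi|H|\psi\rangle
  = \omega\,\langle\psi|\left(Q\bar{Q} + \bar{Q}Q\right)|\psi\rangle
  = \omega\left(\big\|\bar{Q}|\psi\rangle\big\|^{2} + \big\|Q|\psi\rangle\big\|^{2}\right) \;\ge\; 0 .
```

Equality holds only if both $Q|\psi\rangle$ and $\bar{Q}|\psi\rangle$ vanish, which is the statement that a zero energy state must be supersymmetric.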

Let us, in fact, examine some of these general results explicitly. The spectrum of the Hamilto- 
nian in eq. (143) is, in fact, quite straightforward. The Hilbert space is a product space containing 
bosonic and fermionic oscillator states and a general state has the structure 

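$$ |n_B, n_F\rangle = |n_B\rangle \otimes |n_F\rangle = \frac{(a_B^\dagger)^{n_B}}{\sqrt{n_B!}}\,(a_F^\dagger)^{n_F}\,|0\rangle \tag{148} $$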

with energy eigenvalues 

$$ E_{n_B, n_F} = \omega(n_B + n_F), \qquad n_F = 0, 1; \quad n_B = 0, 1, 2, \cdots \tag{149} $$

where the ground state is expected to satisfy 

$$ a_B|0\rangle = 0 = a_F|0\rangle \tag{150} $$

We note that an immediate consequence of (150) is that 

$$ Q|0\rangle = 0 = \bar{Q}|0\rangle $$





and, consequently, the ground state is supersymmetric and the ground state energy is seen
from (149) to vanish. All the higher states have positive energy. Furthermore, we note that all the
states (except the ground state) of the form $|n_B, 0\rangle$ and $|n_B - 1, 1\rangle$ are degenerate in energy. Let
us also note the effect of $Q$ and $\bar{Q}$ acting on the states of the Hilbert space, namely,

$$ Q|n_B, n_F\rangle = \begin{cases} \sqrt{n_B + 1}\,|n_B + 1, n_F - 1\rangle & \text{if } n_F \neq 0 \\ 0 & \text{if } n_F = 0 \end{cases} $$

$$ \bar{Q}|n_B, n_F\rangle = \begin{cases} \sqrt{n_B}\,|n_B - 1, n_F + 1\rangle & \text{if } n_B \neq 0 \text{ and } n_F \neq 1 \\ 0 & \text{if } n_B = 0 \text{ or } n_F = 1 \end{cases} \tag{151} $$
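The action of the supercharges on the number basis can be checked with a few lines of code. The sketch below is only illustrative: it assumes the oscillator realization $Q = a_B^\dagger a_F$, $\bar{Q} = a_F^\dagger a_B$, represents a state as a dictionary mapping $(n_B, n_F)$ to an amplitude, and uses hypothetical helper names (`apply_Q`, `apply_Qbar`, `anticommutator`) that do not come from the text. It verifies that $Q\bar{Q} + \bar{Q}Q$ acts on $|n_B, n_F\rangle$ as multiplication by $n_B + n_F$, i.e. as $H/\omega$.

```python
import math


def apply_Q(state):
    """Apply Q = a_B^dagger a_F to a state {(n_B, n_F): amplitude}."""
    out = {}
    for (n_b, n_f), amp in state.items():
        if n_f == 1:  # a_F needs a fermion to remove; otherwise the result vanishes
            key = (n_b + 1, 0)
            out[key] = out.get(key, 0.0) + math.sqrt(n_b + 1) * amp
    return out


def apply_Qbar(state):
    """Apply Qbar = a_F^dagger a_B to a state {(n_B, n_F): amplitude}."""
    out = {}
    for (n_b, n_f), amp in state.items():
        if n_f == 0 and n_b > 0:  # needs an empty fermion level and at least one boson
            key = (n_b - 1, 1)
            out[key] = out.get(key, 0.0) + math.sqrt(n_b) * amp
    return out


def add_states(s1, s2):
    """Add two states given as dictionaries of amplitudes."""
    out = dict(s1)
    for key, amp in s2.items():
        out[key] = out.get(key, 0.0) + amp
    return out


def anticommutator(state):
    """Return (Q Qbar + Qbar Q) acting on the given state."""
    return add_states(apply_Q(apply_Qbar(state)), apply_Qbar(apply_Q(state)))


if __name__ == "__main__":
    for n_b in range(4):
        for n_f in (0, 1):
            result = anticommutator({(n_b, n_f): 1.0})
            # The printed eigenvalue should equal n_B + n_F
            print((n_b, n_f), result.get((n_b, n_f), 0.0))
```

The dictionary representation avoids truncating the bosonic Fock space, so the algebra holds exactly on every basis state that is tested.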


Supersymmetric Oscillator at T ≠ 0

Let us next analyze the supersymmetric oscillator at finite temperature in the formalism of thermo 
field dynamics. As we had noted earlier, this is the ideal setting to discuss questions such as symme- 
try breaking. Let us note, even before carrying out the calculations, that we expect supersymmetry 
to be broken at finite temperature. Intuitively, this is quite clear. Namely, supersymmetry takes 
bosons to fermions and vice versa and, consequently, any boundary condition that distinguishes 
between the two would lead to a breaking of this symmetry. Temperature, in fact, introduces 
such a condition, namely, bosons and fermions behave differently at finite temperature (they obey 
distinctly different statistics). However, what is not clear a priori is whether such a breaking would 
be explicit or spontaneous. 

To study the system at finite temperature within the framework of thermo field dynamics, let 
us look at the complete system, including the tilde oscillators, described by 

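$$ \hat{H} = H - \tilde{H} = \omega\left(a_B^\dagger a_B + a_F^\dagger a_F\right) - \omega\left(\tilde{a}_B^\dagger \tilde{a}_B + \tilde{a}_F^\dagger \tilde{a}_F\right) \tag{152} $$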

The Hilbert space of the doubled system has the structure 

$$ |n_B, n_F; \tilde{n}_B, \tilde{n}_F\rangle = |n_B, n_F\rangle \otimes |\tilde{n}_B, \tilde{n}_F\rangle \tag{153} $$

The thermal vacuum can now be defined (as discussed in section 3). Let us define 

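$$ G(\theta_B, \theta_F) = -i\theta_B(\beta)\left(\tilde{a}_B a_B - a_B^\dagger \tilde{a}_B^\dagger\right) - i\theta_F(\beta)\left(\tilde{a}_F a_F - a_F^\dagger \tilde{a}_F^\dagger\right) $$

with (see eqs. (58) and (72))

$$ \tan\theta_F(\beta) = e^{-\beta\omega/2} = \tanh\theta_B(\beta) \tag{154} $$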

Then, the thermal vacuum can be defined as 

$$ |0, \beta\rangle = e^{-iG(\theta_B, \theta_F)}\,|0\rangle \otimes |\tilde{0}\rangle \tag{155} $$

This also allows us to calculate the thermal operators in a straightforward manner. 

Let us note next that the expectation value of the Hamiltonian in the thermal vacuum is given 
by 

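$$ \begin{aligned}
E_0(\beta) &= \langle 0, \beta|H|0, \beta\rangle = \langle 0, \beta|\,\omega\left(a_B^\dagger a_B + a_F^\dagger a_F\right)|0, \beta\rangle \\
&= \omega\left(\sinh^2\theta_B(\beta) + \sin^2\theta_F(\beta)\right) = \frac{2\omega\, e^{-\beta\omega}}{1 - e^{-2\beta\omega}}
\end{aligned} \tag{156} $$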

This shows that the energy of the thermal vacuum is nonzero for any finite temperature signaling 
that supersymmetry is broken. Furthermore, let us note that 

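$$ \begin{aligned}
Q|0, \beta\rangle &= a_B^\dagger a_F|0, \beta\rangle = \frac{e^{-\beta\omega/2}}{\sqrt{1 - e^{-2\beta\omega}}}\,|n_B(\beta) = 1, n_F(\beta) = 0;\ \tilde{n}_B(\beta) = 0, \tilde{n}_F(\beta) = 1\rangle \\
\bar{Q}|0, \beta\rangle &= a_F^\dagger a_B|0, \beta\rangle = \frac{e^{-\beta\omega/2}}{\sqrt{1 - e^{-2\beta\omega}}}\,|n_B(\beta) = 0, n_F(\beta) = 1;\ \tilde{n}_B(\beta) = 1, \tilde{n}_F(\beta) = 0\rangle
\end{aligned} \tag{157} $$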





This, in fact, shows that supersymmetry breaking is spontaneous at finite temperature and the new 
states on the right hand side of (157) would correspond to the appropriate quasi particle Goldstino 
states associated with such a symmetry breaking. 

There are various other order parameters for the breaking of supersymmetry and all of them 
lead to the same conclusion that supersymmetry is spontaneously broken at finite temperature 
[16]. 
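The nonvanishing thermal vacuum energy can be checked numerically. The sketch below is an illustration only: it assumes the thermal angles are fixed by $\tanh\theta_B(\beta) = e^{-\beta\omega/2} = \tan\theta_F(\beta)$ and compares $\omega(\sinh^2\theta_B + \sin^2\theta_F)$ with the sum of the Bose-Einstein and Fermi-Dirac occupation numbers and with the closed form $2\omega e^{-\beta\omega}/(1 - e^{-2\beta\omega})$. The function names are hypothetical.

```python
import math


def thermal_angles(beta, omega):
    """Return (theta_B, theta_F) with tanh(theta_B) = exp(-beta*omega/2) = tan(theta_F)."""
    x = math.exp(-0.5 * beta * omega)
    return math.atanh(x), math.atan(x)


def vacuum_energy_from_angles(beta, omega):
    """omega * (sinh^2 theta_B + sin^2 theta_F)."""
    theta_b, theta_f = thermal_angles(beta, omega)
    return omega * (math.sinh(theta_b) ** 2 + math.sin(theta_f) ** 2)


def vacuum_energy_from_statistics(beta, omega):
    """omega * (Bose-Einstein occupation + Fermi-Dirac occupation)."""
    n_bose = 1.0 / (math.exp(beta * omega) - 1.0)
    n_fermi = 1.0 / (math.exp(beta * omega) + 1.0)
    return omega * (n_bose + n_fermi)


def vacuum_energy_closed_form(beta, omega):
    """2 omega exp(-beta omega) / (1 - exp(-2 beta omega))."""
    return 2.0 * omega * math.exp(-beta * omega) / (1.0 - math.exp(-2.0 * beta * omega))


if __name__ == "__main__":
    for beta in (0.1, 1.0, 5.0):
        print(beta,
              vacuum_energy_from_angles(beta, 1.0),
              vacuum_energy_from_statistics(beta, 1.0),
              vacuum_energy_closed_form(beta, 1.0))
```

All three expressions agree and are strictly positive for any finite $\beta$; they vanish only in the zero temperature limit $\beta \to \infty$, in line with the statement that supersymmetry is broken at any nonzero temperature.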

9 Conclusion 

In this article, we have tried to describe some of the interesting features of finite temperature field 
theories. There are, of course, many more topics that we have not been able to discuss. However, 
it is our hope that the topics discussed, in this article, would raise the curiosity of the readers to 
pursue various other questions in this field. 

This work was supported in part by the U.S. Dept. of Energy Grant DE-FG 02-91ER40685.


References 

[1] T. Matsubara, Prog. Theor. Phys. 14 (1955) 351. 

[2] J. Schwinger, J. Math. Phys. 2 (1961) 407; J. Schwinger, Lecture Notes Of Brandeis University 
Summer Institute (1960). 

[3] H. Umezawa, H. Matsumoto and M. Tachiki, Thermo Field Dynamics and Condensed States,
North-Holland, Amsterdam, 1982. 

[4] D. A. Kirzhnits and A. D. Linde, Phys. Lett. 42B (1979) 471; L. Dolan and R. Jackiw, Phys. 
Rev. D9 (1974) 3320; S. Weinberg, Phys. Rev. D9 (1974) 3357. 

[5] D. J. Gross, R. D. Pisarski and L. G. Yaffe, Rev. Mod. Phys. 53 (1981) 43.

[6] A. A. Anselm, Phys. Lett. B217 (1989) 169; A. A. Anselm and M. G. Ryskin, Phys. Lett.
B266 (1991) 482; J. D. Bjorken, Int. J. Mod. Phys. A7 (1992) 4189; J. P. Blaizot and A.
Krzywicki, Phys. Rev. D46 (1992) 246; K. Rajagopal and F. Wilczek, Nuc. Phys. B204 (1993)
577; P. F. Bedaque and A. Das, Mod. Phys. Lett. A8 (1993) 3151.

[7] A. Das, Field Theory: A Path Integral Approach, World Scientific (1993).

[8] A. L. Fetter and J. D. Walecka, Quantum Theory of Many Particle Systems, McGraw-Hill
(1971); A. A. Abrikosov, L. P. Gorkov and I. E. Dzyaloshinski, Methods of Quantum Field
Theory in Statistical Physics, Dover (1975).

[9] J. I. Kapusta, Finite Temperature Field Theory, Cambridge University Press (1989); M. Le
Bellac, Thermal Field Theory, Cambridge University Press (1996). 

[10] A. Das, Finite Temperature Field Theory, World Scientific (1997). 

[11] N. P. Landsman and C. G. van Weert, Phys. Rep. 145 (1987) 141. 

[12] P. M. Bakshi and K. T. Mahanthappa, J. Math. Phys. 4 (1963) 1; L. V. Keldysh, Sov. Phys. 
JETP 20 (1965) 1018; K. C. Chou et al, Phys. Rep. 118 (1985) 1. 

[13] H. A. Weldon, Phys. Rev. D47 (1993) 594; P. F. Bedaque and A. Das, Phys. Rev. D47 (1993) 
601. 

[14] G. Dunne, K. Lee and C. Lu, Phys. Rev. Lett. 78 (1997) 3434. 

[15] A. Das and G. Dunne, Phys. Rev. D57 (1998) 5023. 





[16] A. Das and M. Kaku, Phys. Rev. D18 (1978) 4540; A. Das, A. Kharev and V. S. Mathur, 
Phys. Lett. B181 (1986) 299; A. Das and V. S. Mathur, Phys. Rev. D35 (1987) 2053; A. Das, 
Physica A158 (1989) 1. 

[17] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill (1965). 

[18] L. S. Schulman, Techniques and Applications of Path Integration, John Wiley (1981); H. Kleinert,
Path Integrals, World Scientific (1995).

[19] F. A. Berezin, The Method of Second Quantization, Academic Press (1966); B. DeWitt,
Supermanifolds, Cambridge University Press (1984).

[20] R. Kubo, J. Phys. Soc. Japan, 12 (1957) 570; P. Martin and J. Schwinger, Phys. Rev. 115 
(1959) 1342. 

[21] C. Bloch, Nuc. Phys. 7 (1958) 451. 

[22] A. A. Abrikosov, L. P. Gorkov and I. E. Dzyaloshinski, Sov. Phys. JETP 9 (1959) 636; H. 
Umezawa, Y. Tomozawa and H. Ezawa, Nuovo Cim. 5 (1957) 810. 

[23] H. A. Weldon, Phys. Rev. D26 (1982) 1394; ibid D28 (1983) 2007. 

[24] G. Baym and N. Mermin, J. Math. Phys. 2 (1961) 232. 

[25] Y. Takahashi and H. Umezawa, Collective Phenomena 2 (1975) 55. 

[26] I. Ojima, Ann. Phys. 137 (1981) 1.

[27] A. J. Niemi and G. Semenoff, Ann. Phys. 152 (1984) 105.

[28] P. F. Bedaque and A. Das, Phys. Rev. D45 (1992) 2906. 

[29] A. Das and M. Hott, Mod. Phys. Lett. A9 (1994) 3383. 

[30] S. Deser, R. Jackiw and S. Templeton, Ann. Phys. 140 (1982) 372. 

[31] K. S. Babu, A. Das and P. Panigrahi, Phys. Rev. D36 (1987) 3725; I. Aitchison and J. Zuk, 
Ann. Phys. 242 (1995) 77; N. Bralic, C. Fosco and F. Schaposnik, Phys. Lett. B383 (1996) 
199; D. Cabra, E. Fradkin, G. Rossini and F. Schaposnik, Phys. Lett. B383 (1996) 434.

[32] G. Dunne, R. Jackiw and C. Trugenberger, Phys. Rev. D41 (1990) 661. 

[33] S. Deser, L. Griguolo and D. Seminara, Phys. Rev. Lett. 79 (1997) 1976; C. Fosco, G. Rossini 
and F. Schaposnik, Phys. Rev. 79 (1997) 1980; S. Deser, L. Griguolo and D. Seminara, Phys. 
Rev. D57 (1998) 7444; C. Fosco, G. Rossini and F. Schaposnik, Phys. Rev. D56 (1997) 6547. 

[34] Y. A. Gel’fand and E. P. Likhtman, JETP Lett 13 (1971) 323; P. Ramond, Phys. Rev. D3 
(1971) 2415; A. Neveu and J. Schwarz, Nuc. Phys. 31 (1971) 86; D. Volkov and V. Akulov, 
Phys. Lett. B46 (1974) 109; J. Wess and B. Zumino, Nuc. Phys. 70 (1974) 39. 

[35] P. Fayet and S. Ferrara, Phys. Rep. 32 (1977) 249; M. F. Sohnius, Phys. Rep. 128 (1985) 39. 



16. Integrable Models And The Toda Lattice Hierarchy


Bani M Sodermark * 

Dept of Engineering Sciences, Physics and Mathematics, 
Karlstad University, 65188 Karlstad, SWEDEN 


Abstract 

A pedagogical presentation of integrable models with special reference to the Toda lattice 
hierarchy has been attempted. The example of the KdV equation has been studied in detail, 
beginning with the infinite conserved quantities and going on to the Lax formalism for the 
same. We then go on to symplectic manifolds for which we construct the Lax operator. This 
formalism is applied to Toda Lattice systems. The Zakharov Shabat formalism aimed at 
encompassing all integrable models is also covered after which the zero curvature condition 
and its fallout are discussed. We then take up Toda Field Theories and their connection to 
W algebras via the Hamiltonian reduction of the WZNW model. Finally, we dwell on the 
connection between four dimensional Yang Mills theories and the KdV equation along with a 
generalization to supersymmetry. 


1 Introduction: Non-Linear Equations 


Linear partial differential equations, in particular the Schroedinger, Klein-Gordon and Dirac equa- 
tions, have been known in field theory over a long time, and have been used in many different 
problems with great success. Non-linear equations, i.e., equations where the potential term is non-linear
in the field(s), have been known for some time as well. These equations and their solutions
are the topic of the present article.

The earliest non-linear wave equations known in physics were the Liouville and Sine-Gordon 
equations. The Liouville equation arose in the context of a search for a manifold with constant 
curvature. Pictorially, such parametrizations may be likened to covering a surface with a fishing 
net. Since the knots on the fishing net do not move, the arc length is constant. The threads in the 
net correspond to a local coordinate system on the surface. 

The Liouville manifolds may be reparametrized locally so as to have a metric of the form: 


$$ g = \begin{pmatrix} \exp\rho & 0 \\ 0 & \exp\rho \end{pmatrix} \tag{1.1} $$

so as to be conformally equivalent to a flat space metric. The study of such manifolds with constant 
curvature led J. Liouville [1] to the equation known by his name: 


$$ \frac{\partial^2 \rho}{\partial x\,\partial y} = \exp\rho \tag{1.2} $$


$x$ and $y$ being local orthogonal coordinates. Interest in this equation was renewed in the 70’s and
80’s due to its appearance in string theories [2,3,4]. 

The Sine-Gordon equation, named after a pun on the Klein-Gordon equation, is an equation 
for the angle $\omega$ between two coordinate lines when the total curvature is constant and negative.


*Email: bani.sodermark@kau.se





This equation first appeared in the work of Enneper in 1870, and has the form: 


$$ \frac{\partial^2 \omega}{\partial x\,\partial y} = \sin(\omega) \tag{1.3} $$


where x and y are coordinates in a system with constant arc length. 

The Sine-Gordon equation has some interesting solutions known as solitons and breathers. A 
soliton satisfies three conditions. First, a single soliton must have constant shape and velocity. 
Secondly, it must be localized, and its derivative must vanish at infinity. Thirdly, if two solitons
collide, they should survive the collision with their shapes unchanged.

Principally, there are two types of solitons, one which increases by a fixed amount (say $2\pi$),
and is called a ‘kink’; the other which decreases by the same amount, and is called an ‘anti-kink’. 

A breather is a localized solution that varies periodically, and could be considered as a per- 
manently bound system of a kink and anti-kink. 

An interesting property of the Sine-Gordon equation is that its solutions can be mapped into 
others through the Baecklund transformation [5], and can thus be used to create new solutions 
from known solutions. It is however impossible to generate a complete set of solutions from one 
original solution, via the Baecklund transformation [5]. 

A third non-linear equation which we shall study in some detail, was discovered in 1895 by 
D.J.Korteweg and G. de Vries [6], while trying to describe the motion of water-waves in a canal. 
It has the form: 

$$ u_t + 6uu_x + u_{xxx} = 0 \tag{1.4} $$


and is also known as the KdV equation. It has been extensively studied, and many of the properties 
of non-linear wave equations that are known today, were discovered in connection with its solution. 
This equation was solved by Gardner, Greene, Kruskal, and Miura in 1967 [7-13]. Along with N.J. 
Zabusky and C. H. Su, they also found many interesting properties of the same. One of these is that 
the KdV equation has an infinite number of conservation laws, and that the conserved quantities 
of each of these laws can be used as a Hamiltonian for an integrable system. This collection of 
Hamiltonians is called the KdV hierarchy. 

There exists a theorem of classical mechanics, which states that if a Hamiltonian system with
a 2n-dimensional phase space has n functionally independent conserved quantities such that the Poisson
bracket of any two of them vanishes, i.e., the integrals of motion are in ‘involution’, the system is
completely integrable. It is clear that solutions of systems with an infinite number of conserved
quantities must be infinitely restricted. A soliton is precisely such a solution: it is a localized wave 
which retains its shape even after collisions. Intuitively, it is clear that for this to happen, there 
must be an infinite number of conservation laws, and therefore an infinite number of conserved 
quantities. The terms ‘integrable models’ and ‘solitons’ are often used synonymously. 

A system of coupled equations of motion describing a 1-dimensional crystal with non-linear 
coupling between nearest neighbour atoms, was introduced by M.Toda [14] in 1967. The equations 
of motion are 

$$ m\,\frac{d^2 r_n}{dt^2} = a\left[2e^{-r_n} - e^{-r_{n-1}} - e^{-r_{n+1}}\right] \tag{1.5} $$

where $r_n = u_{n+1} - u_n$ and $u_n(t)$ is the longitudinal displacement of the n-th atom with mass
$m$ from its equilibrium position, $a$ being a constant. These models admit soliton solutions which
have been studied experimentally on an electrical network by R. Hirota and K. Suzuki [15]. In the
continuum limit, these equations reduce to the KdV equation [5].

We see that models with exponential interactions are a source of non-linear equations, the
Liouville and Sine-Gordon equations being examples. The Liouville equation could be generalized
to include a mass term:

$$ \frac{\partial^2 \phi}{\partial x\,\partial y} + m^2\phi = \exp\phi \tag{1.6} $$

while the Sine-Gordon equation could be generalized to the "Sinh-Gordon" equation with the
replacement $\omega \to i\omega$. Thus

$$ \frac{\partial^2 \omega}{\partial x\,\partial y} + m^2\omega = \sinh\omega \tag{1.7} $$




We also have the Toda Field Theory equations 


dxdy 


( 1 . 8 ) 


Here $k_{ij}$ is the Cartan matrix for some complex Lie algebra. The simplest of these field theories
is the $A_r$ Toda field theory, and it includes the Liouville field theory for the special case $r = 1$.
There exist generalizations of the Toda equations, called "Affine Toda Equations", which have an
extra term on the RHS, taking the form:


oxoy 


(1.9) 


Here $K$ is an affine Cartan matrix, and $R_i$ the right null vector for this matrix when $R_0$ is normalized
to unity.

These models include the Sinh-Gordon equation as a special case. Both the Toda and Affine
Toda field theories have an infinite number of conserved quantities [16]. They admit soliton solutions
with an imaginary $\phi_i$ [17]. Both models have been formally solved by Leznov and Saveliev
[18].

The Toda field theories can be obtained from the Toda Lattice by setting 


{<t>i+l ~ <t^i) 


( 1 . 10 ) 


whence 


__ 

dt^ dx^ 


-[2e^‘ 


( 1 . 11 ) 


for SU(n + 1), showing that the space-independent solutions of (1.11) satisfy (1.5). 

Since the Toda field theories are the $\gamma = 0$ limits of the Affine Toda field theories, they could
be used to classify 2-dimensional models with a second order phase transition, with the Toda field 
theory describing the model at the critical point where it has to be conformally invariant [19]. 
Hence the great interest in (Affine) Toda field theories. However the precise connection is still 
unclear. Central charges and critical exponents have been calculated and compared. One hopes 
that the Affine Toda field theories are perturbations that correspond to the physical model away 
from the critical point. However, more explicit connections are yet to be found. 

The method originally used for solving non-linear equations, and especially the KdV equation,
was the inverse scattering method originated by Gelfand and Levitan [20]. This involved looking
for a linear equation related to the original non-linear equation, and studying the evolution of the
latter. In 1968, P. Lax provided this method with a solid theoretical framework [21]. The Lax
equation is

$$ L_t + [L, M] = 0 \tag{1.12} $$

where $L$ and $M$ are operators satisfying

$$ L\psi = \lambda\psi \tag{1.13} $$

and

$$ \psi_t = M\psi \tag{1.14} $$

where $\lambda$ is a scalar, and $\psi$ a solution of a linear equation which is just the Schroedinger equation
for the KdV case. The Lax equation was generalized to the form of a zero curvature condition
which greatly simplifies the form of the transition matrix from the initial to the final state.

In what follows, we attempt to give a pedagogical presentation of Integrable Systems with 
special emphasis on the KdV and Toda systems. After an introduction to the KdV equation 
and its properties, we show how an infinite number of conserved quantities arise via the Miura
[8] transformation, while detailed calculations are referred to ref. [22]. We then dwell on solutions
of the KdV equation via the inverse scattering method and the Lax formalism [21], after which 
we obtain the Lax operator for symplectic manifolds, using the Toda Lattice as an example. The 





group structure of the Toda equations for SU (N) is also studied. The Lax transformation was later 
generalized by Zakharov and Shabat [23] to a first order formalism which was used by Ablowitz, 
Kaup, Newell and Segur (AKNS) [24], for a unified description of other integrable models. The
essential features of this approach are also discussed. A fall-out of the above is the ‘zero curvature 
condition’ that facilitates the transition to the quantum case. However, the treatment we follow is 
strictly classical. 

Next we take up the Toda field theories, and after reporting briefly the connection with 
conformal invariance, dwell on the Hamiltonian reduction of the WZNW model to the Toda field 
theory, which in effect transforms an affine Lie Algebra to a W-Algebra. (Most calculational details 
are skipped, but may be found in the literature [25]). Finally we refer to the interesting connection 
between the 4D self-dual Yang-Mills theory and 2D Integrable models, and the generalization to 
SuperSymmetry. 

The material is presented as follows. In Sect.2, we introduce the KdV equation and its 
conserved quantities. In Sect.3, solutions of non-linear equations are taken up, in particular the 
inverse scattering method and the Lax formalism. In Sect.4, we digress to Symplectic Manifolds 
and construct conserved quantities for these manifolds. Sect.5 applies the above framework to the 
Toda Lattice where the group structure of the Toda equations is also discussed. In Sect. 6, we 
take up the unifying first order formalism of Zakharov and Shabat [23], continuing in Sect. 7 to the 
zero curvature formalism and its ramifications. In Sect.8, we take up Conformal Invariance, and 
introduce Toda Field Theories which are constructed independently of the Toda Lattice. In Sect. 9, 
we carry out the Hamiltonian reduction of the WZNW model to Toda Field Theories. Finally 
in Sect. 10, we take up the connection of Toda Field Theories with Self-dual Yang-Mills models. 
Sect. 11 contains some concluding remarks. 


2 The KdV Equation 


The KdV equation was formulated to explain the solitary water waves observed by J. Scott Russell
in the Edinburgh-Glasgow canal. It is a non-linear equation in one space and one time dimension
and possesses soliton solutions. Of this, however, nothing was known at the time of its formulation.
The KdV equation after an initial scaling takes the form

$$ \frac{\partial u}{\partial t} = u\,\frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} \tag{2.1} $$


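This equation is Galilean invariant, but not Lorentz invariant. It can be derived from the
Hamiltonian

$$ H = \int dx \left( \frac{u^3}{3!} - \frac{1}{2}\left(\frac{\partial u}{\partial x}\right)^2 \right) \tag{2.2} $$

where the $u(x)$ satisfy the Poisson bracket relations

$$ [u(x), u(y)] = \partial_x \delta(x - y) \tag{2.3} $$

However, the Lagrangian from which it can be derived is non-local:

$$ L_{KdV} = \frac{1}{2}\int dx\,dy\; u(x)\,\epsilon(x - y)\,\frac{\partial u(y)}{\partial t} - \int dx \left( \frac{u^3}{3!} - \frac{1}{2}\left(\frac{\partial u}{\partial x}\right)^2 \right) \tag{2.4} $$

where

$$ \epsilon(x - y) = \theta(x - y) - \frac{1}{2} \tag{2.5} $$

$\theta$ being the step function. Ergo, one cannot write down a local Lagrangian whose Euler-Lagrange
equations yield the KdV equation.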

Solutions of the KdV equation can be shown to be soliton solutions which travel without any 
change of shape. It is the non-linear term which is responsible for the above property. 
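As an illustration of a shape-preserving solution, the sketch below checks numerically that a sech-squared profile solves the KdV equation in the scaled form $u_t = u u_x + u_{xxx}$. The profile $u = 12k^2\,\mathrm{sech}^2\!\big(k(x + 4k^2 t)\big)$ is an assumption introduced here (it is obtained by inserting a travelling wave ansatz into that equation), and the helper names and step size are hypothetical choices. The residual is estimated with central finite differences, so it is only expected to vanish up to discretization error.

```python
import math


def soliton(x, t, k=0.5):
    """Trial one-soliton profile u = 12 k^2 sech^2(k (x + 4 k^2 t))."""
    xi = k * (x + 4.0 * k * k * t)
    return 12.0 * k * k / math.cosh(xi) ** 2


def kdv_residual(u, x, t, h=1e-2):
    """Central-difference estimate of u_t - u u_x - u_xxx at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h ** 3)
    return u_t - u(x, t) * u_x - u_xxx


if __name__ == "__main__":
    for x in (-5.0, -2.0, 0.0, 1.0, 3.0):
        # Residuals should be small compared with the amplitude 12 k^2 = 3
        print(x, kdv_residual(soliton, x, 0.7))
```

In this normalization the amplitude $12k^2$ and the speed $4k^2$ are tied together, so taller solitons are narrower and move faster; a profile with the same shape but a different speed leaves a residual of order one.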





What is most interesting about the KdV equation is that it admits of an infinite number of 
conserved quantities as was shown by Miura [8]. This procedure is explained below. 

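The KdV equation is related to another equation called the modified KdV (MKdV) equation,
viz.,

$$ \frac{\partial v}{\partial t} = v^2\,\frac{\partial v}{\partial x} + \frac{\partial^3 v}{\partial x^3} \tag{2.6} $$

where $v$ is related to $u$ in the KdV equation through the Riccati transformation

$$ u = v^2 + i\sqrt{6}\,\frac{\partial v}{\partial x} \tag{2.7} $$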

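The MKdV equation is, however, not Galilean invariant. Under the transformation

$$ t \to t; \quad x \to x + \frac{3t}{2\epsilon^2}; \quad u \to u + \frac{3}{2\epsilon^2}; \quad v \to \frac{\epsilon v}{\sqrt{6}} + \frac{\sqrt{6}}{2\epsilon} \tag{2.8} $$

it reduces to

$$ \partial_t v = \left(\frac{\epsilon^2 v^2}{6} + v\right)\partial_x v + \partial_x^3 v = \partial_x\left(\frac{\epsilon^2 v^3}{18} + \frac{v^2}{2} + \partial_x^2 v\right) \tag{2.9} $$

This yields a solution of the KdV equation through the transformation

$$ u = \frac{\epsilon^2 v^2}{6} + v + i\epsilon\,\partial_x v \tag{2.10} $$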

The second form of (2.9) is in the nature of a continuity equation, so that we can identify

$$ \int_{-\infty}^{+\infty} dx\; v(x, t) \tag{2.11} $$

as the conserved quantities. $v$ can be inverted in terms of $u$ as

$$ v = \sum_{n=0}^{\infty} \epsilon^n\, v_n(u(x, t)) \tag{2.12} $$

and this yields the $v_n(u(x, t))$ as the conserved densities, since each power of $\epsilon$ must independently
satisfy a continuity equation. That these are also in involution can be checked, as shown explicitly
by Das [22]. Some of the conserved densities are

$$ v_1 = -i\,\partial_x u; \quad v_2 = -\frac{u^2}{6} - \partial_x^2 u; \quad v_3 = i\,\partial_x\left(\frac{u^2}{3} + \partial_x^2 u\right) \tag{2.13} $$
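The first few densities can be generated recursively. The sketch below assumes the relation $u = v + i\epsilon\,\partial_x v + \epsilon^2 v^2/6$ between $u$ and $v$ and the power series $v = \sum_n \epsilon^n v_n$; comparing powers of $\epsilon$ then gives

```latex
% Order-by-order inversion of u = v + i\epsilon\,\partial_x v + \epsilon^2 v^2/6.
\begin{aligned}
v_0 &= u, \qquad v_1 = -i\,\partial_x v_0 = -i\,\partial_x u, \\
v_n &= -i\,\partial_x v_{n-1} - \frac{1}{6}\sum_{k=0}^{n-2} v_k\, v_{n-2-k}, \qquad n \ge 2, \\
v_2 &= -\partial_x^2 u - \frac{u^2}{6}, \qquad
v_3 = i\,\partial_x\!\left(\partial_x^2 u + \frac{u^2}{3}\right).
\end{aligned}
```

The odd densities such as $v_1$ and $v_3$ are total derivatives and integrate to zero for localized $u$, so the nontrivial conserved charges come from the even orders.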


3 The Lax Framework 

Linear Hamiltonian systems with fixed initial value problems can be solved using the Laplace 
or Fourier transformations. Such methods are inapplicable for the nonlinear equations and new 
methods must be found. Gardner, Greene, Kruskal and Miura [9] managed to solve the initial
value problem for the KdV equation in a very ingenious way. In subsequent years, this method 
has become the standard method for solving non-linear systems and goes by the name of inverse 
scattering theory [20,21]. This method is outlined in Fig 1. 

The initial value for the partial differential equation is used as the potential in a 1-dimensional
scattering problem for a linear equation, e.g. the Schroedinger equation. One then finds the so 
called scattering data, i.e. discrete spectrum, normalization constants, reflection constants (as a 
function of the wave number) for this scattering problem. Using the partial differential equation 
(pde) evaluated for |x| asymptotically large, (and hence the pde becomes a linear equation because 
the potential is assumed to vanish at spatial infinity), the values of the scattering data can be 
found for all later times. Finally, the scattering data allow one to reconstruct the potential, and 
hence the solution of the pde for any later time. 





Figure 1: 



One would intuitively like a better understanding of the origin and relevance of the linear
Schroedinger equation. One way to see this is through a generalized Riccati relation of the form:

$$ u + 6\lambda = v^2 + i\sqrt{6}\,\frac{\partial v}{\partial x} \tag{3.1} $$

so that the KdV relation (2.1) reduces to

$$ \frac{\partial v}{\partial t} = (v^2 - 6\lambda)\,\frac{\partial v}{\partial x} + \frac{\partial^3 v}{\partial x^3} \tag{3.2} $$

As mentioned earlier, a solution of the MKdV equation yields a solution of the KdV equation
through the Riccati relation. The simplest way to attempt an inversion of the Riccati relation is
to linearize it. To that end we define

$$ v = i\sqrt{6}\,\frac{\psi_x}{\psi} \tag{3.3} $$

so that (3.1) takes the form

$$ u + 6\lambda = -6\,\frac{\psi_{xx}}{\psi}, \tag{3.4} $$

or equivalently,

$$ \psi_{xx} + \left(\frac{u}{6} + \lambda\right)\psi = 0 \tag{3.5} $$

which is, in fact, the time-independent Schroedinger equation. There exists however a more formal
theory due to Lax [21], which we now elaborate.

Given a linear equation described by a time-independent Hamiltonian $H$, and an operator $A$
whose expectation values are time independent, $A(t)$ is unitarily equivalent to $A(0)$:

$$ U^\dagger(t)\,A(t)\,U(t) = A(0) \tag{3.6} $$

where $U(t)$ is the time-evolution operator with the form

$$ U(t) = \exp[-iHt] \tag{3.7} $$

Differentiating (3.6) gives

$$ U^\dagger(t)\left(\frac{\partial A}{\partial t} - i[A, H]\right)U(t) = 0 \tag{3.8} $$








which implies

$$ \frac{\partial A}{\partial t} = i[A, H] \tag{3.9} $$

Thus for the expectation value of $A(t)$ to be time independent, the standard time evolution relation
(3.8) must be satisfied. Further, from eq. (3.7) follows the relation

$$ \frac{\partial U(t)}{\partial t} = -iHU(t) = BU(t) \tag{3.10} $$

where

$$ B = -iH \tag{3.11} $$

is an anti-Hermitian operator.

This argument is mimicked in the case of a non-linear evolution equation. Let

$$ L(u(x, t)) = L(t) \tag{3.12} $$

denote the linear operator we seek. We assume it to be Hermitian, and to have eigenvalues
independent of $t$. For this to be true, one must have $U^\dagger(t)L(t)U(t) = L(0)$. Differentiating both
sides w.r.t. $t$, we obtain

$$ \frac{\partial U^\dagger(t)}{\partial t}L(t)U(t) + U^\dagger(t)\frac{\partial L(t)}{\partial t}U(t) + U^\dagger(t)L(t)\frac{\partial U(t)}{\partial t} = 0 \tag{3.13} $$

Unlike the linear case, we do not know the form of $U(t)$. However, $U$ is unitary, so

$$ U^\dagger U = 1 \;\Rightarrow\; \frac{\partial U^\dagger}{\partial t}U + U^\dagger\frac{\partial U}{\partial t} = 0 \tag{3.14} $$

Thus we can write

$$ \frac{\partial U}{\partial t} = BU \tag{3.15} $$

where anti-hermiticity must be imposed on $B$. Substitution in (3.13), and a little simplification,
yields

$$ \frac{\partial L}{\partial t} = [B, L] \tag{3.16} $$

which is similar to (3.8), except for the fact that we do not yet know the form of $B$. However, let us
assume that $L(t)$ is linear in $u(x, t)$. Consequently, the LHS of (3.16) is a multiplicative operator,
proportional to the time derivative of $u(x, t)$. This would ensure that the eigenvalues $\lambda$
of $L(t)$ would be time-independent, i.e.,

$$ \frac{\partial \lambda}{\partial t} = 0 \tag{3.17} $$

Further, $\psi(t)$ must be unitarily related to its value at $t = 0$, i.e.,

$$ \psi(t) = U(t)\,\psi(0) \tag{3.18} $$

and its evolution w.r.t. time would take the form

$$ \frac{\partial \psi}{\partial t} = B\psi \tag{3.19} $$

The operators $L(t)$ and $B(t)$, when they exist, are known as the Lax pair, corresponding to a given
non-linear evolution equation, and play a fundamental role in determining the solution. For the
KdV equation, $L(t)$ is obtained from the linear form of the Schroedinger equation (3.5):

$$ L(t) = D^2 + \frac{u}{6}, \qquad D \equiv \frac{\partial}{\partial x} \tag{3.20} $$

By trial and error, $B(t)$ can be chosen so that (3.16) is satisfied, and a possible solution is

$$ B(t) = 4D^3 + \frac{1}{2}(Du + uD) \tag{3.21} $$

The solution for $\psi$ w.r.t. $t$ follows from (3.19) and (3.21) to be

$$ \psi_t = 4\psi_{xxx} + \frac{1}{2}u_x\psi + u\psi_x + \text{const.}\,\psi \tag{3.22} $$

which yields, using the Schroedinger equation (3.5):

$$ \psi_t + \frac{1}{6}u_x\psi - \frac{1}{3}(u - 12\lambda)\psi_x = \text{const.}\,\psi \tag{3.23} $$

A. Lenard [26], in an unpublished report, further displayed the relation between the Schroedinger
equation and the KdV equation by elegantly deriving the latter from the former, using only the
assumption that the spectral parameter $\lambda$ in (3.4) is time-independent.
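That a Lax pair of this type reproduces the KdV equation can be checked by direct computation. The sketch below assumes $L = D^2 + u/6$ and $B = 4D^3 + \tfrac{1}{2}(Du + uD) = 4D^3 + uD + \tfrac{1}{2}u_x$ with $D = \partial_x$, acting on an arbitrary test function:

```latex
% Commutators needed for \partial L/\partial t = [B, L]; note [4D^3, D^2] = 0.
\begin{aligned}
[4D^3,\, u/6] &= \tfrac{2}{3}\,u_{xxx} + 2u_{xx}D + 2u_x D^2, \\
[uD + \tfrac{1}{2}u_x,\, D^2] &= -2u_x D^2 - 2u_{xx}D - \tfrac{1}{2}\,u_{xxx}, \\
[uD + \tfrac{1}{2}u_x,\, u/6] &= \tfrac{1}{6}\,u\,u_x, \\
[B, L] &= \tfrac{1}{6}\left(u\,u_x + u_{xxx}\right).
\end{aligned}
```

All terms containing $D$ and $D^2$ cancel, so $[B, L]$ is a pure multiplication operator. Since $\partial L/\partial t = u_t/6$, the Lax equation $\partial L/\partial t = [B, L]$ is equivalent to $u_t = u\,u_x + u_{xxx}$.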

The KdV equation exhibits also a fascinating symmetry, i.e., that of the group SL(2,R). 
Consider a group element 

g = exp[i^''Ta] (3.24) 

where $T_a$ is a generator of $SL(2, R)$, and define

$$ A_\mu = g^{-1}\partial_\mu g \tag{3.25} $$

Then the KdV equation follows from the fact that the Maurer-Cartan equation

$$ \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] = 0 \tag{3.26} $$

is satisfied for a special choice of gauge, e.g.,

Al = -VX] (A < 0); Al = 6; Af = Al = A{u{x,t)) (3.27) 


4 Lax Formalism On Symplectic Manifolds 

In this Section, we conclude the above study of the KdV equation with a short discussion
on symplectic geometry, which is directly relevant for application to the Toda Lattice.

A symplectic manifold is one with a preferred 2-form which is non-degenerate and closed. 
The phase space of an integrable model corresponds to a very special symplectic manifold, since 
it possesses a dual Poisson bracket structure. We assume that there exist two distinct 2-forms 
which are both non-degenerate and closed. One way of expressing the existence of two distinct 
symplectic structures is to require that the same dynamical equation be described by two distinct 
first order Lagrangians Lq and L, where 


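$$ L_0 = \theta^{(0)}_\mu(y)\,\dot{y}^\mu - H_0(y), \tag{4.1} $$

$$ L = \theta_\mu(y)\,\dot{y}^\mu - H(y) \tag{4.2} $$

where

$$ \dot{y}^\mu = \frac{dy^\mu}{dt}; \qquad [\mu = 1, 2, \ldots, 2N] \tag{4.3} $$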

(4.3) 

The Euler-Lagrange equations following from (4.1-2) are

$$ f_{\mu\nu}(y)\,\dot{y}^\nu = \partial_\mu H_0(y) \tag{4.4} $$

$$ F_{\mu\nu}(y)\,\dot{y}^\nu = \partial_\mu H(y) \tag{4.5} $$





W llCl u 

Uu = d^6l°Ky)-d.e<i°Hy) 

(4.6) 

= d^du{y) - djf,{y) 

(4.7) 

It is easy to see that the two forms / and are closed, where 


/ = ^Ui'dy'^ A dy’' 

(4.8) 

F = ^F^„dy>" A dy" 

(4.9) 

since and satisfy the Bianchi identities 


d" "f" 9uf\fji ~ 0; 

(4.10) 

and 

dxF^i> + duFux + duFxfj, = 0. 

(4.11) 

Besides, they must also be non-degenerate since (4.4) and (4.5) describe the 
system. Let their universes be and F^'', i.e., 

same dynamical 

Uur^ = F^,F''^=6l 

(4.12) 

so that (4.4-5) take the forms 

f = rdMv) 

(4.13) 

r = F-'i^d^Hiy). 

(4.14) 

We can also construct a nontrivial (1,1) tensor as 

$S_\mu^{\ \nu} = F_{\mu\lambda}(y)\, f^{\lambda\nu}(y).$ (4.15) 

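Consistency of (4.4) and (4.5) further requires that 

$\partial_\mu\partial_\nu H_0(y) - \partial_\nu\partial_\mu H_0(y) = 0$ (4.16) 

so that after a little algebra, one can show that 

$\frac{df_{\mu\nu}}{dt} = -U_\mu^{\ \lambda} f_{\lambda\nu} + U_\nu^{\ \lambda} f_{\lambda\mu}$ (4.17) 

where 

$U_\mu^{\ \nu} = \partial_\mu \dot{y}^\nu = \partial_\mu[f^{\nu\lambda}\partial_\lambda H_0(y)] = \partial_\mu[F^{\nu\lambda}\partial_\lambda H(y)]$ (4.18) 

with a corresponding relation for $F_{\mu\nu}$, i.e., 

$\frac{dF_{\mu\nu}}{dt} = -U_\mu^{\ \lambda} F_{\lambda\nu} + U_\nu^{\ \lambda} F_{\lambda\mu}$ (4.19) 

involving the same U-tensor. The corresponding equations for the inverses $f^{\mu\nu}$ and $F^{\mu\nu}$ follow 
from (4.17) and (4.19), and have the forms 

$\frac{df^{\mu\nu}}{dt} = f^{\mu\lambda} U_\lambda^{\ \nu} - f^{\nu\lambda} U_\lambda^{\ \mu}$ (4.20) 

$\frac{dF^{\mu\nu}}{dt} = F^{\mu\lambda} U_\lambda^{\ \nu} - F^{\nu\lambda} U_\lambda^{\ \mu}$ (4.21) 

We can finally show that 

$\frac{dS_\mu^{\ \nu}}{dt} = S_\mu^{\ \lambda} U_\lambda^{\ \nu} - U_\mu^{\ \lambda} S_\lambda^{\ \nu}$ (4.22) 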





which in matrix notation 

$\frac{dS}{dt} = [S, U]$ (4.23) 

can be recognized as a Lax equation (3.15), thus providing a Lax representation of the dynamical 
equations (4.13) and (4.14). One important consequence of (4.23) is that the set of quantities 

$K_n = \frac{1}{n}\,\mathrm{Tr}\,S^n$ (4.24) 

and 

$K_0 = \ln|\det S|$ (4.25) 

can be shown to be invariants since 

$\frac{dK_n}{dt} = \mathrm{Tr}\left[P(S)\frac{dS}{dt}\right] = \mathrm{Tr}\left[P(S)[S, U]\right] = 0$ (4.26) 

where $P(S)$ is a polynomial in $S$. That these are in involution can easily be checked, as done explicitly 
in ref. [22]. Applied to the KdV equation, the two Poisson structures of that equation are given by 
the correspondence: 

i}; (4.27) 

^ jqZ Du + UD) (4.28) 

o 

Going to the coordinate bases we have 


F{x, y) =< y \ 10 \ X >= d\5(x - y) (4.29) 

/(^> = + + uda:)5{x - y) (4.30) 

OX 3 

so that 

(x, y) = e(x -y) = 9(3:- y) (4.31) 
However /“^(x — y) cannot be expressed in a closed form. The Lax operator S takes the form 

S = D^ + lu + l(Du)D-\ (4.32) 

o O 

and with a little algebra, (4.23) can be shown to reduce to the KdV equation, with consequently 
an infinite number of conserved quantities. This is described in detail in ref. [22]. 


5 The Toda Lattice 

The model of the KdV equation that has been studied so far is a continuum model. A finite 
dimensional system with a finite number of degrees of freedom is simpler to study. The Toda Lattice is 
such a system to which the symplectic approach of the above Section is especially applicable. We 
now study the Toda Lattice and its integrability from a symplectic point of view, following it up 
with a group theoretical treatment. 

The Toda Lattice describes the motion of N point masses on the line, under the influence of 
an exponential interaction. The Hamiltonian equations in terms of the canonical coordinates $Q_i$ 
and momenta $P_i$ are given by 

$\dot{Q}_i = P_i, \quad (i = 1, 2, \ldots, N);$ 

$\dot{P}_i = e^{-(Q_i - Q_{i-1})} - e^{-(Q_{i+1} - Q_i)}, \quad (i = 2, 3, \ldots, N-1);$ 

$\dot{P}_1 = -e^{-(Q_2 - Q_1)}; \quad \dot{P}_N = e^{-(Q_N - Q_{N-1})}.$ (5.1) 





The equations can be cast into a more symmetrical form by enlarging the system to (N + 2) point 
masses, with end points at spatial infinity. In that case, the Hamiltonian equations take the form: 

$\dot{Q}_i = P_i; \quad \dot{P}_i = e^{-(Q_i - Q_{i-1})} - e^{-(Q_{i+1} - Q_i)}, \quad (i = 1, 2, \ldots, N)$ (5.2) 

We can choose 

$y^i = Q_i; \quad y^{N+i} = P_i, \quad (i = 1, 2, \ldots, N).$ (5.3) 

Applying the geometrical method of the previous Section, two choices of the Lagrangian are as 
follows: 

TV .. , 

Lo = Y^l^iPiQi - QiPi) - (5.4) 

i=l 


L = J2^^{Pi + + MP)Pi] - H{Q, P) 

i— 1 

(5.5) 

where 

n{P) = l'Ze{i-j)Pf, 

j-l 

(5.6) 


H{Q,P) = + {Pi + 

(5.7) 

turns out to have the canonical Poisson bracket structure 



/.-(? -oO 

(5.8) 

so that 

11 

1 o 

(5.9) 

$F_{\mu\nu}$ can be shown to have the form [22] 



1 ) 

(5.10) 


where 


$B_{ij} = P_i\,\delta_{ij}; \quad \epsilon_{ij} = \epsilon(j - i)$ 

The (1, 1) tensor thus takes the form 



and the conserved quantities are 


N 

TrS=2TrB = 2^Pil 
1=1 


N 


P? 


-TrS^ = Tr[2B^ - (Ae + eA)] = = Ho{Q,Py, 


1=1 


(5.11) 


(5.12) 


(5.13) 


(5.14) 





The Lax representation (4.23) for the Toda equation takes the form of the following matrix 
equations 

dA 

(5.16) 


dt 

^-§ = A-De = \[e.D) 


(5.17) 


which reduce to the Toda equations $\dot{Q}_i = P_i$ and $\dot{P}_i = e^{-(Q_i - Q_{i-1})} - e^{-(Q_{i+1} - Q_i)}$ respectively. 

5.1 Group Structure of Toda Equations 

Eq.(5.1) can be differentiated and put in the form 

$\ddot{Q}_1 = -e^{-(Q_2 - Q_1)};$ 

$\ddot{Q}_i = \dot{P}_i = e^{-(Q_i - Q_{i-1})} - e^{-(Q_{i+1} - Q_i)};$ 

$\ddot{Q}_N = e^{-(Q_N - Q_{N-1})}.$ (5.18) 

It is easily checked that 

$\sum_{i=1}^{N} \ddot{Q}_i = \sum_{i=1}^{N} \dot{P}_i = 0$ (5.19) 

i.e., the total momentum is conserved, and therefore the centre of mass motion can be separated 
and the dynamics of the system expressed in terms of (N - 1) coordinates and momenta. Defining 

$q_a = Q_{a+1} - Q_a, \quad a = 1, 2, \ldots, N-1,$ 

the second order equations satisfied by the $q_a$ can be written as 

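$\ddot{q}_1 = 2e^{-q_1} - e^{-q_2}$ (5.20) 

$\ddot{q}_a = 2e^{-q_a} - e^{-q_{a+1}} - e^{-q_{a-1}}, \quad a = 2, \ldots, N-1$ (5.21) 

(the $e^{-q_N}$ term being absent for $a = N-1$), which can be compactly written as 

$\ddot{q}_a = \sum_{b=1}^{N-1} K_{ab}\, e^{-q_b}$ (5.22) 

$K_{ab}$ being the Cartan matrix for SU(N). Eq.(5.22) generalizes to the other Lie Algebras as well. 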
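The Lagrangian giving rise to the above Euler-Lagrange equations can be written as 

$L = \frac{1}{2}\sum_{a=1}^{N-1}\sum_{b=1}^{N-1} K^{-1}_{ab}\,\dot{q}_a\dot{q}_b - \sum_{a=1}^{N-1} e^{-q_a}$ (5.23) 

$K^{-1}_{ab}$ being the inverse of the Cartan matrix. The momenta conjugate to $q_a$ are defined as 

$p_a = \frac{\partial L}{\partial \dot{q}_a} = \sum_{b=1}^{N-1} K^{-1}_{ab}\,\dot{q}_b$ (5.24) 

and it is easily checked that 

$\{q_a, p_b\} = \delta_{ab}$ (5.25) 

so that $(q_a, p_a)$ constitute a canonical coordinate system. 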
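
The passage from the lattice equations to the Cartan-matrix form can be checked numerically. The following is a minimal sketch, assuming the open-chain equations in the form $\ddot{Q}_i = e^{-(Q_i - Q_{i-1})} - e^{-(Q_{i+1} - Q_i)}$ (with the missing neighbour terms dropped at the two ends) and relative coordinates $q_a = Q_{a+1} - Q_a$; the function names and the sample positions are illustrative choices, not taken from the text.

```python
import math

def cartan_matrix(n):
    """Cartan matrix of type A_n, i.e. SU(n+1): 2 on the diagonal, -1 beside it."""
    return [[2 if a == b else -1 if abs(a - b) == 1 else 0
             for b in range(n)] for a in range(n)]

def lattice_accelerations(Q):
    """Right-hand sides of the open-chain Toda equations for Q_1..Q_N."""
    N = len(Q)
    acc = []
    for i in range(N):
        left = math.exp(-(Q[i] - Q[i - 1])) if i > 0 else 0.0
        right = math.exp(-(Q[i + 1] - Q[i])) if i < N - 1 else 0.0
        acc.append(left - right)
    return acc

Q = [0.0, 0.7, 1.1, 2.4, 2.9]            # arbitrary sample positions, N = 5
acc = lattice_accelerations(Q)

# Relative coordinates q_a = Q_{a+1} - Q_a and their accelerations
q = [Q[a + 1] - Q[a] for a in range(len(Q) - 1)]
qdd_direct = [acc[a + 1] - acc[a] for a in range(len(Q) - 1)]

# The same accelerations from the Cartan-matrix form sum_b K_ab exp(-q_b)
K = cartan_matrix(len(q))
qdd_cartan = [sum(K[a][b] * math.exp(-q[b]) for b in range(len(q)))
              for a in range(len(q))]

total_force = sum(acc)                    # vanishes: total momentum is conserved
max_diff = max(abs(u - v) for u, v in zip(qdd_direct, qdd_cartan))
print(total_force, max_diff)
```

Both printed numbers should vanish up to rounding: the first expresses conservation of the total momentum, the second the agreement of the two forms of the equations of motion.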

That the group structure entering above is not just accidental can be seen by defining the 
following Lax operators: 

$S = \frac{1}{2}\sum_{a=1}^{N-1}\left[p_a H_a + (E_a + E_{-a})\,e^{-q_a/2}\right];$ (5.26) 



$U = -\frac{1}{2}\sum_{a=1}^{N-1} e^{-q_a/2}\left[E_a - E_{-a}\right]$ (5.27) 

where $H_a$ and $E_{\pm a}$ are the generators of SU(N) in the Chevalley basis. 

The Lax equation (4.23) can be seen to be satisfied, since $\frac{dS}{dt} - [S, U]$ reduces to 

$\frac{1}{2}\sum_{a,b=1}^{N-1} H_a K^{-1}_{ab}\,\ddot{q}_b - \frac{1}{2}\sum_{c=1}^{N-1} H_c\, e^{-q_c}$ 

which is zero by virtue of the Toda equations (5.22). Hence the quantities 

$K_n = \frac{1}{n}\,\mathrm{Tr}\,S^n$ (5.28) 

must be conserved under the flow of the Toda equations. Since S belongs to the SU(N) algebra, 
the number of independent conserved quantities can equal (N - 1), which is the rank of SU(N). 
The total number of conserved quantities is thus N, if we add the total momentum. It can be 
shown that these are also in involution [22]. This treatment is due to Leznov and Saveliev [18]. 
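
The conservation of the traces of powers of the Lax operator can be illustrated numerically. The sketch below rests on several assumptions that are not fixed by the text: the generators are taken in the defining N x N representation ($H_a = e_{aa} - e_{a+1,a+1}$, $E_a = e_{a,a+1}$, $E_{-a} = e_{a+1,a}$), the Lax pair is normalized as $S = \frac{1}{2}\sum_a [p_a H_a + (E_a + E_{-a})e^{-q_a/2}]$ and $U = -\frac{1}{2}\sum_a e^{-q_a/2}(E_a - E_{-a})$, and the flow is written in Hamiltonian form $\dot{q}_a = \sum_b K_{ab}p_b$, $\dot{p}_a = e^{-q_a}$. The fourth-order Runge-Kutta integrator, the step size and the initial data are arbitrary illustrative choices.

```python
import math

N = 4                      # SU(N); there are N - 1 relative coordinates
R = N - 1

def K(a, b):
    """Cartan matrix entry of SU(N)."""
    return 2 if a == b else -1 if abs(a - b) == 1 else 0

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def lax_pair(q, p):
    """S and U in the defining representation (assumed normalization)."""
    S = [[0.0] * N for _ in range(N)]
    U = [[0.0] * N for _ in range(N)]
    for a in range(R):
        x = math.exp(-q[a] / 2)
        S[a][a] += 0.5 * p[a]
        S[a + 1][a + 1] -= 0.5 * p[a]
        S[a][a + 1] = S[a + 1][a] = 0.5 * x
        U[a][a + 1] = -0.5 * x
        U[a + 1][a] = 0.5 * x
    return S, U

def flow(q, p):
    """Hamilton's equations: dq_a/dt = sum_b K_ab p_b, dp_a/dt = exp(-q_a)."""
    dq = [sum(K(a, b) * p[b] for b in range(R)) for a in range(R)]
    dp = [math.exp(-q[a]) for a in range(R)]
    return dq, dp

def rk4_step(q, p, h):
    def shifted(dq, dp, s):
        return ([u + s * d for u, d in zip(q, dq)],
                [v + s * d for v, d in zip(p, dp)])
    k1 = flow(q, p)
    k2 = flow(*shifted(*k1, h / 2))
    k3 = flow(*shifted(*k2, h / 2))
    k4 = flow(*shifted(*k3, h))
    q_new = [u + h / 6 * (a + 2 * b + 2 * c + d)
             for u, a, b, c, d in zip(q, k1[0], k2[0], k3[0], k4[0])]
    p_new = [v + h / 6 * (a + 2 * b + 2 * c + d)
             for v, a, b, c, d in zip(p, k1[1], k2[1], k3[1], k4[1])]
    return q_new, p_new

def invariants(q, p, n_max=4):
    """K_n = Tr(S^n) / n for n = 1..n_max."""
    S, _ = lax_pair(q, p)
    power, out = S, []
    for n in range(1, n_max + 1):
        out.append(sum(power[i][i] for i in range(N)) / n)
        power = matmul(power, S)
    return out

def lax_residual(q, p):
    """Largest entry of dS/dt - [S, U], with dS/dt evaluated from the flow."""
    S, U = lax_pair(q, p)
    dq, dp = flow(q, p)
    dS = [[0.0] * N for _ in range(N)]
    for a in range(R):
        x = math.exp(-q[a] / 2)
        dS[a][a] += 0.5 * dp[a]
        dS[a + 1][a + 1] -= 0.5 * dp[a]
        dS[a][a + 1] = dS[a + 1][a] = -0.25 * dq[a] * x
    SU, US = matmul(S, U), matmul(U, S)
    return max(abs(dS[i][j] - (SU[i][j] - US[i][j]))
               for i in range(N) for j in range(N))

q, p = [0.3, -0.2, 0.5], [0.1, 0.4, -0.3]    # arbitrary initial data
residual = lax_residual(q, p)
before = invariants(q, p)
for _ in range(2000):
    q, p = rk4_step(q, p, 0.001)
after = invariants(q, p)
drift = max(abs(u - v) for u, v in zip(before, after))
print(residual, drift)
```

The residual tests the Lax equation pointwise, while the drift compares $K_1, \ldots, K_4$ before and after the integration; both should be zero up to rounding and integration error. In this representation $K_1 = \mathrm{Tr}\,S$ vanishes identically, so the non-trivial information sits in the higher traces.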

6 Zakharov-Shabat Formalism 

So far we have only studied two integrable models, viz., the continuum KdV and the finite 
dimensional Toda Lattice. In trying to understand the non-linear Schroedinger equation which is 
also integrable, Zakharov and Shabat [23] obtained a description which was later generalized by 
AKNS [24] to describe various other integrable models. This approach uses a Lax operator which 
is first order in the derivative $\partial_x$, in contrast to the second order formalism in eq.(3.19). Besides 
describing various integrable models in a unified manner, this approach has the additional 
advantage that the inverse scattering method generalizes readily to the quantum case. In what follows, 
we describe the first order formulation of the Lax operator, and elucidate the essential features of 
this approach. It is easily checked that if 

$L(t)\psi(t) = -\lambda\psi(t);$ (6.1) 

$\partial_t L(t) = [B(t), L(t)],$ (6.2) 

where 

$\partial_t\psi(t) = B(t)\psi(t),$ (6.3) 

then 

$\frac{d\lambda}{dt} = 0.$ (6.4) 

We can invert the argument to identify the Lax pair in the following way. Namely, if 

$L(t)\psi(t) = -\lambda\psi(t); \quad \frac{d\lambda}{dt} = 0,$ 

i.e., if the compatibility condition of (6.1) and (6.3) yields the system under study, then L(t) and 
B(t) can be identified as the Lax pair of the system. We would like L(t) to be linear in $\partial_x$. Using 
the analogy between the Klein-Gordon and Dirac equations, we define a two-component column 
matrix 

$\phi = (\phi_1, \phi_2)^T$ (6.5) 

and generalize the two equations (6.1) and (6.3) to first order matrix equations as 

$\frac{\partial\phi}{\partial x} = (q\sigma_+ + r\sigma_- - ip\sigma_3)\phi;$ 

$\frac{\partial\phi}{\partial t} = (P\sigma_+ + Q\sigma_- + R\sigma_3)\phi$ (6.6) 




where $\sigma_\pm$ and $\sigma_3$ are the Pauli spin matrices. The dynamical variables q(x,t) and r(x,t) do not 
depend on the spectral parameter p which is assumed to be independent of x and t. The coefficient 
functions P, Q and R on the other hand, do depend on p, and are functionals of q and r. Demanding 
that the partial derivatives of $\phi$ w.r.t. x and t commute, we obtain the compatibility conditions to 
be 

$\frac{\partial R}{\partial x} = qQ - rP;$ (6.7) 

$\frac{\partial r}{\partial t} = \frac{\partial Q}{\partial x} - 2rR - 2ipQ;$ (6.8) 

$\frac{\partial q}{\partial t} = \frac{\partial P}{\partial x} + 2qR + 2ipP$ (6.9) 

i.e., if (6.7-9) describe the non-linear evolution of a system, then (6.6) describes the Lax pair 
appropriate for such a system. Explicitly 

$L = \partial_x - q\sigma_+ - r\sigma_-$ (6.10) 

$B = P\sigma_+ + Q\sigma_- + R\sigma_3$ (6.11) 

so that (6.2) is satisfied. 
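
The compatibility conditions can be verified at a single space-time point with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the x-part of the linear problem is taken as $\partial_x\phi = A\phi$ with $A = q\sigma_+ + r\sigma_- - ip\sigma_3$, the t-part as $\partial_t\phi = B\phi$ with $B = P\sigma_+ + Q\sigma_- + R\sigma_3$, and compatibility is expressed as $\partial_t A - \partial_x B + [A, B] = 0$. All numerical values are arbitrary samples, and the helper names are illustrative.

```python
# sigma_+, sigma_- and sigma_3 as 2x2 nested lists
SP = [[0, 1], [0, 0]]
SM = [[0, 0], [1, 0]]
S3 = [[1, 0], [0, -1]]

def combo(cp, cm, c3):
    """Return cp*sigma_+ + cm*sigma_- + c3*sigma_3."""
    return [[cp * SP[i][j] + cm * SM[i][j] + c3 * S3[i][j]
             for j in range(2)] for i in range(2)]

def commutator(X, Y):
    def mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2))
                 for j in range(2)] for i in range(2)]
    XY, YX = mul(X, Y), mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

# Arbitrary sample values of the fields and coefficient functions
q, r = 0.7 - 0.2j, -0.4 + 1.1j
P, Q, R = 0.3 + 0.5j, -1.2 + 0.1j, 0.9 - 0.6j
P_x, Q_x = 0.25 - 0.75j, 1.5 + 0.5j
p = 0.8                                    # spectral parameter

# Derivatives fixed by the three compatibility conditions
R_x = q * Q - r * P
r_t = Q_x - 2 * r * R - 2j * p * Q
q_t = P_x + 2 * q * R + 2j * p * P

A = combo(q, r, -1j * p)
B = combo(P, Q, R)
A_t = combo(q_t, r_t, 0)                   # p does not depend on t
B_x = combo(P_x, Q_x, R_x)
comm = commutator(A, B)

residual = max(abs(A_t[i][j] - B_x[i][j] + comm[i][j])
               for i in range(2) for j in range(2))
print(residual)
```

The residual vanishes up to rounding, which confirms that the three scalar conditions are exactly the $\sigma_3$, $\sigma_-$ and $\sigma_+$ components of the matrix compatibility equation in this convention.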

The choice of r = 6 yields the KdV equation, and the choice r — q = -ivf\/6, the MKdV 
equation. The choice $q = \sqrt{k}\,\psi^*$ and $r = \sqrt{k}\,\psi$, k being an arbitrary constant parameter, yields the 
non-linear Schroedinger equation: 

$i\partial_t\psi = -\psi_{xx} + 2k|\psi|^2\psi$ (6.12) 


and the choice r = -q= with 

P z=:Q = — sincj 

4p 

yields the sine-Gordon equation. 

The operator $(L + \lambda)$ in (6.1) can be rewritten as $v(x, t, \lambda) + \partial_x$, where 

$v = -q\sigma_+ - r\sigma_- + ip\sigma_3$ (6.13) 

If one knows the solution of the associated Schroedinger equation at some point (y, t), one obtains 
the solution at some other point (x, t) by multiplying the solution by a hermitian matrix 
$T(x, y, t, \lambda)$, i.e., 

$\psi(x, t, \lambda) = T(x, y, t, \lambda)\,\psi(y, t, \lambda)$ (6.14) 

where $T(x, y, t, \lambda)$ is a solution of 

$\partial_x T(x, y, t, \lambda) = -v\,T(x, y, t, \lambda) = (q\sigma_+ + r\sigma_- - ip\sigma_3)\,T(x, y, t, \lambda)$ (6.15) 

with the initial condition $T(x, x, t, \lambda) = I$. 

7 The Zero Curvature Condition 


The Lax condition (6.2) can be written as 

$[\partial_t - B, L] = 0$ (7.1) 

Using 

$L = \partial_x - A(x)$ (7.2) 

we obtain the form 

$[(\partial_t - B), (\partial_x - A)] = 0$ (7.3) 




which is like a zero-curvature condition for 

$F_{01} = [(\partial_0 + A_0), (\partial_1 + A_1)]$ (7.4) 

with the identification 

$A_0 = -B(x, p); \quad A_1 = -A(x, p)$ (7.5) 

The importance of the zero curvature condition stems from the fact that (6.6) may be solved, using 

$\psi(x) = T(x, y, p)\,\psi(y)$ (7.6) 

where the transformation 

$T(x, y, p) = \mathcal{P}\exp\left[-\int_y^x A_1(z)\,dz\right]$ (7.7) 

and $\mathcal{P}$ denotes path ordering. 

It is easy to see that T(x,y,p) translates solutions of the problem along the x-axis for a fixed 
time, i.e., 

$T(x, y, p)\,T(y, z, p) = T(x, z, p);$ (7.8) 

$T^{-1}(x, y, p) = T(y, x, p);$ (7.9) 

$T(x, x, p) = 1$ (7.10) 
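
The three properties of the transition matrix can be illustrated by approximating the path-ordered exponential as an ordered product of short-step exponentials. The sketch below is only an illustration under assumptions: the connection is taken as $-A_1 = A = q\sigma_+ + r\sigma_- - ip\sigma_3$, the profile $q(z) = 1/\cosh z$ with $r = -q$ is a hypothetical sample potential, the step matrices are evaluated at interval midpoints, and the 2x2 matrix exponential is computed by a truncated Taylor series.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=25):
    """Truncated Taylor series for exp(M); adequate for small-norm 2x2 matrices."""
    result = [row[:] for row in I2]
    term = [row[:] for row in I2]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def A(z, p):
    """q*sigma_+ + r*sigma_- - i*p*sigma_3 for a sample potential (assumption)."""
    q = 1.0 / math.cosh(z)
    r = -q
    return [[-1j * p, q], [r, 1j * p]]

def transition(x, y, p, steps):
    """Ordered product approximating P exp[ int_y^x A(z) dz ]; later points act from the left."""
    h = (x - y) / steps
    T = [row[:] for row in I2]
    for k in range(steps):
        mid = y + (k + 0.5) * h
        step = expm([[v * h for v in row] for row in A(mid, p)])
        T = matmul(step, T)
    return T

def distance(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

p = 0.6
x, y, z = 1.0, 0.0, -1.0
composition_error = distance(
    matmul(transition(x, y, p, 200), transition(y, z, p, 200)),
    transition(x, z, p, 400))
inverse_error = distance(
    matmul(transition(x, y, p, 200), transition(y, x, p, 200)), I2)
identity_error = distance(transition(x, x, p, 10), I2)
print(composition_error, inverse_error, identity_error)
```

Because the same midpoints are used on matching grids, the composition and inverse properties hold to rounding accuracy in this discretization, independently of how well the product approximates the continuum limit.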

Setting 

$U_\Gamma(x_2, t_2; x_1, t_1) = \mathcal{P}\exp\left[-\int_{x_1, t_1}^{x_2, t_2} A_\mu\, dx^\mu\right]$ (7.11) 

and taking the product of two such exponents, it is easy to see that 

Ur,{x2,t2-,Xi,ti)Ur2{Xi,ti;X2,t2) = exp[-i 

(7.12) 


using the Baker-Campbell-Hausdorff formula and the Stokes theorem, the integration being done 
over the area enclosed by the closed path $\Gamma_1 + \Gamma_2$. As the curvature vanishes, 

$U_{\Gamma_1}(x_2, t_2; x_1, t_1)\,U_{\Gamma_2}(x_1, t_1; x_2, t_2) = 1$ (7.13) 

and so 

$U_{\Gamma_2}^{-1}(x_2, t_2; x_1, t_1) = U_{\Gamma_2}(x_1, t_1; x_2, t_2)$ (7.14) 

so that 

$U_{\Gamma_1}(x_2, t_2; x_1, t_1) = U_{\Gamma_2}(x_2, t_2; x_1, t_1)$ (7.15) 

ergo, U is independent of the path taken. For a closed path, U(x, t; x, t) = 1. Hence path ordering 
drops out of the transition matrix T(x,y,p). 

Returning to the time evolution of the transition matrix, it can be shown that 

dtT{x,y,p) = [B{x,p),T{x,y,p)] (7.16) 

which is the form of a Lax equation, so that all quantities of the form 

$K_n = \frac{1}{n}\,\mathrm{Tr}[T(p)]^n; \quad K_0 = \ln[\det T(p)]$ (7.17) 

are conserved. We thus have an infinite number of conserved quantities when the zero curvature 
conditions are fulfilled. That this holds also for Toda Field Theories was shown by Olive and Turok 
[16]. 





8 From Conformal Invariance To Toda Field Theory 

That the KdV equation has a hidden conformal symmetry can be seen by making a Fourier 
expansion with Fourier coefficients 


1 r27r j 

u„ = --y^ uix)e-^--^5no (8.1) 

It can be shown that the Poisson brackets of the Un satisfy the Virasoro Algebra (up to trivial 
factors), i.e., 

-2i7r{un,Um} = “(n - Tn)um+n + - l)5n+m (8.2) 

Higher order terms in the KdV hierarchy have a hidden lo symmetry. 

We now digress to take a look at Toda Field Theories. These are essentially the only class of 
integrable, interacting, conformally invariant field theories in two space-time dimensions. To see 
this, we start with the generic action 

s = I - V(<l>i)]d^z (8.3) 

The trace of the naive conserved energy-momentum tensor becomes 

$T_\mu^{\ \mu} = 2V.$ (8.4) 

As the trace of the energy-momentum tensor is required to vanish in a conformally invariant 
theory, it seems that if $V \neq 0$, the theory is not conformal. However there is an ambiguity in the 
definition of the energy-momentum tensor. If we attempt to improve the naive energy-momentum 
tensor without violating the conservation property, we could choose 

S/JV = T^v + [d^du - (8.5) 


whence the trace of the modified energy-momentum tensor is 

$\Theta_\mu^{\ \mu} = 2V + \partial_+\partial_- f$ (8.6) 

± being the light cone directions. If the second term is to cancel the first, we somehow need to 
get rid of the derivatives. This can be done, using the equations of motion. Without knowing the 
explicit equations of motion, the most general expression for $f(\phi_i)$ is $\sum_i c_i\phi_i$. Using the equations 
of motion resulting from varying the action, the tracelessness condition becomes 

$2V + \sum_i c_i\,\frac{\partial V}{\partial\phi_i} = 0$ (8.7) 

Eq.(8.7) is easily solved, with the result that the trace of the energy-momentum tensor vanishes if 
the potential is of the form 

$V = \sum_j d_j \exp\left[\sum_i b_{ij}\phi_i\right],$ (8.8) 

satisfying the requirement 

$\sum_i c_i\, b_{ij} = -2$ (8.9) 
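
A quick numerical check makes the role of the exponential potential concrete. The sketch below assumes the tracelessness condition in the form $2V + \sum_i c_i\,\partial V/\partial\phi_i = 0$ with a potential $V = \sum_j d_j \exp[\sum_i b_{ij}\phi_i]$ and coefficients obeying $\sum_i c_i b_{ij} = -2$; the choice of $b_{ij}$ as the $A_2$ Cartan matrix, the values of $d_j$, the sample field point and the finite-difference gradient are all illustrative assumptions.

```python
import math

b = [[2.0, -1.0], [-1.0, 2.0]]     # b[i][j]; here the A_2 Cartan matrix (assumption)
d = [1.3, 0.7]                     # arbitrary positive coefficients

# Solve sum_i c_i b[i][j] = -2 for j = 0, 1 by Cramer's rule
det = b[0][0] * b[1][1] - b[1][0] * b[0][1]
c = [(-2 * b[1][1] + 2 * b[1][0]) / det,
     (-2 * b[0][0] + 2 * b[0][1]) / det]

def toda_potential(phi):
    return sum(d[j] * math.exp(sum(b[i][j] * phi[i] for i in range(2)))
               for j in range(2))

def quadratic_potential(phi):
    return phi[0] ** 2 + phi[1] ** 2

def trace_defect(V, phi, c, h=1e-6):
    """2V + sum_i c_i dV/dphi_i, with the gradient from central differences."""
    total = 2 * V(phi)
    for i in range(len(phi)):
        up = list(phi)
        down = list(phi)
        up[i] += h
        down[i] -= h
        total += c[i] * (V(up) - V(down)) / (2 * h)
    return total

phi = [0.3, -0.5]                  # arbitrary field values
defect_toda = trace_defect(toda_potential, phi, c)
defect_quadratic = trace_defect(quadratic_potential, phi, c)
print(c, defect_toda, defect_quadratic)
```

For the exponential potential the defect vanishes up to finite-difference error, whereas for the quadratic potential, with the same $c_i$, it does not; this is the sense in which the improvement term singles out potentials of Toda type.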

We choose $b_{ij}$ to be related to the Cartan matrix of a simple Lie Algebra. The resulting field 
theories are called Toda Field Theories, and are described by the action 

Sxoda = J ^ ^ 


( 8 . 10 ) 





where <, > is the scalar product in the root space, and (f> takes its values in the root space of the 
simple Lie Algebra on hand. 

The equations of motion obtained from (8.10) are 

0d>^d>^4>i + mhxp Kij4>j) = 0 (8.11) 

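Specializing for the SU(n) group, and setting $m = \beta = 1$, this becomes 

$\partial_+\partial_-\phi_i = -\exp(K_{ij}\phi_j)$ (8.12) 

With $\phi_0 = 0$ and $\phi_n = 0$, this reduces to 

$\partial_+\partial_-\phi_i = -\exp(2\phi_i - \phi_{i-1} - \phi_{i+1})$ (8.13) 

Setting 

$v_i = 2\phi_i - \phi_{i-1} - \phi_{i+1}$ 

after Mikhailov [27], we get the equation 

$\partial_t^2 v_i - \partial_x^2 v_i = -[2e^{v_i} - e^{v_{i+1}} - e^{v_{i-1}}]$ (8.14) 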

which is easily seen to be related to the Toda equations (5.20). One expects that the Toda Field 
Theories are integrable, and it turns out that they are indeed so (see ref. [8]). The calculation rests 
upon the existence of a zero curvature condition for certain group theoretical combinations of (j>, 
which can be chosen as gauge fields. 

As mentioned earlier, the Toda Field Theories have been completely solved for simple g by 
Leznov and Saveliev [18]. They have also been solved for affine g by Olive and Turok [16]. 

Quantization of the Toda Field Theories is more problematic since the potential has no local 
minimum, the latter being attained at infinity, using the gauge group Ai. A lucid discussion of 
the problems encountered in the theory is given in ref. [29]. 

The central charge of the Toda theories can be constructed using free field technology, and is 
found to be [30] 

r being the rank of the algebra, p being half the sum of the positive roots, and its dual. Eq.(8.15) 
gives an indication that a quantum Toda theory with a strong coupling constant is equivalent to 
another Toda theory with a weak coupling constant, obtained by replacing ^ by Air/hp, and 
interchanging roots and ” coroots” . 

Incidentally, strong/weak coupling duality has recently become a subject of immense study in 
relation to string theories. 

It is possible to obtain the minimal models from the Toda Field Theories. For a particular value 
of $\beta$, the central charges can be made to agree. However this is not enough. A complication arises 
from the fact that not all primary fields in the minimal models are actually present in the Toda 
theory. However, because of the duality in the theory, we can add another part of the potential 
with the coupling constant replaced by its dual; see Mansfield [31]. This modification is sufficient 
to give complete agreement. 

9 W-Algebras: Hamiltonian Reduction of WZNW 

Another fact which makes the conformally invariant Toda theories interesting is that to each 
such Toda theory, there corresponds a W-algebra. The W-algebras are an extension of the Virasoro 
algebra by adding primary fields of spin higher than 2, and were introduced 
by Zamolodchikov [32] as a pointer to conformal field theories with a larger overall symmetry. 
Zamolodchikov [32] investigated the case in which a primary field w(z) of weight 3 is added to the 
Virasoro algebra. In order for the algebra to close, it had to be made ‘non-linear’, and hence 
lost its linear Lie Algebra character. 

Balog et al [33-35] showed that the Liouville and Toda Field Theories can be obtained as 
conformally reduced WZNW theories. This reduction can be viewed as a gauge procedure, and 
the Toda field theory can be obtained as the gauge invariant content of a gauged WZNW theory. 
The Liouville theory is obtained for the special case of the SL{2, R) gauge group. 

The most powerful method of constructing W-algebras is through the so-called quantum 
Drinfeld-Sokolov reduction. In this, one starts with an affine Lie Algebra, and reduces it by 
imposing some constraint on its generators. At the classical level, this procedure which leads to 
the so-called Gelfand-Dickey algebras [36], was pioneered by Drinfeld and Sokolov [37]. 

It is thus clear that under the reduction that takes a WZNW field theory to a Toda field 
theory, the affine Lie Algebra that characterizes the WZNW theory reduces to a W-algebra that is 
associated to a Toda field theory. This approach is also readily generalizable to the supersymmetric 
case where various new W-superalgebras have been found as symmetry algebras of supersymmetric 
Toda field theories. We refer the interested reader to ref. [38] for further progress in this area. 

In what follows, we review the essential steps of the Lagrangian reduction of the WZNW 
model. The WZNW action for a non-compact group G in 2D Minkowski space-time is 

S(9) = d?iyn>^''Tr{g-^d^g){g-^d.g) + ^ Tr{g-^dgf (9.1) 

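where B is the volume occupied by 5^. The left and right Affine Kac-Moody [AKM] symmetries 
of this theory are generated by the Noether currents 

$J(\lambda) = \kappa\,\mathrm{Tr}[\lambda(\partial_+ g)g^{-1}]; \quad \tilde{J}(\lambda) = -\kappa\,\mathrm{Tr}[\lambda g^{-1}(\partial_- g)]$ (9.2) 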

where « = and A is an element of the Lie Algebra g. The WZNW equations of motion are 
known to be equivalent to the current conservation 

$\partial_- J = \partial_+\tilde{J} = 0.$ (9.3) 

We now choose the following Gauss decomposition of an arbitrary element g=^ABC^ e.g., 

A = exp[ ^ (9.4) 

a€A+ 

B = exp[|(2] r-fTa)]; 

a€A 

C = exp[ ^ y“£^a]; 

aGA- 

where Cartan-Weyl root vectors Ea, Cartan subalgebra generators Ha - [Ba,S-ah and a set of 
positive (negative) roots have been introduced with the following properties 

Ka/s = aiHp) = aJeA; | ajonj 1^ = 2; . (9.5) 

I « 1 

Tr{HaHfi) = ■ — — ^010] 

|a| 

TriEaEg) = Tr[E^,H0] = 0. (9-7) 

i “i 

We also introduce the Polyakov- Wiegmann identity 

S{ABC) = S{A) + SiB) + S(C)+Kj(fpTr[{A~^d-A)d+B)B~^ (9.8) 

+{B-^d-B)id+C)C-'^ + {A-^d-A)(B(d+C)C-^B-^)] 





We now see, using eqs.(9.4-9.8), that the generalized constraints 

J{Ec) = ACC?; J{E-cc = -ACC?; a 6 A+ (9.9) 

with some real numbers c? j whose values do not vanish only for primitive roots a G A, are enough 
to reduce the G- based WZNW theory to the Toda Field Theory defined by the Lagrangian 

Lroda = (9-10) 

qcGA 


where 

(u2)“ = I a |'c?c? 

. Due to Cl 2 ^ 0 for tho primitive roots, the constraint (9.7) can be re-written in terms of the 
Gauss decomposition (9.5-7) as follows: 


A-^^d^A 


B['£hafciEa]B-^ 

o^£A ^ 


(9.11) 


(d+C)C-^ = 


In the WZNW equations of motion, A and C occur only in the combinations given in (9.11-12), 
so that they can be eliminated in favour of B or (j>^. The remaining equation is just the Toda 
equation [25,34,35]: 

d+d.4>'^ + i| a |"(n“)2 = 0; (9.13) 

(see also ref. [25] for details). 

As mentioned earlier, the Toda Field Theory possesses an extended symmetry represented by 
a classical W-algebra. These W-algebras can be obtained as the quantum versions of the so-called 
Gelfand-Dickey algebras [36] known in the theory of KdV equations. For instance, the Poisson 
bracket associated with the KdV equation in (8.2), results in the classical version of the Virasoro 
algebra which is the simplest W-algebra. Moreover, the Lax representation of the KdV equation 
(3.15), defines the third order differential operator B = The Fourier components of B, along 
with those of the KdV field, form the Gelfand-Dickey [36] algebra that generalizes to in the 
quantum case. 

Now regarding the Toda theory as a constrained WZNW theory, the Hamiltonian structure 
can be obtained by a classical Drinfeld-Sokolov reduction from the constrained phase space of the 
AKM algebra. In the Hamiltonian formalism, the AKM symmetry of the WZNW theory is 
represented by first class constraints. The W-algebra of the Toda theory arises as the Poisson bracket 
algebra of gauge-invariant polynomials of the constrained AKM currents and their derivatives. In 
what follows, we summarize the arguments supporting these statements. 

Let g{z,z) be the G-valued WZNW fields and J{z) the corresponding AKM currents having 
the form 

g{z, z) = 9 {z)g(z); dg{z) = J{z)g{z) (9.14) 

Let dimg be the dimension of G\ I its rank; k the level of the associated AKM algebra g] g the 
dual Coxeter number of G] p the half sum of the positive roots; and /? the dual of p. 


(9.12) 

aeA 





The constrained WZNW theory is specified by (9.9). After a suitable choice of constants Cf, 
the currents J (z) can be decomposed as 


i 

J{z) = /-+y(z); (9.15) 

i=l 

I 

3{z) = 

i=l ^€A+ 

where {Eq. } are I simple roots of g. The maximal subgroup of G leaving this form of currents 
invariant, is the maximal nil-potent subgroup generated by E(j>, {(j> € A"^), and implemented by 
the {dimg ~ l)J 2 constrained AKM currents J^(z). This allows us to interpret the constrained 
WZNW theory as the gauge theory in which all but I of the [dimg + 1)1 2 components of J are 
gauge components [33-35]. 

The current j{z) and the gauge transformations corresponding to E^p act on each column of 
the WZNW field g{z) separately, while each column contains only one gauge-invariant component 
e (of the highest weight), satisfying E^e = 0. The gauge degrees of freedom corresponding to 
the other elements of each column can be eliminated by a gauge fixing in favour of e. Because 
of (9.15), this leads to a linear pseudo-differential equation De = 0, where D is a polynomial 
pseudo-differential operator whose coefficients are gauge invariant polynomials in the currents J. 
This operator D can now be used to define a classical W-algebra by choosing a Drinfeld-Sokolov 
gauge in which one has 

3ds = ^ w^{z)Fp (9.16) 

where P's are the orders of l independent Casimir operators of g, and $F_P$ generators with H 
weights (P - 1), so that the gauge-fixed current (9.16) has only one non-vanishing component in 
each of the l irreducible representations in a decomposition of the adjoint of g w.r.t. one of its 
sub-groups SL(2, R). The Poisson brackets between the different polynomials define a classical 
W-algebra. 

We close this Section by noting that Toda field theories also play an important role in the 
discussion of W-gravity, where they arise as effective quantum theories [39,40] for the W-gravity 
degrees of freedom in the conformal gauge. For a quantum version of the WZNW → Toda 
conformal reduction, see [34, 41]. 


10 Self-Dual Y-M Theories: 2D Integrable Models 

The self-dual Yang-Mills (SDYM) theory appears to be a master theory for a whole variety of 2D 
integrable systems, as we are now going to explain. Though there is no general proof, the statement 
can be checked on a case by case basis. The main point is that the 4D self-duality condition admits 
of a zero curvature representation underlying a Hamiltonian description of SDYM descendents in 
lower dimensions. This makes it possible to apply the inverse scattering method for integration 
of the SDYM equations. Simultaneously, it explains the origin of gauge symmetries in integrable 
systems of the KdV type, since the SDYM theory is both gauge and conformally invariant in 4D. 
And last but not least, this connection provides us with a systematic way to associate the KdV 
type hierarchy with any simple Lie Algebra. 

SDYM solutions invariant under the action of a subgroup with two conformal generators satisfy a 
2D differential equation, since each 1D subgroup reduces the number of independent variables by 
one. This allows us to describe the invariant SDYM solutions in terms of a 2D integrable system. 
All known 2D integrable systems seem to be derivable this way, by appropriate truncations of a 
4D self-dual gauge theory. This is true, in particular, for the KdV and non-linear Schroedinger 
equations, the Liouville and Toda equations, as well as other integrable systems in 2 and 3 dimensions. 
Our presentation in this Section is only illustrative; we give one explicit example of embedding of 
the KdV equation into the 4D SDYM theory [42], and a supersymmetric generalization. 





Let $x^a = (x, y, z, t)$ be the coordinates of a flat 4D space-time of signature (+, +, -, -). The 
invariant metric reads 

$ds^2 = 2\,dx\,dz - 2\,dy\,dt$ (10.1) 

The SDYM equations in 2+2 dimensions ($\epsilon_{xyzt} = 1$) read as 

$F_{ab} = \frac{1}{2}\,\epsilon_{abcd}F^{cd}$ (10.2) 

and are equivalently represented by 3 equations of the form 

$F_{tx} = F_{yz} = F_{ty} + F_{xz} = 0$ (10.3) 

After a dimensional reduction which is equivalent to setting 

$\partial_y = \partial_z - \partial_x = 0,$ (10.4) 

(10.3) takes the form 

$[\partial_t - H, \partial_x - Q] = [P, B] = 0; \quad [H, B] = [\partial_x - Q, \partial_x - P]$ (10.5) 

where 

$A_t = H; \quad A_x = Q; \quad A_y = -B; \quad A_z = P$ 

It is clear that the first equation in (10.5) is a zero curvature equation. We now choose the 
non-compact group and an embedding pattern in the form ' 



( 10 . 6 ) 


«=(-\ -\) (“■') 

where $\lambda$ is a constant and $u = u(t, x + z)$. We can expand the Lie Algebra-valued fields H and P 
as 

$H = H_-\tau_+ + H_+\tau_- + H_3\tau_3; \quad P = P_-\tau_+ + P_+\tau_- + P_3\tau_3$ (10.8) 

where $\tau_\pm = (\tau_1 \pm i\tau_2)/2$, and $\tau_{1,2,3}$ are the Pauli spin matrices. It is clear that the second equation 
of (10.5) gives 

$P_- = P_3 = 0,$ (10.9) 

while the third equation of (10.5) gives 

H- = -P+; Hs = -^da(u + P+) - XP+ (10.10) 

where primes denote derivatives w.r.t. x. 

Finally, the first equation of (10.5) yields 3 equations 

H+ = uP+ - \dxP+ -^dxdx{u + P+y, 9j:(ai + 2P+) = 0; ( 10 . 11 ) 

u = ^dxdxdx{u + P+) + (u-P+)dxU + 2X^P+ 

It follows that 

r, 1 rr 1 2 ^ 1 , x 

P 4 2 ^’ 2 ^ 4 ^®®’ ( 10 . 12 ) 

1 3 ,, 

U — ^ *^xxx d" 2 A tlx 





Changing the notation as 

2 

u u -h -A^; t 4t; x -hy x, 
o 

one obtains the KdV equation 

Uf = '^xxx "h bllUx- 

This example may be relevant towards an ultimate unification of 2D integrable models and 2D 
conformal field theories, as well as within the 4D SDYM theories which are also closely related to 
N = 2 strings. 


10.1 Self-Duality and Supersymmetry 

Extended Supersymmetry is compatible with self-duality in 2+2 dimensions. Therefore the 
Supersymmetric self-dual Yang-Mills theory (SSDYM) is capable of generating Supersymmetric 2D 
integrable models. However a Supersymmetric generalization of the SDYM theory is not unique. 
One could either replace a gauge group by its graded version, or a 2+2 dimensional space-time by 
superspace. 

Supersymmetric generalizations of the KdV equation in 1+1 dimensions were obtained 
independently by Manin and Radul [43], Mathieu [44], Bilal and Gervais [45].* These equations have 
two dynamical variables, one bosonic u(x,t), and one fermionic $\psi(x,t)$, and read 

1 3 

dtU = -Uxxx + Zudxu + (10.13) 

1 3 

dtXjj = 2^2:a;a; 

They are invariant under the N = 1 Supersymmetry transformations 

$\delta u = \epsilon\,\partial_x\psi; \quad \delta\psi = \epsilon u$ (10.14) 

$\epsilon$ being a constant Grassmann parameter. Eqs.(10.13) are integrable, and can be obtained from 
the zero curvature condition associated with the graded Lie Algebra osp(2, 1) 

$\partial_t A_x - \partial_x A_t + [A_t, A_x] = 0$ (10.15) 


when the following ansatz is used for 2D Yang-Mills potentials [45]: 

( Ux Uxx + \ 

— 2u -- Ux ii^x 1 (10.16) 

i'^x d” 2iwip 0 J 

( 0 u —iip \ 

-10 0 (10.17) 

0 0 / 

The 2D Super KdV can be embedded into the self-duality equations by choosing the osp(2/l)- 
valued matrices H,Q as H = At{x,t), Q ^ Ax{x,t), and B,P as 3 x 3 matrices as 


'0 I 0^ 
B= \ 0 0 0 
0 0 0 . 


(10.18) 


/O f 

P = 0 0 0 (10.19) 

\0 0 J 

It can also be shown that the N = 1 and N = 2 Super KdV equations, as well as the N = 1 
Super-Liouville and Super-Toda equations, can all be obtained from the N = 2 SSDYM theory by 
dimensional reductions and truncations [46]. A detailed analysis is however outside the scope of 
this Article. 





11 Conclusions 

It has been our aim to present a bird’s eye view of the important developments in Integrable 
Systems over the past few decades. What has been achieved is possibly a more subjective viewpoint, 
related to building connections between sundry topics of immediate interest. It has certainly not 
been possible to delve more deeply into the fascinating developments in affine Toda Field Theory 
which seems to be a thrust area of research today. We refer to the excellent lecture series by 
Corrigan [48] on this subject. Neither is it possible to present an account of the interesting link 
between the KdV theory and Matrix models, encompassing thus 2D gravity; (see ref. [25] for a 
readable account). Supersymmetric Toda Field Theories have also been given the go-by. They were 
first studied by Evans and Hollowood [49], as well as by Leites et al [50]. It seems to be possible 
to construct Toda Field Theories based on Lie Superalgebras with one proviso, namely, that it is 
necessary that the Lie Superalgebra admits a purely fermionic root system. This is only possible 
for the following algebras: 

A(n, n-1); B(n, n); B(n-1, n); D(n, n-1); D(n, n); and D(2, 1; α). 

In the generic case, N = 1 Supersymmetric theories are obtained, which can be formulated in 
N = 1 superspace. There is one special case, namely, the sl(n, n-1) theories have in fact N = 2 
Supersymmetry; see [49]. Recently, Brink and Vasiliev [51] have proposed a model generalizing 
Am Toda Field Theories based on a continuous parameter, such that when this parameter takes 
on certain discrete values, the model reduces to the ordinary Am Toda Theories. More recently, 
Wyllard [52] has worked out a WZNW reduction of these generalized theories, and has also 
attempted a Supersymmetric generalization [53] of the same. 

One could also picture the affine Toda theories as integrable deformations of the conformal 
Toda theories. As an example, by adding an extra simple root to the $A_1$ Toda theory, one obtains 
the affine Toda theory, which is also the sinh-Gordon theory. General integrable deformations 
have been investigated by Zamolodchikov, among others, as an interesting field. Toda theories 
also appear in many other diverse areas of theoretical physics, e.g., 1D discrete versions appear in 
the physics of monopoles [54]. Further, certain 3D continuous Toda systems are relevant to the 
classification of hyper-Kahler metrics in 4D [55]. Finally, it also appears that Toda Field Theories 
are relevant to M-Theory, the conjectured all-in-all Theory encompassing all String Theories [56]. 

I am deeply grateful to Prof. S.K. Malik, for providing me an opportunity to write this topical 
Article on the subject of Toda Field Theories. The literature on this subject is quite vast, but 
apart from my own interest in Lie Algebras and super-algebras (which often draws me into this area 
because of their obvious relevance to this subject), I have benefitted greatly from some concentrated 
literature [22,25,56,57]. I am especially grateful to Prof. Ashok Das for his encouragement, and 
for sparing time to go through the manuscript. I would also like to thank Jens Fjelstad for help 
with LaTeX. 


References 

[1] J.Liouville, J.Math Pures et Appliquées 18, 71 (1853). 

[2] A.M.Polyakov, Phys.Lett.B103, 207 (1981). 

[3] J.L.Gervais and A.Neveu, Nucl.Phys.B199, 59 (1982). 

[4] E.D’Hoker and R.Jackiw, Phys.Rev.D26, 3517 (1982). 

[5] T.L.Curtright and C.B.Thorn, Phys.Rev.Lett.48, 1309 (1982). 

[6] D.J.Korteweg and G.de Vries, Phil.Mag.39, 422 (1895). 

[7] C.S.Gardner, J.M.Greene, J.M.KruskaJ, R.M.Miura, Phys.Rev.Lett.l9, 1095 (1967). 

[8] R.M.Miura, J.Math.Phys.9, 1202 (1968). 



Integrable Models and Toda Hierarchy 




[9] R.M.Miura, C.S. Gardner, J.M.Kruskal, J.Math.Phys.9, 1204 (1968). 

[10] C.H.Su, C.S.Gardner, J.Math.Phys.10, 536 (1969).

[11] C.S.Gardner, J.Math.Phys.12, 1548 (1971).

[12] J.M.Kruskal et al, J.Math.Phys.11, 952 (1970).

[13] C.S.Gardner et al, Comm Pure And Appl Maths 27, 97 (1974). 

[14] M.Toda, J.Phys.Soc.Of Japan 22, 431 (1967). 

[15] R.Hirota, K.Suzuki, J.Phys.Soc.Of Japan 28, 1336 (1970). 

[16] D.Olive, N.Turok, Nucl.Phys.B257, 277 (1985).

[17] T.Hollowood, Nucl.Phys.B384, 523 (1992). 

[18] A.N.Leznov, M.V.Saveliev, Lett.Math.Phys.3, 489 (1979). 

[19] T.S.Hollowood, P.Mansfield, Phys.Lett.B226, 73 (1989). 

[20] I.M.Gelfand, B.M.Levitan, Trans.Amer.Math.Soc.1, 253 (1951).

[21] P.D.Lax, Comm Pure And Appl Maths 21, 467 (1968). 

[22] Ashok Das, Integrable Models, World Scientific (1989). 

[23] V.E.Zakharov, A.B.Shabat, Soviet Phys. JETP 34, 62 (1972). 

[24] M.J.Ablowitz et al, Phys.Rev.Lett.30, 1262 (1973); 31, 125 (1973). 

[25] S.Ketov, Conformal Field Theory, World Scientific (1995). 

[26] A.Lenard (Unpublished); reported in Ref.[l]. 

[27] A.V.Mikhailov, Soviet Phys. JETP Lett.30, 414 (1979). 

[28] A.N.Leznov, M.Saveliev, Comm Math Phys.74, 111 (1980). 

[29] Y.Karama, H.Nicolai, Int J of Mod Phys.A9, 667 (1994). 

[30] P.Mansfield, Nucl.Phys.B222, 419 (1983). 

[31] P.Mansfield, Phys.Lett.B242, 387 (1990). 

[32] A.B.Zamolodchikov, Theor. Math. Phys. 65, 1205 (1986). 

[33] J.Balog et al, Phys.Lett.B227, 214 (1989). 

[34] L. Feher et al, Phys.Rep.222C, 1 (1992). 

[35] P.Forgacs et al, Phys.Lett.B237, 214 (1989). 

[36] I.M.Gelfand and L.A.Dikii, in Gelfand, Collected Papers, Ed. by Gindikin et al, Springer
Verlag, N.Y. (1987); p625.

[37] V.Drinfeld, V.Sokolov, J.Sov.Math.30, 1975 (1984).

[38] P.Bouwknegt and K.Schoutens, Phys.Rep.223C, 183 (1993). 

[39] E.Bergshoeff, A.Bilal, K.S.Stelle, Int.J.Mod.Phys.A6, 4951 (1991).

[40] K.Schoutens, A.Sevrin, P.van Nieuwenhuizen, Nucl.Phys.B349, 791 (1991).





Bani M Sodermark 


[41] L.O'Raifeartaigh, P.Ruelle, I.Tsutsui, Phys.Lett.B258, 359 (1991).

[42] L.J.Mason, G.A.J.Sparling, Phys.Lett.A137, 29 (1989).

[43] Yu.I.Manin and A.O.Radul, Comm.Math.Phys.98, 67 (1985).

[44] P.Mathieu, Phys.Lett.B203, 287 (1988). 

[45] A.Bilal, J.L.Gervais, Phys.Lett.B211, 85 (1988). 

[46] A.Das, C.A.P.Galvao, Mod.Phys.Lett.A8, 1399 (1993).

[47] S.J.Gates, H.Nishino, Phys.Lett.B299, 255 (1993). 

[48] E.Corrigan LANL hep-th/9412213, December 1994. 

[49] J.Evans, T.Hollowood, Nucl.Phys.B352, 723 (1991). 

[50] D.A.Leites, M.V.Saveliev and V.V.Serganova, in
Group Theoretical Methods In Physics, Vol.1,
eds M.A.Markov et al, VNU Science Press, 1986.

[51] L.Brink, M.Vasiliev, Nucl.Phys.B459, 273 (1995). 

[52] N.Wyllard, Mod.Phys.Lett.A12, 95 (1997). 

[53] N.Wyllard, Nucl.Phys.B491, 461 (1997).

[54] H.Lu, C.N.Pope, K.W.Xu, LANL hep-th/9604058, April 1996. 

[55] I.Bakas, K.Sfetsos, LANL hep-th/9604003, April 1996.

[56] N.Wyllard, Ph.D Thesis (May 1998), Chalmers U; Göteborg (1998).

[57] E.G.B.Hohler, Ph.D thesis, Trondheim Univ.Press (1995) 



17. Perspectives Of Light-Front Quantum Field
Theory: Some New Results*†

Prem P. Srivastava‡

Stanford Linear Accelerator Center, Stanford University, Stanford,

CA 94309, USA


Abstract 

A review of some basic topics in the light-front (LF) quantization of relativistic field the- 
ory is made. It is argued that the LF quantization is equally appropriate as the conventional 
one and that they lead, assuming the microcausality principle, to the same physical content. 
This is confirmed in the studies on the LF of the spontaneous symmetry breaking (SSB), of 
the degenerate vacua in Schwinger model (SM) and Chiral SM (CSM), of the chiral boson 
theory, and of the QCD in covariant gauges among others. The discussion on the LF is more 
economical and more transparent than that found in the conventional equal-time quantized 
theory. The removal of the constraints on the LF phase space by following the Dirac method, 
in fact, results in a substantially reduced number of independent dynamical variables. Conse- 
quently, the descriptions of the physical Hilbert space and the vacuum structure, for example, 
become more tractable. In the context of the Dyson-Wick perturbation theory the relevant 
propagators in the front form theory are causal. The Wick rotation can then be performed to
employ the Euclidean space integrals in momentum space. The lack of manifest covariance 
becomes tractable, and still more so if we employ, as discussed in the text, the Fourier trans- 
form of the fermionic field based on a special construction of the LF spinor. The fact that 
the hyperplanes $x^\pm = 0$ constitute characteristic surfaces of the hyperbolic partial differential
equation is found irrelevant in the quantized theory; it seems sufficient to quantize the theory
on one of the characteristic hyperplanes.


*Research partially supported by the Department of Energy under contract DE-AC03-76SF00515.
†SLAC report: SLAC-PUB-8219, August 1999.

‡E-mail: (1) prem@lafexsul.lafex.cbpf.br; (2) prem@slac.stanford.edu. On leave of absence from Instituto de
Física, UERJ - Universidade do Estado de Rio de Janeiro, RJ, Brasil.





Contents

1 Introduction

1.1 Light-Front Quantized Theory

1.2 LF Poincare and IMF Generators. LF Spin Operator

2 LF quantized scalar theory

2.1 Covariant Phase Space Factor on the LF

2.2 LF Commutator

2.3 Length Dimensions $L_\perp$ and $L_\parallel$

2.4 LF Hamiltonian. Dirac Procedure

2.5 Scalar Field Propagator in momentum space

2.6 First class constraint. Symmetry in $x^+$ and $x^-$

3 SSB Mechanism, Topological Kink Solution, and Chiral Boson theory on LF

3.1 SSB in two dimensional scalar theory

3.2 Spontaneously broken continuous symmetry

3.3 Kink solution and Topological quantum number

3.4 Chiral Boson theory on the LF

4 LF quantized Dirac field

4.1 Anticommutators

4.2 LF Spinor in momentum space and its properties

4.3 Fermion propagator

4.4 $\gamma_5$ Symmetry. Chirality transformation on the LF

4.5 Helicity Operator, LF Majorana and Weyl fermions

4.6 Bilocal operators

5 LF quantization of Gauge theory

6 Vacuum Structures in Schwinger and Chiral Schwinger Models

7 QCD in Covariant gauges

8 Conclusions





1 Introduction 


Half a century ago, Dirac [1] discussed the unification, in a relativistic theory, of the principles
of the quantization and the special relativity theory which were by then firmly established. The
Light-Front (LF) quantization, which studies the relativistic quantum dynamics of a physical system
on the hyperplanes $x^0 + x^3 \equiv \sqrt{2}\,x^+ = \mathrm{const.}$, called the front form theory, was also proposed
and some of its advantages pointed out. The instant form or the conventional equal-time theory
on the contrary uses the $x^0 = \mathrm{const.}$ hyperplanes. The LF coordinates $x^\mu$: $(x^+, x^-, x^\perp)$, where
$x^\pm = (x^0 \pm x^3)/\sqrt{2} = x_\mp$ and $x^\perp = (x^1, x^2) = (-x_1, -x_2)$, are convenient to use in the front form
theory. They are not related by a finite Lorentz transformation to the coordinates $(x^0 \equiv t, x^1, x^2, x^3)$
usually employed in the instant form theory and as such the descriptions of the same physical
content in a dynamical theory on the LF, which studies the evolution of the system in $x^+$ in
place of $x^0$, may come out to be different from that given in the conventional treatment. This
was found to be the case, for example, in the description of the spontaneous symmetry breaking
(SSB) mechanism (Sec. 3) some time ago and in the recent study (Sec. 6) of some soluble two-
dimensional gauge theory models, where it was also demonstrated that LF quantization is very
economical in displaying the relevant degrees of freedom, leading directly to a physical Hilbert 
space. The LF quantized field theory may perhaps also be of some relevance in the understanding 
of the unification of the principles of the quantization with that of the general covariance [2]. 

We recall that the field theory at infinite momentum was employed in the context of the current 
algebra sum rules [3]. The Feynman rules adapted for infinite momentum frame (IMF), which 
were used by Weinberg [4], showed substantial simplifications in the context of the old fashioned 
perturbation theory computations. In the deep inelastic region with the IMF limit Bjorken [5] 
predicted the scaling of the deep inelastic structure functions. The parton model [6] of Feynman 
was also formulated in the IMF. At the same time the connection between the use of the LF 
variables and the IMF limit was being noticed by several authors [7], which prompted gradually 
the interest in the study of the front form theory as proposed by Dirac. 

More recently, the interest in LF quantization has been revived [8, 9, 10] due to the difficulties 
encountered in the computation, in the conventional framework, of the nonperturbative effects 
in the context of QCD and the problem of the relativistic bound states of fermions [8, 9] in the 
presence of the complicated vacuum. Studies show [9, 8, 11] that the application of Light-front 
Tamm-Dancoff method may be feasible here. The technique of the regularization on the lattice 
has been quite successful for some problems but it cannot handle, for example, the bound states of
light (chiral) fermions and has not yet been able to demonstrate, for example, the confinement of
quarks. The problem of reconciling the standard constituent quark model and the QCD to describe
the hadrons is also not satisfactorily resolved. In the former we employ a few valence quarks while
in the latter the QCD vacuum state itself contains, in the conventional theory, an infinite sea of
constituent quarks and gluons (partons) with the density of low momentum constituents getting
very large in view of the infrared slavery. The front form dynamics may serve as a complementary
tool to study such problems, since we may possibly arrange to have a simple vacuum in it while
transferring the complexity of the problem to the LF Hamiltonian. In the case of the scalar field
theory, for example, the corresponding LF Hamiltonian is, in fact, found [12] to be nonlocal due
to the presence of constraints on the LF phase space.

The LF quantization of QCD in its Hamiltonian form provides an alternative approach to 
lattice gauge theory for the computation of nonperturbative quantities, such as [8, 9] the spectrum 
and the LF Fock state wavefunctions of relativistic bound states. LF variables have found natural
applications in several contexts, for example, in the quantization of (super-) string theory and M-
theory [13]. They have also been employed in the nonabelian bosonization [14] of the field theory of
N free Majorana fermions. The vacuum structures [15, 16] in the LF quantized Schwinger model
(SM) and the Chiral SM (CSM) have been recently studied in a transparent fashion. The LF
quantized QCD in covariant gauges has also been studied [17] in the context of the Dyson-Wick 
perturbation theory, where it is shown that the lack of manifest covariance in the calculations
becomes more tractable thanks to a useful form of the LF spinor introduced (Sec. 4). The relevant 
propagators are shown to be causal and the Wick rotation can be performed [18] to go over to the 
Euclidean space integrals allowing for the dimensional regularization to be used. The front form 
theory has also found applications in the nonperturbative sector of QCD in the context of the 
Bethe-Salpeter dynamics. The Covariant Instantaneity Ansatz (CIA) [19] introduced earlier, which
invokes the Markov-Yukawa Transversality Principle, has been extended now to the covariant null
plane (CNPA) [20, 21]. 

1.1 Light-Front Quantized Theory 

We will make the convention to regard $x^+ \equiv \tau$ as the LF-time coordinate and $x^-$ as the longitudinal
spatial coordinate. We note that $[x^+, i\partial^-] = [x^-, i\partial^+] = -i$, where $\partial^\pm = \partial_\mp = (\partial^0 \pm \partial^3)/\sqrt{2}$
etc., so that the coordinates $x^+$ and $x^-$ appear in a symmetric fashion. In terms of the null vector
$n_\mu = (1,0,0,1)/\sqrt{2}$ and its dual $\tilde n_\mu = (1,0,0,-1)/\sqrt{2}$, with $n\cdot n = \tilde n\cdot\tilde n = 0$, $n\cdot\tilde n = 1$, they may
be written also as $x^+ = n\cdot x$ and $x^- = \tilde n\cdot x$ (see also Sec. 5). The temporal evolution in $x^0$ or $x^+$
of the system is generated by the Hamiltonians, which are different in the two forms of the theory.

Consider [16, 10] the invariant distance between two spacetime points: $(x-y)^2 = (x^0-y^0)^2 - (\vec x-\vec y)^2
= 2(x^+-y^+)(x^--y^-) - (x^\perp-y^\perp)^2$. On an equal $x^0 = y^0 = \mathrm{const.}$ hyperplane the points
have spacelike separation except when they are coincident, in which case it becomes lightlike. On the
LF with $x^+ = y^+ = \mathrm{const.}$ the distance becomes independent of $(x^--y^-)$ and the separation is
again spacelike; it becomes lightlike when $x^\perp = y^\perp$, but with the difference that now the points
need not necessarily be coincident along the longitudinal direction. The LF field theory hence need
not necessarily be local in $x^-$, even if the corresponding instant form theory is formulated as a
local one. For example, the commutator $[A(x^+,x^-,x^\perp), B(0,0,0^\perp)]_{x^+=0}$ of two scalar observables
would vanish on the grounds of the microcausality principle in relativistic field theory for $x^\perp \neq 0$,
when $x^2|_{x^+=0}$ is spacelike. Its value would hence be proportional to $\delta^2(x^\perp)$ and a finite number
of its derivatives, implying locality only in $x^\perp$ but not necessarily so in $x^-$. Similar arguments
in the instant form theory lead to the locality in all the three spatial coordinates. In view of the
microcausality both of the commutators $[A(x),B(0)]_{x^+=0}$ and $[A(x),B(0)]_{x^0=0}$ are nonvanishing
only on the light-cone, $x^2 = 0$. The possibility of nonlocality in the longitudinal direction in the
front form treatment seems to allow us to display in some cases the structures parallel to those
found in string theory (Sec. 4.6).
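
The form of the invariant distance in LF coordinates can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative assumptions. It confirms that the instant form and LF expressions agree, and that two points on the same $x^+$ hyperplane with equal transverse positions are lightlike separated even when they differ in $x^-$.

```python
import math

def to_light_front(x0, x1, x2, x3):
    """Return (x_plus, x_minus, x1, x2) with x^(+/-) = (x^0 +/- x^3)/sqrt(2)."""
    s = math.sqrt(2.0)
    return ((x0 + x3) / s, (x0 - x3) / s, x1, x2)

def interval_instant(x, y):
    """(x-y)^2 in the usual coordinates, metric signature (+,-,-,-)."""
    d = [a - b for a, b in zip(x, y)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2

def interval_light_front(x, y):
    """(x-y)^2 = 2 (x^+ - y^+)(x^- - y^-) - (x_perp - y_perp)^2."""
    xp, xm, x1, x2 = to_light_front(*x)
    yp, ym, y1, y2 = to_light_front(*y)
    return 2.0 * (xp - yp) * (xm - ym) - (x1 - y1) ** 2 - (x2 - y2) ** 2

# Generic pair of points: both expressions must agree.
x = (1.3, 0.2, -0.7, 2.1)
y = (0.4, 1.1, 0.5, -0.3)
same = math.isclose(interval_instant(x, y), interval_light_front(x, y))

# Two points with equal x^+ and equal transverse position but different x^-.
a = (1.0, 0.0, 0.0, 0.0)
b = (0.0, 0.0, 0.0, 1.0)
lightlike = interval_light_front(a, b)
```

The second pair illustrates the remark in the text: on the LF the separation can be lightlike without the points coinciding along the longitudinal direction.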

We note that in the LF quantization we time order with respect to $\tau$ (which is a monotonic
parameter as well) rather than $t$. The microcausality principle, however, ensures that the retarded
commutators $[A(x),B(0)]\,\theta(x^0)$ and $[A(x),B(0)]\,\theta(x^+)$, which appear [22] in the S-matrix elements
of relativistic field theory, do not lead to disagreements in the two formulations. In the regions
$x^0 > 0, x^+ < 0$ and $x^0 < 0, x^+ > 0$, where the commutators seem different, $x^2$ is spacelike
and both of them vanish. Hence, admitting [23] the microcausality principle to hold, the LF
hyperplane seems equally valid and appropriate as the conventional one of the instant form theory 
for the canonical quantization. This is demonstrated to be so, for example, in the context of SSB, 
SM, CSM, and QCD in covariant gauges discussed in this article. 

We note that the hyperplanes $x^\pm = 0$ define the characteristic surfaces of hyperbolic partial
differential equation. It is known from their mathematical theory [24] that a solution exists if
we specify the (Cauchy) initial data on both of the hyperplanes. From the actual computations
in the front form theory we come to the conclusion [16] that (barring some massless theories) it is
sufficient in the canonical quantization of the front form theory to select one of the hyperplanes. 
The information on the commutators on the other characteristic hyperplane seems to be already 
contained [15] in the quantized theory. 

A distinguishing feature of the front form theory is that it gives rise to a constrained dynamical 
system [25]. After the elimination of the phase space constraints in the Hamiltonian formulation it 
leads to an appreciable reduction in the number of independent field operators which would describe 
the Hilbert space of the theory. The vacuum structure, for example, then becomes more tractable 
and the computation of physical quantities simpler. This is, for example, verified [15, 16, 17] in 
the studies of the LF quantized SM, CSM, and QCD in covariant gauges reviewed in Secs. (6, 7). 





1.2 LF Poincare and IMF Generators. LF Spin Operator 

The structure of the LF phase space is different from that of the one in the conventional theory. 
A different description of the same physical content, compared to that found in the conventional 
treatment, may emerge in the front form theory. For example, the SSB gets a different description 
[32, 10] and the broken continuous symmetry is now inferred from the study of the residual unbroken
symmetry of the LF Hamiltonian operator while the symmetry of the LF vacuum remains intact.
However, the expression which counts the number of Goldstone bosons present in the front form
theory comes out to be the same as that found in the discussion of the equal-time quantized
theory. A new proof of Coleman’s theorem [26] on the absence of the Goldstone bosons in two
dimensional theory also emerges [32, 10]. The LF vacuum is generally found to be simpler and
in many cases the interacting theory vacuum is seen to coincide with the perturbation theory one
[27].

Another important advantage of the front form theory pointed out by Dirac is that here seven
out of the ten Poincare generators are kinematical, i.e., they leave the plane $x^+ = 0$ invariant
[1]. In the standard notation, viz., $K_i = -M^{0i}$, $J_i = -(1/2)\epsilon_{ikl}M^{kl}$, $i,k,l = 1,2,3$, they are:

$P^+$, $P^1$, $P^2$, $M^{12} = -J_3$, $M^{+-} = K_3$, $M^{1+} = (K_1 + J_2)/\sqrt{2}$, and $M^{2+} = (K_2 - J_1)/\sqrt{2}$.

In the conventional theory on the other hand only six such ones, $\vec P$ and $M^{kl} = -M^{lk}$, leave
the hyperplane $x^0 = 0$ invariant. Expressed otherwise, the generator $K_3$ is a dynamical one in
the instant form theory but it turns out to be kinematical on the LF in the sense that there it
generates [15] simply the scale transformations of the LF components of $x^\mu$, $P^\mu$ and $M^{\mu\nu}$,
with $\mu,\nu = +,-,1,2$.

There is also an interesting correspondence of the LF components of the Poincare generators
with the generators in the IMF. Consider the inertial frame $S'$ moving along the
3-axis with velocity $v/c = \tanh\eta$ relative to the inertial frame $S$. From $(M'^{\mu\nu}, P'^{\mu}) =
\exp(-i\eta K_3)\,(M^{\mu\nu}, P^{\mu})\,\exp(i\eta K_3)$ we derive (Appendix A)

$J'_1 = J_1\cosh\eta + K_2\sinh\eta$, $J'_2 = J_2\cosh\eta - K_1\sinh\eta$, $J'_3 = J_3$

$K'_1 = K_1\cosh\eta - J_2\sinh\eta$, $K'_2 = K_2\cosh\eta + J_1\sinh\eta$, $K'_3 = K_3$

$(P_0+P_3)' = e^{\eta}(P_0+P_3)$, $(P_0-P_3)' = e^{-\eta}(P_0-P_3)$, $P'_1 = P_1$, $P'_2 = P_2$ (1)

When $v/c \to 1\,(-1)$ or $\eta \to \infty\,(-\infty)$ the Lorentz transformation becomes singular. However, we
may define the renormalized generators, $J'_a/\cosh\eta$, $K'_a/\cosh\eta$ ($a = 1,2$), and $e^{\mp\eta}(P_0 \pm P_3)'$, which have
well defined limits. The generators thus obtained coincide in the limit with the LF components of
the Poincare generators. We note also that to a particle at rest in $S$ corresponds the four-momentum
$p'$ in the inertial frame $S'$ with $p'_\mu = m\cosh\eta\,(1, 0, 0, \tanh\eta)$, which, apart from the factor $m\cosh\eta$, tends to a null vector.
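
The scaling behaviour in (1) can be illustrated with a short numerical check. This is a minimal Python sketch under stated assumptions: the function `boost_3` and its sign convention (chosen so that $P_0 + P_3$ scales by $e^{+\eta}$) are illustrative and are not taken from Appendix A.

```python
import math

def boost_3(p, eta):
    """Boost the components (P0, P1, P2, P3) along the 3-axis with rapidity eta.
    The sign convention is an assumption, chosen so that P0 + P3 scales by exp(+eta)."""
    p0, p1, p2, p3 = p
    ch, sh = math.cosh(eta), math.sinh(eta)
    return (p0 * ch + p3 * sh, p1, p2, p3 * ch + p0 * sh)

def minkowski_square(p):
    """P0^2 - P1^2 - P2^2 - P3^2."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

p = (2.0, 0.3, -0.4, 1.1)
eta = 1.7
q = boost_3(p, eta)
plus_ratio = (q[0] + q[3]) / (p[0] + p[3])    # should equal exp(+eta)
minus_ratio = (q[0] - q[3]) / (p[0] - p[3])   # should equal exp(-eta)

def rescaled_rest_momentum(eta, m=1.0):
    """Momentum of a particle at rest, boosted and divided by cosh(eta)."""
    boosted = boost_3((m, 0.0, 0.0, 0.0), eta)
    ch = math.cosh(eta)
    return tuple(c / ch for c in boosted)

# The square of the rescaled vector is m^2 / cosh^2(eta), which goes to zero.
squares = [minkowski_square(rescaled_rest_momentum(e)) for e in (1.0, 5.0, 10.0)]
```

The list `squares` decreases towards zero as the rapidity grows, which is the sense in which the rescaled momentum of a particle at rest approaches a null vector.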

It is also worth remarking that the $+$ component of the Pauli-Lubanski pseudo-vector $W^\mu$ is
special in that it contains only the LF kinematical generators. We may define the LF Spin operator
by $\mathcal{J}_3 = -W^+/P^+$. In the massive case the other two components of $\vec{\mathcal{J}}$, generating together
an $SU(2)$ algebra, are shown to be $\mathcal{J}_a = -(\mathcal{J}_3 P^a + W^a)/\sqrt{P^\mu P_\mu}$, $a = 1,2$, which, however, do
carry in them also the LF dynamical generators $P^-$, $M^{1-}$, $M^{2-}$. The case of both the massive and
massless fermions is discussed in detail in Sec. 4; the general case is considered in Appendix B.


2 LF quantized scalar theory 

2.1 Covariant Phase Space Factor on the LF 

Some interesting insight on the front form quantized field theory may already be gained by considering
the Lorentz invariant phase space (LIPS) or Covariant phase space [28] factor, which is
found relevant in the analysis of the physical processes, introduced first in the context of the
covariant version of the statistical model of Fermi [29]. On the LF the dispersion relation associated
with the free massive particle is $2p^+p^- = (p^\perp p^\perp + m^2) > 0$. It has no square root, unlike
the conventional one. The signs, for example, of $p^+$ and $p^-$ are correlated since $p^+p^- > 0$
[30]. The LIPS factor in the LF coordinates is thus defined as:

$\int d^4p\,\theta(\pm p^+)\theta(\pm p^-)\delta(p^2-m^2) = \int d^2p^\perp dp^+ \int dp^-\,\theta(\pm p^+)\theta(\pm p^-)\delta(2p^+p^- - p^\perp p^\perp - m^2) = \int d^2p^\perp dp^+\,\theta(p^+)/(2p^+)$,

compared to the conventional one: $\int d^4p\,\theta(\pm p^0)\delta(p^2-m^2) = \int d^3p/(2E_p)$ with $E_p = +\sqrt{\vec p^{\,2}+m^2} > 0$. A
distinguishing feature in the case of the LF is thus the appearance of $\theta(p^+)/(2p^+)$ in the phase
space factor. Such considerations are relevant, for example, in writing the Fourier transform of the
fields and the discussion of chiral boson theory (Sec. 3.4).
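
The two statements used above, the square-root-free dispersion relation and the equality of the two phase space measures, can be verified numerically. The sketch below is illustrative only; the helper `lf_momenta` and the sample values are assumptions made for the check.

```python
import math

def lf_momenta(p3, p1, p2, m):
    """Return (p_plus, p_minus) for an on-shell particle of positive energy."""
    energy = math.sqrt(p1 ** 2 + p2 ** 2 + p3 ** 2 + m ** 2)
    s = math.sqrt(2.0)
    return (energy + p3) / s, (energy - p3) / s

p1, p2, p3, m = 0.4, -0.7, 1.5, 0.9
p_plus, p_minus = lf_momenta(p3, p1, p2, m)

# Dispersion relation on the LF: 2 p^+ p^- = p_perp^2 + m^2.
dispersion_lhs = 2.0 * p_plus * p_minus
dispersion_rhs = p1 ** 2 + p2 ** 2 + m ** 2

# At fixed transverse momentum dp^+/dp^3 = p^+/E, so dp^+/(2 p^+) = dp^3/(2 E).
h = 1e-6
dpplus_dp3 = (lf_momenta(p3 + h, p1, p2, m)[0]
              - lf_momenta(p3 - h, p1, p2, m)[0]) / (2.0 * h)
energy = math.sqrt(p1 ** 2 + p2 ** 2 + p3 ** 2 + m ** 2)
measure_lf = dpplus_dp3 / (2.0 * p_plus)
measure_instant = 1.0 / (2.0 * energy)

# Both LF momenta stay positive even for negative p^3.
p_plus_neg, p_minus_neg = lf_momenta(-3.0, p1, p2, m)
```

The derivative is taken by a central finite difference, so the two measures agree only up to the discretization error of that approximation.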


2.2 LF Commutator 

Consider, for example, a real massive free scalar field $\phi(\tau,x^-,x^\perp)$, satisfying $(\Box + m^2)\phi = 0$
where $\Box = (2\partial_+\partial_- - \partial_\perp\partial_\perp)$. For $p^+ > 0$, and consequently $p^- > 0$, the complete set of plane wave
solutions of the equation of motion are $e^{ip\cdot x}$ and $e^{-ip\cdot x}$, where $p\cdot x = p^-x^+ + p^+x^- - p^\perp x^\perp$ and
$\tau \equiv x^+$ indicates the LF-time coordinate. The Fourier transform of $\phi$ on the LF may clearly be
written as,

$\phi(x) = \frac{1}{\sqrt{(2\pi)^3}}\int d^2p^\perp dp^+\,\frac{\theta(p^+)}{\sqrt{2p^+}}\left[a(p^+,p^\perp)\,e^{-ip\cdot x} + a^\dagger(p^+,p^\perp)\,e^{ip\cdot x}\right]$ (2.1)

where we have isolated $\sqrt{2p^+}$ only for later convenience and $p^\perp$ as well as $p^+$ are to be integrated
from $-\infty$ to $\infty$, which is very convenient when we deal with generalized functions like $\theta(p^+)$.
In the quantized theory $a^\dagger(p)$ and $a(p)$ denote the creation and annihilation operators of the
quantum excitations associated with the quantized field operator $\phi$. They are assumed to satisfy
the canonical commutation relations, with the nonvanishing one given by $[a(k),a^\dagger(p)] = \delta(k^+-p^+)\,\delta^2(k^\perp-p^\perp)
\equiv \delta^3(k-p)$. The Fock space is constructed employing these operators.

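The equal-LF-time commutator of the field operator can be computed by employing its Fourier
transform expression

$[\phi(x),\phi(y)]_{x^+=y^+=\tau} = \frac{1}{(2\pi)^3}\int d^2p^\perp d^2k^\perp\,\frac{dp^+dk^+\,\theta(p^+)\theta(k^+)}{\sqrt{2p^+\,2k^+}}\;\delta^3(p-k)\left[e^{-i(p\cdot x - k\cdot y)} - e^{i(p\cdot x - k\cdot y)}\right]_{x^+=y^+=\tau}$

$= -\frac{i}{4}\,\epsilon(x^--y^-)\,\delta^2(x^\perp-y^\perp).$ (2.2)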

Here we have used the free particle dispersion relations for $k^-$ and $p^-$, made use of the delta
function in the integrand, set $[\theta(p^+) + \theta(-p^+)] = 1$ (or rather the Cauchy principal value, CPV)
in the sense of the distribution theory, and used the integral representation of the sign function
$\epsilon(x) = 1$ or $= -1$ according as $x > 0$ or $x < 0$. The equal-$\tau$ commutator obtained here, often
termed the LF commutator, is nonlocal along the longitudinal direction $x^-$, which as we argued
before is not unexpected in the front form theory. It vanishes for the spacelike distances, and is
nonvanishing only on the light-cone, for $x^- \neq y^-$, when we assume $\epsilon(0) = 0$.


2.3 Length Dimensions $L_\perp$ and $L_\parallel$

It is natural and suggested also, for example, from the expression of the LF commutator to introduce
[9] two distinct units of length dimensions, $L_\perp$ and $L_\parallel$, in the front form theory. Indicating the
dimension of a quantity by [..] we write: $[x^\perp] = L_\perp$, $[x^-] = L_\parallel$, $[\partial_-] = 1/L_\parallel$. Requiring that
$p^\perp\cdot x^\perp$ be dimensionless we find $[p^\perp] = [m] = 1/L_\perp$, if we recall the dispersion relation. Making
similar arguments we find $[p^+] = 1/L_\parallel$, $[p^-] = L_\parallel/(L_\perp)^2$, $[x^+] = (L_\perp)^2/L_\parallel$ while $[H^{LF}] = [P^-] =
L_\parallel/(L_\perp)^2$ for the LF Hamiltonian and $[\mathcal{H}^{LF}] = 1/(L_\perp)^4$ for the Hamiltonian density. Similar
considerations apply to the other composite operators like current densities and we remark that
$\theta(x)$ and $\epsilon(x)$ do not carry any dimensions. The dimensional analysis is useful in finding [9] the
possible counter terms required in the renormalization of the theory. From the LF commutator
(2.2) we conclude that $[\phi] = 1/L_\perp$, which is also found to be the case for the gauge field but for
the fermionic field we have $[\psi_+] = 1/(L_\perp\sqrt{L_\parallel})$, where $\psi_+$ indicates the dynamical component of
the fermion field.
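
The bookkeeping of the two length units can be mechanized. The following minimal Python sketch represents a dimension as a pair of exponents of $L_\perp$ and $L_\parallel$; the helper names are illustrative, and the fermionic case assumes (it is not derived here) that the equal-$\tau$ anticommutator of $\psi_+$ is proportional to $\delta(x^-)\delta^2(x^\perp)$.

```python
from fractions import Fraction

# A dimension is a pair (a, b) standing for (L_perp)^a * (L_par)^b.
def mul(d1, d2):
    return (d1[0] + d2[0], d1[1] + d2[1])

def inv(d):
    return (-d[0], -d[1])

def power(d, n):
    n = Fraction(n)
    return (d[0] * n, d[1] * n)

L_PERP = (Fraction(1), Fraction(0))
L_PAR = (Fraction(0), Fraction(1))

dim_x_perp = L_PERP
dim_x_minus = L_PAR
dim_p_perp = inv(dim_x_perp)            # p_perp . x_perp is dimensionless
dim_m = dim_p_perp
dim_p_plus = inv(dim_x_minus)           # p^+ x^- is dimensionless
# From the dispersion relation 2 p^+ p^- = p_perp^2 + m^2:
dim_p_minus = mul(power(dim_m, 2), inv(dim_p_plus))
dim_x_plus = inv(dim_p_minus)           # p^- x^+ is dimensionless

# From the LF commutator: [phi]^2 = [delta^2(x_perp)], epsilon being dimensionless.
dim_phi = power(power(inv(dim_x_perp), 2), Fraction(1, 2))

# Hamiltonian density: [P^-] divided by the volume element d^2x_perp dx^-.
dim_volume = mul(power(dim_x_perp, 2), dim_x_minus)
dim_h_density = mul(dim_p_minus, inv(dim_volume))

# Assumption: {psi_+, psi_+^dagger} is proportional to delta(x^-) delta^2(x_perp).
dim_psi_plus = power(inv(dim_volume), Fraction(1, 2))
```

As a consistency check, the mass term $m^2\phi^2$ built from `dim_m` and `dim_phi` carries the same dimension as `dim_h_density`.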

2.4 LF Hamiltonian. Dirac Procedure 

The free scalar theory is described by the Lagrangian density $\mathcal{L} = \partial_+\phi\,\partial_-\phi - (1/2)\partial_\perp\phi\,\partial_\perp\phi -
m^2\phi^2/2$. It is first order in $\partial_+\phi$ and the canonical momentum, defined as $\pi = \partial\mathcal{L}/\partial(\partial_+\phi) = \partial_-\phi$,
describes a constraint on the phase space dynamics of the front form scalar theory. We have
here a constrained dynamical system [25]. The canonical Hamiltonian density is found to be
$\mathcal{H}_c = (1/2)\partial_\perp\phi\,\partial_\perp\phi + m^2\phi^2/2$. There is a systematic procedure¹, called the Dirac method [25], which allows us to
construct the self-consistent Hamiltonian formulation, required to canonically quantize the theory
with phase space constraints. The primary constraint above is written as


$\chi \equiv (\pi - \partial_-\phi) \approx 0$ (2.3)

where $\approx$ stands for weak equality, meaning that the constraint should not be employed inside the Poisson
brackets, but only after they have been computed.

We define next an extended Hamiltonian density $\mathcal{H}_e = \mathcal{H}_c + u\chi$, where $u$ is a Lagrange multiplier
field. Hamilton's equations of motion employ $H_e = \int d^2x^\perp dx^-\,\mathcal{H}_e$ and we require the persistency
condition on the constraint, i.e., $\{\chi(\tau,x^-,x^\perp), H_e(\tau)\} \approx 0$. In the simple case under study we are
led to a differential equation which would determine the multiplier field $u$. In the gauge theory
considered below new secondary constraints may arise. We now include them also in the extended
Hamiltonian and repeat the procedure, until no more constraints show up. For the computational
purposes we may initially start with the standard Poisson brackets at equal-LF-time $\tau$, with the
nonvanishing one defined by² $\{\pi(\tau,x),\phi(\tau,y)\} = -\delta^3(x-y) \equiv -\delta^2(x^\perp-y^\perp)\,\delta(x^--y^-)$.

The nature of the set of constraints found in the theory is then analyzed. A constraint is first
class if it has vanishing Poisson brackets with itself, with the other constraints, and with the
Hamiltonian; otherwise it is a second class one. Corresponding to a first class constraint we may
be required to add in the theory some appropriate and accessible gauge-fixing external constraints
[25]. In the present case there is one local constraint $\chi \approx 0$. From the constraint matrix

$\{\chi(\tau,x),\chi(\tau,y)\} = -2\,\delta^2(x^\perp-y^\perp)\,\partial^x_-\delta(x^--y^-)$ (2.4)

we conclude that it is second class by itself, since the right hand side is nonvanishing. There is, in
fact, also a first class constraint in the theory in the form of the zero-momentum-mode of $\chi$; we
will comment on it in Sec. (2.6).

We go over now from the Poisson to the modified Poisson brackets, called Dirac brackets, which
have the property that we are allowed to set $\chi = 0$ as a strong equality relation, valid even inside
the Dirac brackets. The Hamilton’s equations employ [25] now the modified brackets. We construct
first the inverse of the constraint matrix: $\{\chi(\tau,x),\chi(\tau,y)\}^{-1} = -\delta^2(x^\perp-y^\perp)\,\epsilon(x^--y^-)/4$.
The modified bracket is then defined by

$\{f(x),g(y)\}_D = \{f(x),g(y)\} - \int\!\!\int d^3u\,d^3v\,\{f(x),\chi(u)\}\,\{\chi(u),\chi(v)\}^{-1}\,\{\chi(v),g(y)\}$ (2.5)

In view of its very construction the Dirac bracket of any dynamical variable with $\chi$ is seen to
vanish identically.
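
That $-\epsilon(x^--y^-)/4$ inverts the longitudinal part of the constraint matrix, $-2\,\partial_-\delta(x^--y^-)$, can be checked on a lattice. The sketch below is only an illustration under stated assumptions: the transverse delta function is dropped, the derivative is replaced by a central difference, and the function names and lattice parameters are arbitrary choices.

```python
def sign(n):
    """Sign function with sign(0) = 0."""
    return (n > 0) - (n < 0)

def constraint_kernel(i, k, h):
    """Lattice version of C(x_i, z_k) = -2 d/dx delta(x - z), central difference in x."""
    delta_next = 1.0 / h if k == i + 1 else 0.0
    delta_prev = 1.0 / h if k == i - 1 else 0.0
    return -2.0 * (delta_next - delta_prev) / (2.0 * h)

def inverse_kernel(k, j):
    """Candidate inverse C^{-1}(z_k, y_j) = -epsilon(z_k - y_j)/4."""
    return -0.25 * sign(k - j)

def product(i, j, n, h):
    """Integral over z of C(x_i, z) C^{-1}(z, y_j), approximated by h times a sum."""
    return h * sum(constraint_kernel(i, k, h) * inverse_kernel(k, j) for k in range(n))

n, h = 41, 0.1
j = n // 2
rows = list(range(1, n - 1))            # interior points only, to avoid edge effects
row = [product(i, j, n, h) for i in rows]
total_weight = h * sum(row)             # a delta function integrates to one
support = [i for i, v in zip(rows, row) if abs(v) > 1e-12]
```

The product is a smeared lattice delta function: it is nonzero only on the three sites around `j` and integrates to one, which is the discrete counterpart of $\int dz\,C(x,z)\,C^{-1}(z,y) = \delta(x-y)$.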

¹See also Secs. 5, 6, and Appendix C.

²In the context of the canonical quantization we mostly deal with equal-$\tau$ brackets and commutators. We will
frequently suppress $\tau$ from writing and write occasionally $x$ to indicate the set $(x^-, x^\perp)$.

It is clear that in place of $H_e$ we may then employ the reduced Hamiltonian obtained by setting
$\chi \equiv (\pi - \partial_-\phi) = 0$ in it, which would also remove the Lagrange multiplier field, while $\pi$ becomes
now a dependent variable, i.e., removed from the theory. For the independent field $\phi$ which survives
in the front form scalar theory here considered we find

$\{\phi(\tau,x),\phi(\tau,y)\}_D = -\frac{1}{4}\,\epsilon(x^--y^-)\,\delta^2(x^\perp-y^\perp)$ (2.6)

Hamilton’s equation $\dot\phi(\tau,x) = \{\phi(\tau,x), H_c\}_D$, where an overdot indicates the derivative with
respect to $\tau$, also recovers the Lagrange equation. The theory is canonically quantized by the
correspondence $i\{f,g\}_D \to [f,g]$, the commutator of the corresponding quantized operators.
Hamilton’s equations correspond to the equations of motion of field operators, i.e., $i\,df/d\tau = [f,H]$,
in the Heisenberg picture. The commutator of the scalar field operators on the LF is thus given
by

$[\phi(\tau,x),\phi(\tau,y)] = -\frac{i}{4}\,\epsilon(x^--y^-)\,\delta^2(x^\perp-y^\perp)$ (2.7)

which is the same as found above by the simple arguments based on the Fourier expansion of 
the field in the front form theory. Employing this commutator we recover in the present case the 
Lagrange equation of motion for the field operator as well. 


2.5 Scalar Field Propagator in momentum space 

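The Fourier expansion (2.1) may also be regarded as furnishing the momentum space realization
of the commutator (2.7) and the propagator in momentum space is easily derived. The propagator
in configuration space is defined by

$\langle 0|T(\phi(x)\phi(0))|0\rangle = \theta(\tau)\,\langle 0|\phi(x)\phi(0)|0\rangle + \theta(-\tau)\,\langle 0|\phi(0)\phi(x)|0\rangle.$ (2.8)

It follows that

$\langle 0|T(\phi(x)\phi(0))|0\rangle = \frac{1}{(2\pi)^3}\int d^2p^\perp dp^+\,\frac{\theta(p^+)}{2p^+}\left[\theta(\tau)\,e^{-ip\cdot x} + \theta(-\tau)\,e^{ip\cdot x}\right]$

$= \frac{1}{i(2\pi)^4}\int d^2p^\perp dp^+ d\lambda\;\frac{e^{-i(\lambda x^+ + p^+x^- - p^\perp x^\perp)}}{(m^2 + p^\perp p^\perp - 2p^+\lambda - i\epsilon)}$

$= \frac{i}{(2\pi)^4}\int d^4p\;\frac{e^{-ip\cdot x}}{(p^2 - m^2 + i\epsilon)}$ (2.9)

Here we have used the integral representations³ of $\theta(\pm\tau)$ and performed the well known standard
manipulations. The factor $[\theta(p^+) + \theta(-p^+)]$ in the integrand has been set to unity and the dummy
integration variable $\lambda$ has been renamed as $p^-$ for convenience in the last line. The $\int d^4p$ stands for
$\int d^2p^\perp dp^+dp^-$, with the understanding, as is clear from the derivation above, that the integration
over $p^-$ has to be performed first. The range of integration is from $-\infty$ to $\infty$ for all of these
variables.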

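The momentum space representations of the energy-momentum tensor are also found easily
and we check that $N(p) = a^\dagger(p)a(p)$ has the usual interpretation of a number operator. In fact,

$P^- = \int d^2x^\perp dx^-\,:\frac{1}{2}\left(\partial_\perp\phi\,\partial_\perp\phi + m^2\phi^2\right): \;=\; \frac{1}{2}\int d^2p^\perp dp^+\,\theta(p^+)\,:\left[a^\dagger(p)a(p) + a(p)a^\dagger(p)\right]:\,\frac{m^2 + p^\perp p^\perp}{2p^+}$

$= \int d^2p^\perp dp^+\,\theta(p^+)\left[a^\dagger(p)a(p)\right]p^-$,

$P^+ = \int d^2x^\perp dx^-\,:(\partial_-\phi)^2: \;=\; \int d^2p^\perp dp^+\,\theta(p^+)\left[a^\dagger(p)a(p)\right]p^+$ (2.10)

³$\theta(\tau)\,e^{-ip^-\tau} = \frac{1}{2\pi i}\int d\lambda\;e^{-i\lambda\tau}/(p^- - \lambda - i\epsilon)$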

2.6 First class constraint. Symmetry in $x^+$ and $x^-$

It is worth making an important remark. There is, in fact, present [31] in the scalar theory discussed
above still another constraint which is first class. We easily show that the zero-longitudinal-
momentum mode $\sqrt{2\pi}\,\tilde\chi(\tau,k^+=0) = \int dx^-\,\chi$ represents a first class constraint in the theory.
For example, considering for simplicity the two dimensional theory, (2.4) reads in the momentum
space as

$\{\tilde\chi(\tau,k^+),\tilde\chi(\tau,k'^+)\} = -2ik^+\,\delta(k^+ + k'^+)$ (2.11)

It clearly indicates the presence of the first class constraint $\tilde\chi(\tau,k^+=0) \approx 0$ in the theory.
Such a constraint or symmetry requires us to introduce in the theory an external (gauge-fixing)
constraint [25], such that the pair becomes a second class set. We will take advantage of this
gauge freedom in order to decompose the scalar field into the bosonic condensate variable and the
quantum fluctuation field. When combined with the standard Dirac procedure it allows us to build
[32, 10] a description of the SSB mechanism on the LF.

We also note that the front form formulation of relativistic theory is inherently symmetrical
with respect to $x^+$ and $x^-$ and it is a matter of convention that we take the plus component as the
LF-time while the other as a spatial coordinate. The theory quantized at $x^+ = \mathrm{const.}$ hyperplanes
seems already to incorporate [15] in it the information on the equal-$x^-$ commutation relations. For
example, we easily derive from (2.1) the following equal-$x^-$ commutator

$[\phi(x^+,x^-,x^\perp),\phi(y^+,x^-,y^\perp)] = \frac{1}{(2\pi)^3}\int d^2p^\perp dp^+\,\frac{\theta(p^+)}{2p^+}\left[e^{-ip\cdot(x-y)} - e^{ip\cdot(x-y)}\right]_{x^-=y^-}$ (2.12)

In view of the free particle dispersion relation we may replace the measure $dp^+\theta(p^+)/2p^+$ by
$dp^-\theta(p^-)/2p^-$. The equal-$x^-$ commutator is then given by $(-i/4)\,\epsilon(x^+-y^+)\,\delta^2(x^\perp-y^\perp)$. In two
dimensional space-time it is customary to define the right and the left movers by $\phi(0,x^-) \equiv \phi^R(x^-)$
and $\phi(x^+,0) \equiv \phi^L(x^+)$. We find $[\phi^R(x^-),\phi^R(y^-)] = (-i/4)\,\epsilon(x^--y^-)$ while $[\phi^L(x^+),\phi^L(y^+)] =
(-i/4)\,\epsilon(x^+-y^+)$. The symmetry under discussion seems responsible for appreciable simplifications
in the front form quantized theory.


3 SSB Mechanism, Topological Kink Solution, and Chiral 
Boson theory on LF 

3.1 SSB in two dimensional scalar theory 

The conventional instant form description of the tree level SSB is based on the space and time
independent solutions of the Lagrange equation, $\phi_{class} = \text{const.}$, such that they also minimize the
Hamiltonian functional; this requirement rests on (external) physical considerations. We do not apparently
have much physical intuition on the LF to avail of such arguments. The constrained dynamical
system on the LF seems, however, to already contain in it the corresponding relevant constraints.
For simplicity we consider first the two dimensional theory$^{4}$ with $\mathcal{L} = \dot\phi\,\phi' - V(\phi)$.

This is probably the simplest example of a constrained dynamical system in the context of field 
theory. It is reasonable to expect that the well tested Dirac procedure, when applied to it, must 
result in a satisfactory description of SSB on the LF. 

The Lagrange equation, $2\dot\phi' = -V'(\phi)$, is of first order in the LF-time $\tau$. The left hand side
remains unaltered under $\phi\to\phi+\text{const.}$, and $\phi = \text{const.}$ are clearly possible solutions$^{5}$. Integrating
over the space variable and assuming appropriate boundary conditions we are led to the following

$^{4}$Here $\tau \equiv x^+ = (x^0+x^1)/\sqrt{2}$, $x \equiv x^- = (x^0-x^1)/\sqrt{2}$. An overdot indicates the LF-time derivative while a prime
indicates derivative with respect to $x^-$; the generalization to 3+1 dimensions is discussed in Sec. (3.2).

$^{5}$The self-dual kink solution which depends on $x^-$ as well is discussed in Sec. (3.3).





constraint [32, 10] on the potential

$\beta(\tau) \equiv \int dx\, V'(\phi) = 0$. (3.1)

We show now that this constraint is also present on the phase space and in the quantized theory. 
The description of SSB then follows from the discussion on the structure of the Hilbert space. 

In order to take care of the first class constraint $\int dx^-\,\chi \approx 0$ mentioned in Sec. (2.6) we make
the following separation of the dynamical (collective) bosonic condensate variable $\omega(\tau)$ from the
(quantum) fluctuation variable $\varphi(\tau,x)$

$\phi(\tau,x) = \omega(\tau) + \varphi(\tau,x)$. (3.2)

Here we also set $\int dx\,\varphi(\tau,x) = 0$ so that the fluctuation field carries no zero-longitudinal-momentum
mode in it. The separation thus corresponds in a sense to an external gauge-fixing
constraint which we must impose [25] in the theory. It was introduced [32] originally on physical
considerations and $\omega$ was termed the dynamical bosonic condensate variable.

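We apply now the standard Dirac procedure to construct the LF Hamiltonian formulation. The
canonically quantized theory results [10, 32] in the following commutators

$[\varphi(x,\tau),\varphi(y,\tau)] = -\frac{i}{4}\,\epsilon(x-y)$, (3.3)

$[\omega(\tau),\varphi(x,\tau)] = 0$, (3.4)

and for $V(\phi) = (\lambda/4)(\phi^2 - m^2/\lambda)^2$, with a negative sign for the mass term and $\lambda > 0$, $m \neq 0$, the
LF Hamiltonian is given by

$H^{lf} \equiv P^- = \int dx\,\left[\omega(\lambda\omega^2-m^2)\varphi + \frac{1}{2}(3\lambda\omega^2-m^2)\varphi^2 + \lambda\omega\varphi^3 + \frac{\lambda}{4}\varphi^4\right]$. (3.5)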

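We recover also the constraint equation (3.1) now as a second class constraint on the phase space:

$\omega(\lambda\omega^2 - m^2) + \lim_{R\to\infty}\frac{1}{R}\int_{-R/2}^{R/2} dx\,\left[(3\lambda\omega^2-m^2)\varphi + \lambda(3\omega\varphi^2+\varphi^3)\right] = 0$ (3.6)

where $R\to\infty$ and the Cauchy principal value of $\int dx\,f(x)$ is defined by $\lim_{R\to\infty}\int_{-R/2}^{R/2} dx\,f(x)$.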
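
As a cross-check on the expansions entering (3.5) and (3.6), the following minimal Python sketch compares their integrands with a direct evaluation of $V(\omega+\varphi)$ and $V'(\omega+\varphi)$ at sample numerical values. It assumes the quartic potential $V(\phi) = (\lambda/4)(\phi^2-m^2/\lambda)^2$ quoted in the text; the function names and the sample numbers are illustrative only and do not come from the original treatment.

```python
def potential(phi, lam, m):
    """V(phi) = (lam/4) * (phi^2 - m^2/lam)^2."""
    return 0.25 * lam * (phi * phi - m * m / lam) ** 2


def potential_prime(phi, lam, m):
    """V'(phi) = phi * (lam*phi^2 - m^2)."""
    return phi * (lam * phi * phi - m * m)


def hamiltonian_density(omega, varphi, lam, m):
    """Integrand of (3.5): the varphi-dependent part of V(omega + varphi)."""
    return (omega * (lam * omega**2 - m**2) * varphi
            + 0.5 * (3.0 * lam * omega**2 - m**2) * varphi**2
            + lam * omega * varphi**3
            + 0.25 * lam * varphi**4)


def constraint_density(omega, varphi, lam, m):
    """V'(omega + varphi) split as in (3.6): condensate term plus fluctuation terms."""
    condensate_term = omega * (lam * omega**2 - m**2)
    fluctuation_terms = ((3.0 * lam * omega**2 - m**2) * varphi
                         + lam * (3.0 * omega * varphi**2 + varphi**3))
    return condensate_term + fluctuation_terms


if __name__ == "__main__":
    lam, m, omega, varphi = 0.7, 1.3, 0.4, -0.9  # arbitrary sample values
    lhs = potential(omega + varphi, lam, m) - potential(omega, lam, m)
    print(abs(lhs - hamiltonian_density(omega, varphi, lam, m)))
    print(abs(potential_prime(omega + varphi, lam, m)
              - constraint_density(omega, varphi, lam, m)))
```

Both printed differences should vanish up to floating point rounding, which is just the statement that (3.5) and (3.6) are the Taylor expansions of $V$ and $V'$ about the condensate value.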

The commutation relations indicate that the operator $\omega$ is a c-number or a background field.
Eliminating $\omega$ would lead to a LF Hamiltonian which is nonlocal [32, 10] along the longitudinal
coordinate $x^-$ even though the scalar theory written in the conventional coordinates is local.

At the tree or classical level, $\varphi$ are bounded ordinary functions of $x^-$ and when $R\to\infty$ only
the first term survives in the constraint equation, leading to $\omega(\lambda\omega^2 - m^2) = 0$. This result is
the same as that obtained in the conventional theory. There, however, it is essentially added to
the theory on physical considerations, which require the energy functional to attain its minimum
(extremum) value. The stability property, say, of a particular constant solution may be inferred
as usual from the analysis of the classical partial differential equation of motion. For example,
$\omega = 0$ is shown to be an unstable solution on the LF for the potential considered above, while the
other two solutions with $\omega \neq 0$ give rise to the stable phases. A similar analysis of the
corresponding partial differential equations in the conventional coordinates can clearly also be made; the
Fourier transform theory is convenient to use. Also the new ingredient in the form of the constraint
equation on the LF does have its counterpart in the conventional instant form framework, as is
shown in [18]. It is remarkable that the front form theory seems to contain inside it all the
necessary ingredients in order to describe the SSB, when we follow the Dirac procedure to handle
the constrained LF dynamics of the scalar field.
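
The tree-level content of the constraint can be illustrated with a short Python sketch that solves $\omega(\lambda\omega^2-m^2)=0$ and labels each root. As a shortcut it uses the sign of $V''(\omega) = 3\lambda\omega^2 - m^2$ as the stability criterion; this is an assumption made here for illustration, whereas the text infers stability from the classical equation of motion. The function names are hypothetical.

```python
import math


def tree_level_condensates(lam, m):
    """Roots of omega*(lam*omega^2 - m^2) = 0 for lam > 0, m != 0."""
    v = m / math.sqrt(lam)
    return [-v, 0.0, v]


def curvature(omega, lam, m):
    """V''(omega) = 3*lam*omega^2 - m^2 for the quartic potential of the text."""
    return 3.0 * lam * omega**2 - m**2


def classify(lam, m):
    """Label each condensate value by the sign of V''(omega) (illustrative criterion)."""
    return {omega: ("stable" if curvature(omega, lam, m) > 0 else "unstable")
            for omega in tree_level_condensates(lam, m)}


if __name__ == "__main__":
    print(classify(4.0, 2.0))
```

With these conventions the symmetric value $\omega=0$ comes out unstable and the two values $\omega=\pm m/\sqrt{\lambda}$ come out stable, in line with the statement above.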

We could have employed the DLCQ [33], including the condensate term also in it. The existence 
of the continuum limit of DLCQ theory adding to it also the dynamical condensate variable was 





demonstrated [32, 34], contradicting the then prevalent notion to the contrary$^{6}$. The demonstration
assures [35] us of the self-consistency of the front form theory itself. In the infinite volume limit
[10, 34], we do obtain the same results. However, in the theory described in finite volume, the
commutator of $\omega$ with $\varphi$ is found to be nonvanishing and as such $\omega$ is an operator; only when $R\to\infty$
does it become a classical background field.

It is worth stressing that in our discussion the condensate variable is introduced as a dynamical
variable. The Dirac procedure must decide whether it comes out to be a c- or a q-number. In the
discussions of the bosonized SM and CSM models the operator $\omega$ is not a background field, unlike
in the scalar theory. It turns out to be an operator and plays an important role in describing the
structure of the Hilbert space and the degenerate vacua in these gauge theories (Sec. 6).

The field commutator obtained above can be realized in momentum space through the Fourier
transform of $\varphi$: $\varphi(x,\tau) = (1/\sqrt{2\pi})\int dk\,\frac{\theta(k)}{\sqrt{2k}}\,[a(k,\tau)e^{-ikx} + a^\dagger(k,\tau)e^{ikx}]$, where $k^+ \equiv k$
and the operators $a(k,\tau)$ and $a^\dagger(k,\tau)$ satisfy the canonical equal-$\tau$ commutation relations, with
the nonvanishing one given by $[a(k,\tau),a^\dagger(k',\tau)] = \delta(k-k')$.
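
A quick numerical check of this realization is sketched below in Python. It assumes the expansion $\varphi = (1/\sqrt{2\pi})\int dk\,\theta(k)(2k)^{-1/2}[a\,e^{-ikx} + a^\dagger e^{ikx}]$ with $[a(k),a^\dagger(k')] = \delta(k-k')$, for which the commutator reduces to $-(i/2\pi)\int_0^\infty dk\,\sin(ks)/k$ with $s = x-y$. The integral is cut off at a finite $k_{max}$ (an assumption of this sketch, not of the text), chosen so that the leading oscillatory correction vanishes; the function names are illustrative.

```python
import math


def sine_integral(x_max, steps=200_000):
    """Composite Simpson estimate of Si(x_max) = integral of sin(t)/t over [0, x_max]."""
    def integrand(t):
        return 1.0 if t == 0.0 else math.sin(t) / t

    h = x_max / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def commutator_coefficient(separation, cutoff_phase=100.5 * math.pi):
    """Return c with [varphi(x), varphi(y)] = i*c for x - y = separation.

    The k-integral is truncated at k_max = cutoff_phase / |separation|,
    so that the result depends on the separation only through its sign.
    """
    if separation == 0.0:
        return 0.0
    sign = 1.0 if separation > 0 else -1.0
    return -sign * sine_integral(cutoff_phase) / (2.0 * math.pi)


if __name__ == "__main__":
    print(commutator_coefficient(0.7), commutator_coefficient(-3.0))
```

The two values approach $-1/4$ and $+1/4$ respectively, reproducing the coefficient of $\epsilon(x-y)$ in the field commutator.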

The (perturbative) vacuum state is defined by $a(k,\tau)|vac\rangle = 0$, $k > 0$. The tree level
description of the SSB is given as follows. The values of $\omega = \langle|\phi|\rangle_{vac}$ obtained from $V'(\omega) = 0$
characterize the different vacua in the theory. Distinct Fock spaces corresponding to different
values of $\omega$ are built as usual by applying the creation operators on the corresponding vacuum state.
The value $\omega = 0$ corresponds to the symmetric phase since the Hamiltonian operator is then symmetric
under $\varphi \to -\varphi$. For $\omega \neq 0$ this symmetry is violated and the system is said to be in a broken or
asymmetric phase.

The constraint equation (3.6) also shows that the value of $\omega$ would be altered from its tree level
value by the quantum corrections arising from the other terms. The renormalization of
the two dimensional scalar theory was discussed [18] to one-loop order by employing the Dyson-Wick
expansion based on LF-time ordering. It was found convenient to derive [18]
the renormalized constraint equation instead of solving the constraint equation first, which would
require the difficult job of dealing with a nonlocal and nonlinear Hamiltonian.

In the super-renormalizable theory here the two renormalized equations, viz., the mass renormalization
condition and the renormalized constraint equation, allow us to study [18] the phase
transition in the two dimensional scalar theory, which was conjectured a long time ago by Simon
and Griffiths [36].

3.2 Spontaneously broken continuous symmetry 

The extension to 3+1 dimensions and to the global continuous symmetry is straightforward [10].
Consider real scalar fields $\phi_a$ ($a = 1,2,\ldots,N$) which form an isovector of the global internal symmetry
group $O(N)$. We now write $\phi_a(x,x^\perp,\tau) = \omega_a + \varphi_a(x,x^\perp,\tau)$ and the Lagrangian density is
$\mathcal{L} = [\dot\varphi_a\varphi'_a - (1/2)(\partial_\perp\varphi_a)(\partial_\perp\varphi_a) - V(\phi)]$. The Taylor series expansion of the constraint equations
$\beta_a = 0$ gives a set of coupled equations $R\,V'_a(\omega) + V''_{ab}(\omega)\int dx\,\varphi_b + V'''_{abc}(\omega)\int dx\,\varphi_b\varphi_c/2 + \ldots = 0$. Its
discussion at the tree level leads to the conventional theory results. The LF symmetry generators
are found to be $G_\alpha(\tau) = -i\int d^2x^\perp dx\,\varphi'_c(t_\alpha)_{cd}\varphi_d = \int d^2k^\perp dk\,\theta(k)\,a_c^\dagger(k,k^\perp)(t_\alpha)_{cd}\,a_d(k,k^\perp)$ where
$\alpha,\beta = 1,2,\ldots,N(N-1)/2$ are the group indices, $t_\alpha$ are hermitian and antisymmetric generators of
$O(N)$, and $a_c^\dagger(k,k^\perp)$ ($a_c(k,k^\perp)$) is the creation (destruction) operator contained in the momentum
space expansion of $\varphi_c$. These are to be contrasted with the generators in the equal-time theory,
$Q_\alpha(x^0) = \int d^3x\,J^0 = -i\int d^3x\,(\partial_0\varphi_a)(t_\alpha)_{ab}\varphi_b - i(t_\alpha\omega)_a\int d^3x\,(d\varphi_a/dx_0)$. All the symmetry
generators thus annihilate the LF vacuum and the SSB is now seen in the broken symmetry of the
quantized theory Hamiltonian. The expression which counts the number of Goldstone bosons in
the front form theory is found to be identical to that in the conventional theory. In contrast, the
first term on the right hand side of $Q_\alpha(x^0)$, which is similar to the one on the LF, does annihilate
the conventional theory vacuum but the second term now gives non-vanishing contributions for
some of the (broken) generators. The symmetry of the conventional theory vacuum is thereby
broken while the quantum Hamiltonian remains invariant. The physical content of SSB in the

$^{6}$See T. Maskawa and K. Yamawaki, Prog. Theor. Phys. 56 (1976) 270; K. Nakanishi and K. Yamawaki, Nucl.
Phys. B122 (1977) 15. A history of the so called zero mode problem is traced [10] in hep-th/9312064.





instant form and the front form, however, is the same though achieved by different descriptions.
An alternative proof on the LF, in two dimensions, can be given of Coleman's theorem related to
the absence of Goldstone bosons; we are unable [10] to implement the second class constraints over
the phase space. The tree level Higgs mechanism may also be discussed straightforwardly [10].
We remark that the simplicity of the LF vacuum is in a sense compensated by the involved nonlocal
Hamiltonian. The latter, however, may be treatable using advanced computational techniques.
Also, in connection with renormalization, solving the constraint equations first may not be necessary [18]; we may instead obtain the
renormalized constraint equations.


3.3 Kink solution and Topological quantum number 


The classical Lagrange equation of the two dimensional self-interacting theory, $2\partial_-\partial_+\phi = -V'(\phi)$,
with the $V(\phi)$ given above, is known to have finite energy topological soliton solutions [37] called
kink solutions. The theory has an internal symmetry, $\phi \to -\phi$. The kink solutions can be recovered in the
front form theory as well. The kink corresponds to the self-dual solution satisfying $\partial_-\phi = -\partial_+\phi$
and given by

$\phi_{kink} = \mp\frac{m}{\sqrt{\lambda}}\tanh\left[\frac{m}{2}(x^- - x^+)\right]$ (3.7)

where the upper (lower) sign corresponds to the kink (anti-kink) solutions. The kink on the LF
carries both the LF energy and longitudinal momentum such that $P^+ = P^-$ and$^{7}$ its mass is
determined to be $M = \sqrt{2P^+P^-} = \sqrt{8}\,m^3/(3\lambda)$. The kink interpolates between the two vacua of
the theory: $\phi_{kink}(0,x^- = \infty) = -m/\sqrt{\lambda}$ and $\phi_{kink}(0,x^- = -\infty) = m/\sqrt{\lambda}$. The topological charge
may be defined by $Q = \int dx^-\,j^+$ where $j^\mu = -(\sqrt{\lambda}/(2m))\,\epsilon^{\mu\nu}\partial_\nu\phi$, with $\epsilon^{+-} = \epsilon_{01} = 1$,
is the conserved topological current density. The topological charges of the kink, anti-kink and vacuum
solutions are 1, -1, and 0 respectively. The charge $Q$ is absolutely conserved, prohibiting the decay of the
kink into the vacuum. Similar (topological) quantum numbers on the LF arise also, for example, in
the context of the structure of the degenerate vacua in the canonical quantization of SM and CSM
models discussed below.
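
The statements about the kink can be checked numerically with the following Python sketch. It assumes the profile $\phi_{kink} = -(m/\sqrt{\lambda})\tanh[(m/2)(x^- - x^+)]$ (the upper sign of (3.7) as reconstructed here), together with $P^+ = \int dx^-(\partial_-\phi)^2$, $P^- = \int dx^-\,V(\phi)$ and the topological current with $\epsilon^{+-} = 1$. The function names, the finite integration window and the finite-difference step are choices of this sketch.

```python
import math


def kink(u, lam, m):
    """Kink profile as a function of u = x^- - x^+."""
    return -(m / math.sqrt(lam)) * math.tanh(0.5 * m * u)


def kink_slope(u, lam, m):
    """Derivative of the profile with respect to x^- at fixed x^+."""
    return -(m * m / (2.0 * math.sqrt(lam))) / math.cosh(0.5 * m * u) ** 2


def potential(phi, lam, m):
    return 0.25 * lam * (phi * phi - m * m / lam) ** 2


def potential_prime(phi, lam, m):
    return phi * (lam * phi * phi - m * m)


def simpson(f, a, b, n=20_000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def equation_residual(u, lam, m, h=1e-3):
    """2 d_- d_+ phi + V'(phi); on a function of u the operator d_- d_+ acts as -d^2/du^2."""
    second = (kink(u + h, lam, m) - 2.0 * kink(u, lam, m) + kink(u - h, lam, m)) / (h * h)
    return -2.0 * second + potential_prime(kink(u, lam, m), lam, m)


def kink_observables(lam, m, half_width=40.0):
    """Return (P^+, P^-, mass, topological charge) at x^+ = 0."""
    p_plus = simpson(lambda u: kink_slope(u, lam, m) ** 2, -half_width, half_width)
    p_minus = simpson(lambda u: potential(kink(u, lam, m), lam, m), -half_width, half_width)
    mass = math.sqrt(2.0 * p_plus * p_minus)
    charge = -(math.sqrt(lam) / (2.0 * m)) * (kink(half_width, lam, m) - kink(-half_width, lam, m))
    return p_plus, p_minus, mass, charge


if __name__ == "__main__":
    print(equation_residual(0.3, 1.0, 1.0))
    print(kink_observables(1.0, 1.0))
```

For $\lambda = m = 1$ the sketch should give $P^+ = P^- = 2/3$, a mass close to $\sqrt{8}/3$ and unit topological charge, up to discretization error.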


3.4 Chiral Boson theory on the LF 

The chiral boson (or self-dual scalar) field in 1+1 dimensions plays an important role, for example,
in the formulation of string theories [38], in the description [39] of boundary excitations of the
quantum Hall state, and in a number of two-dimensional statistical systems which are related to
the Coulomb-gas model.

We recall that the free massive theory with $\mathcal{L} = \partial_+\phi\,\partial_-\phi - m^2\phi^2/2$ has the LF Hamiltonian
$(m^2/2)\int dx^-\,\phi^2$. The dispersion relation $2p^+p^- = m^2 > 0$ governs the correlation between the signs of
$p^+$ and $p^-$. It ceases to exist in the massless theory where, at the classical level, a chiral boson
solution, $\partial_0\phi = \partial_1\phi$ (and an anti-chiral one, $\partial_0\phi = -\partial_1\phi$) is obtained. Several quantized theory
models [40, 41, 42] of chiral boson have been proposed. The front form theory of chiral boson looks
more appropriate and transparent [43] when compared to the conventional one.

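The Floreanini and Jackiw (FJ) model [41] is based on the following manifestly noncovariant
Lagrangian

$\mathcal{L} = (\partial_0\phi - \partial_1\phi)\,\partial_1\phi = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - \frac{1}{2}(\partial_0\phi - \partial_1\phi)^2$ (3.8)

where $\phi$ is a real scalar field and $\eta^{00} = -\eta^{11} = 1$, $\eta^{01} = \eta^{10} = 0$.

In the instant form framework it leads [41, 44] to the following equal-time commutator

$[\phi(x^0,x^1),\phi(x^0,y^1)] = -\frac{i}{4}\,\epsilon(x^1-y^1)$. (3.9)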


$^{7}$ $P^- = \int dx^-\,V(\phi)$ and $P^+ = \int dx^-\,(\partial_-\phi)^2$.





The commutator is nonvanishing, is nonlocal, and violates the microcausality principle, contrary 
to what we encounter usually in the conventional theory [23]. These objections disappear when we 
consider the theory quantized in the LF coordinates. 

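We will consider a modified FJ chiral boson model with the following Lagrangian density written
in the LF coordinates

$\mathcal{L} = (\partial_+\phi - \frac{1}{\alpha}\partial_-\phi)\,\partial_-\phi = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - \frac{1}{\alpha}(\partial_-\phi)^2$ (3.10)

where $\eta^{+-} = \eta^{-+} = 1$, $\eta^{++} = \eta^{--} = 0$ and $\alpha$ is a fixed parameter. For $\alpha = 1$ it coincides with
(3.8) in the conventional coordinates.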

The LF quantization of the scalar theory with a potential term included in it has been discussed
in Sec. (2.4). From (3.10) we derive

$H^{lf} = \frac{1}{\alpha}\int dx^-\,(\partial_-\phi)^2$,

$[\phi(\tau,x^-),\phi(\tau,y^-)] = -\frac{i}{4}\,\epsilon(x^--y^-)$. (3.11)

The LF commutator (3.11), which is nonlocal in $x^-$ and nonvanishing only on the light-cone, does
not conflict with the microcausality (Sec. 1.1 and [23]) unlike (3.9).

The Heisenberg equation of motion for the field operator is

$\partial_+\phi = \frac{1}{i}[\phi, H^{lf}] = \frac{1}{\alpha}\partial_-\phi$ (3.12)

and the Lagrange equation

$\partial_-\left[\partial_+\phi - \frac{1}{\alpha}\partial_-\phi\right] = 0$ (3.13)

is recovered.

The commutator (3.11) can be realized in momentum space through the following Fourier
transform ($\tau \equiv x^+$)

$\phi(x^+,x^-) = \frac{1}{\sqrt{2\pi}}\int dk^+\,\frac{\theta(k^+)}{\sqrt{2k^+}}\left[a(x^+,k^+)e^{-ik^+x^-} + a^\dagger(x^+,k^+)e^{ik^+x^-}\right]$, (3.14)

if the operators $a$ and $a^\dagger$ are assumed to satisfy the equal-$\tau$ canonical commutation relations, with
the nonvanishing one given by $[a(x^+,k^+),a^\dagger(x^+,p^+)] = \delta(k^+-p^+)$. On using the equation of
motion (3.12) we derive easily

$a(x^+,k^+) = e^{-ik^-x^+}a(k^+)$, $a^\dagger(x^+,k^+) = e^{ik^-x^+}a^\dagger(k^+)$, (3.15)

where we set

$k^- = \frac{1}{\alpha}k^+$, implying $2k^+k^- = \frac{2}{\alpha}(k^+)^2$. (3.16)

The dispersion relation for the free FJ chiral boson is different from that for a free scalar particle
with (finite $k^+$ but) vanishing mass, except when $|\alpha| \to \infty$.

The Fourier transform now assumes the form

$\phi(x^+,x^-) = \frac{1}{\sqrt{2\pi}}\int dk^+\,\frac{\theta(k^+)}{\sqrt{2k^+}}\left[a(k^+)e^{-ik\cdot x} + a^\dagger(k^+)e^{ik\cdot x}\right]$, (3.17)

where $k\cdot x = k^-x^+ + k^+x^- = k^+(x^- + x^+/\alpha)$ and the nonvanishing commutator satisfies
$[a(k^+),a^\dagger(p^+)] = \delta(k^+-p^+)$.
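
The dispersion relation can be illustrated with a small Python sketch which checks by finite differences that a plane wave $e^{-ik\cdot x}$ with $k^- = k^+/\alpha$ satisfies $\partial_+\phi = (1/\alpha)\partial_-\phi$. The equation of motion and the relation $k^- = k^+/\alpha$ are taken as read from (3.12) and (3.16); the function names and the step size are illustrative.

```python
import cmath


def k_minus(k_plus, alpha):
    """LF energy of the chiral boson mode, k^- = k^+/alpha."""
    return k_plus / alpha


def plane_wave(x_plus, x_minus, k_plus, alpha):
    """Positive-frequency mode exp(-i k.x) with k.x = k^- x^+ + k^+ x^-."""
    phase = k_minus(k_plus, alpha) * x_plus + k_plus * x_minus
    return cmath.exp(-1j * phase)


def equation_residual(x_plus, x_minus, k_plus, alpha, h=1e-5):
    """Central-difference estimate of d_+ phi - (1/alpha) d_- phi for the plane wave."""
    d_plus = (plane_wave(x_plus + h, x_minus, k_plus, alpha)
              - plane_wave(x_plus - h, x_minus, k_plus, alpha)) / (2.0 * h)
    d_minus = (plane_wave(x_plus, x_minus + h, k_plus, alpha)
               - plane_wave(x_plus, x_minus - h, k_plus, alpha)) / (2.0 * h)
    return d_plus - d_minus / alpha


def mass_shell(k_plus, alpha):
    """2 k^+ k^-, which equals (2/alpha) (k^+)^2 and tends to zero as |alpha| grows."""
    return 2.0 * k_plus * k_minus(k_plus, alpha)


if __name__ == "__main__":
    print(abs(equation_residual(0.4, -1.1, 1.7, 2.5)))
    print(mass_shell(2.0, 4.0), mass_shell(2.0, 4.0e6))
```

The second line of output shows how $2k^+k^-$ shrinks as $|\alpha|$ grows, which is the sense in which the massless scalar dispersion is recovered only in the limit $|\alpha| \to \infty$.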





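The components of the classical canonical energy-momentum tensor $T^{\mu\nu}$ following from the
noncovariant Lagrangian density (3.10) are found to be

$T^{++} = (\partial_-\phi)^2$, $T^{+-} = -T^{-+} = \frac{1}{\alpha}T^{++}$, $T^{--} = (\partial_+\phi - \frac{2}{\alpha}\partial_-\phi)\,\partial_+\phi$. (3.18)

The on shell conservation equations

$\partial_\mu T^{\mu +} = 2(\partial_-\phi)\,\partial_-\left[\partial_+\phi - \frac{1}{\alpha}\partial_-\phi\right] = 0$ (3.19)

may be easily checked. They allow us to define, if the surface integrals can be dropped, the
conserved translation generators $P^\pm$

$P^+ = \int dx^-\,:T^{++}: = \int dx^-\,:(\partial_-\phi)^2: = \int dk^+\,\theta(k^+)\,N(k^+)\,k^+$ (3.20)

and

$P^- = \int dx^-\,:T^{+-}: = \frac{1}{\alpha}P^+$ (3.21)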

where $N(k^+) = a^\dagger(k^+)a(k^+)$ is the number operator and : : indicates the normal ordering.
From (3.18) and in virtue of $(T^{+-} + T^{-+}) = 0$ we may derive the following relation

$\partial_+\left[x^-T^{++} + x^+T^{+-}\right] + \partial_-\left[x^-T^{-+} + x^+T^{--}\right] = 0$ (3.22)

which is valid on shell. We may hence define another conserved generator

$M = x^+P^- + \int dx^-\,x^-\,:T^{++}:$ (3.23)

The generators $M$, $P^+$, $P^-$ form a closed algebra: $[M,P^+] = -iP^+$, $[M,P^-] = -iP^-$
and $[P^+,P^-] = 0$. The operator $M$ thus generates the scale (boost) transformations on $P^\pm$
by the same amount, which leaves $P^-/P^+$ invariant. The mass operator $2P^+P^-$, however, gets
scaled and is not invariant under $M$. The usual (kinematical) Lorentz boost generator $M^{+-} =
-x^+P^- + \int dx^-\,x^-T^{++}$ has similar properties. It is, however, as seen from (3.19), not conserved
in the manifestly noncovariant model under consideration. The Lagrange equation is shown to be
form invariant under the infinitesimal transformation [41, 45] $\phi \to \phi + \epsilon\,(x^- + x^+/\alpha)\,\partial_-\phi$ generated
by $M$.

In the limit when $|\alpha| \to \infty$ we find $\phi \to \phi_R(x^-)$ while $H^{lf} \to 0$, which corresponds to the LF
Hamiltonian of free massless scalar theory, Sec. (2.6). The field satisfies: $[\phi_R(x^-),\phi_R(y^-)] =
-i\,\epsilon(x^--y^-)/4$. The limiting case is thus seen to describe a right (moving) chiral boson theory
with the Lagrangian density as given in (3.10).

An alternative form of the Lagrangian density may also be employed in our context. We recall
that in the quantization of gauge theory it is found useful (Sec. 5) to introduce an auxiliary field
$B(x)$ of canonical mass dimension two (in 3+1 dimensions) and add $(B\,\partial_\mu A^\mu + \alpha B^2)$ as the gauge-fixing
term to the Lagrangian density. In the two dimensional theory under consideration it is
also possible to follow this procedure, since the corresponding $B(x)$ field here carries the canonical
mass dimension one. The discussion parallel to the one given above may thus be based also on the
following [46, 47] Lagrangian density

$\mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi + \sqrt{2}\,B(x)(\partial_-\phi) + \frac{\alpha}{2}B(x)^2$ (3.24)

The elimination of the auxiliary field using its equation of motion leads to (3.10) and the conclusions
reached are the same.
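
The elimination of the auxiliary field can be checked pointwise with the Python sketch below. It assumes that (3.24) has the form $\partial_+\phi\,\partial_-\phi + \sqrt{2}B\,\partial_-\phi + (\alpha/2)B^2$; the kinetic term and the coefficient $\alpha/2$ are reconstructions adopted here because they reproduce (3.10), not forms confirmed by the source. The field derivatives are treated as plain numbers and all names are illustrative.

```python
import math


def lagrangian_with_auxiliary(d_plus_phi, d_minus_phi, b, alpha):
    """Assumed form of (3.24) at a single point, with field derivatives as numbers."""
    return (d_plus_phi * d_minus_phi
            + math.sqrt(2.0) * b * d_minus_phi
            + 0.5 * alpha * b * b)


def auxiliary_on_shell(d_minus_phi, alpha):
    """Solution of dL/dB = sqrt(2)*d_-phi + alpha*B = 0."""
    return -math.sqrt(2.0) * d_minus_phi / alpha


def modified_fj_lagrangian(d_plus_phi, d_minus_phi, alpha):
    """Lagrangian density (3.10): (d_+phi - d_-phi/alpha) * d_-phi."""
    return (d_plus_phi - d_minus_phi / alpha) * d_minus_phi


if __name__ == "__main__":
    dp, dm, alpha = 0.8, -1.3, 2.5  # arbitrary sample values
    b_star = auxiliary_on_shell(dm, alpha)
    print(lagrangian_with_auxiliary(dp, dm, b_star, alpha),
          modified_fj_lagrangian(dp, dm, alpha))
```

The two printed numbers agree, showing that substituting the solution of the $B$ equation of motion returns the modified FJ Lagrangian under the stated assumptions.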





We make only brief comments on other models. Siegel's [40] theory which employs

$\mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi + B(x)(\partial_0\phi - \partial_1\phi)^2$ (3.25)

is afflicted by an anomaly which is to be eliminated by the addition of a Wess-Zumino term. The
resulting theory does not describe [48] pure chiral bosons since they are coupled to gravity. In
this model the auxiliary field carries vanishing canonical dimension and, for example, a $B^2$ term
cannot be added without introducing dimensionful parameters.

The model based on the idea of implementing the chiral constraint through a linear constraint 
[42, 46], 

£ = (3.26) 

where $B_\mu$ is a Lagrange multiplier field, does not seem to exhibit physical excitations [49]. We note
that the field $B_\mu$ carries dimension one and that this is the usual procedure in the classical theory
which, however, seems to break down at the quantum level.

To summarize, the simple procedure of first separating the condensate variable, which in fact
corresponds to a gauge-fixing condition needed on the phase space in the context of the Dirac procedure,
before applying the standard procedure itself, is found to be successful in describing [32, 10] the
SSB, the phase transition in two dimensional scalar theory, the SSB of continuous symmetry in
3+1 dimensional theory, in furnishing a new proof of Coleman's theorem, and in the description
of (the tree level) Higgs mechanism. It is also found successful in showing [15, 16] transparently
and economically the vacuum structures in the SM and CSM models as will be reviewed in Sec.
5. The self-duality constraint in the interacting theory leads to the well known kink solution in
the front form theory as well. The chiral boson theory discussion becomes transparent in the context
of the modified FJ model, and the LF commutator does not conflict with microcausality.

We will next review the essentials of the LF quantization of the Dirac and Maxwell fields.


4 LF quantized Dirac field 

4.1 Anticommutators 

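On the LF there is a natural decomposition of the spinor space. The LF components [50] $\gamma^\pm$, where
$\gamma^\pm = (\gamma^0 \pm \gamma^3)/\sqrt{2}$, have the properties $(\gamma^+)^2 = (\gamma^-)^2 = 0$, $\gamma^0\gamma^+ = \gamma^-\gamma^0$, $\gamma^{+\dagger} = \gamma^-$, and
$\{\gamma^+,\gamma^-\} = 2I$. We may thus introduce the hermitian projection operators

$\Lambda^\pm = \frac{1}{2}\gamma^\mp\gamma^\pm = \frac{1}{\sqrt{2}}\gamma^0\gamma^\pm$, $(\Lambda^\pm)^2 = \Lambda^\pm$, $\Lambda^+\Lambda^- = \Lambda^-\Lambda^+ = 0$, $\gamma^0\Lambda^+ = \Lambda^-\gamma^0$. (4.1)

The corresponding $\pm$ projections of the LF Dirac spinor are $\psi_\pm = \Lambda^\pm\psi$ and
$\gamma^\pm\psi_\mp = 0$, etc. The matrix $\Sigma_3 = \Sigma^{12} = i\gamma^1\gamma^2$, $\Sigma_3^2 = I$,
which commutes with $\Lambda^\pm$, plays an important role on the LF and we note: $(\Lambda^+ + \Lambda^-) = I$,
$(\Lambda^+ - \Lambda^-) = \Sigma_3\gamma_5$, $\gamma_5\psi = \Sigma_3(\psi_+ - \psi_-)$, and $\Sigma_3\gamma^\perp\Sigma_3 = -\gamma^\perp$.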
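
These algebraic properties can be verified with explicit matrices. The Python sketch below uses the Dirac representation of the $\gamma$ matrices and $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; both are choices made only for this check, since the text deliberately avoids any explicit representation. The helper functions are hypothetical and use plain nested lists.

```python
import math

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def dagger(a):
    """Hermitian conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def same(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def off_diagonal(sigma):
    """Dirac-representation gamma^i = [[0, sigma], [-sigma, 0]]."""
    g = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g


G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = off_diagonal([[0, 1], [1, 0]])
G2 = off_diagonal([[0, -1j], [1j, 0]])
G3 = off_diagonal([[1, 0], [0, -1]])
S = 1.0 / math.sqrt(2.0)
ZERO = lin(0, I4, 0, I4)
GP = lin(S, G0, S, G3)            # gamma^+
GM = lin(S, G0, -S, G3)           # gamma^-
LP = lin(0.5, mul(GM, GP), 0, I4)  # Lambda^+
LM = lin(0.5, mul(GP, GM), 0, I4)  # Lambda^-
G5 = lin(1j, mul(mul(G0, G1), mul(G2, G3)), 0, I4)
SIGMA3 = lin(1j, mul(G1, G2), 0, I4)


def light_front_identities():
    """Evaluate the relations quoted around (4.1); every entry should be True."""
    return {
        "gamma_pm_nilpotent": same(mul(GP, GP), ZERO) and same(mul(GM, GM), ZERO),
        "anticommutator": same(lin(1, mul(GP, GM), 1, mul(GM, GP)), lin(2, I4, 0, I4)),
        "gamma_plus_dagger": same(dagger(GP), GM),
        "lambda_plus_alt": same(LP, lin(S, mul(G0, GP), 0, I4)),
        "projectors": same(mul(LP, LP), LP) and same(mul(LM, LM), LM),
        "orthogonal": same(mul(LP, LM), ZERO) and same(mul(LM, LP), ZERO),
        "gamma0_swap": same(mul(G0, LP), mul(LM, G0)),
        "complete": same(lin(1, LP, 1, LM), I4),
        "difference": same(lin(1, LP, -1, LM), mul(SIGMA3, G5)),
        "sigma3_commutes": same(mul(SIGMA3, LP), mul(LP, SIGMA3)),
        "sigma3_flips_perp": same(mul(mul(SIGMA3, G1), SIGMA3), lin(-1, G1, 0, I4)),
    }


if __name__ == "__main__":
    print(light_front_identities())
```

Since every relation is an identity of the Clifford algebra, the outcome does not depend on the representation chosen for the check.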
The action of the free Dirac field is [51] 

5 = y d^x^dx^ C where 

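$\mathcal{L} = \bar\psi\,[\,i(\gamma^+\partial_+ + \gamma^-\partial_- + \gamma^\perp\partial_\perp) - m\,]\,\psi$ (4.2)

It shows that only the component $\psi_+$ carries a kinetic term and the component $\psi_-$ is nondynamical.
The variation of the action with respect to $\psi_-^\dagger$ and $\psi_-$ leads to the constraint equation

$2i\partial_-\psi_- = (m + i\gamma^\perp\partial_\perp)\,\gamma^+\psi_+$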


(4.3) 





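and its conjugate, while for the dynamical component $\psi_+$ we obtain the equation of motion

$4\partial_+\psi_+ = -(m + i\gamma^\perp\partial_\perp)\,\gamma^-\,\frac{1}{\partial_-}\,(m + i\gamma^\perp\partial_\perp)\,\gamma^+\psi_+$, (4.4)

after eliminating the dependent component $\psi_-$. Its right hand side may be simplified to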

2(-m^ + d^d-‘-)(lld-)ip+. The canonical Hamiltonian density is easily seen to be 'hj'^ = 
with V’- being a dependent field given by the constraint equation above. It 
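is straightforward to verify that the equation of motion for the dynamical component $\psi_+$ in the
quantized theory is recovered as a Heisenberg equation of motion if we postulate the following
anticommutation relations, which are local in all the spatial coordinates:

$\{\psi_+(\tau,x^-,x^\perp),\psi_+^\dagger(\tau,y^-,y^\perp)\} = \frac{1}{\sqrt{2}}\,\Lambda^+\,\delta(x^--y^-)\,\delta^2(x^\perp-y^\perp)$,

$\{\psi_+(\tau,x^-,x^\perp),\psi_+(\tau,y^-,y^\perp)\} = 0$, $\{\psi_+^\dagger(\tau,x^-,x^\perp),\psi_+^\dagger(\tau,y^-,y^\perp)\} = 0$. (4.5)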

The same result is also derived if we follow the straightforward Dirac procedure as in the case of
the scalar theory. No first class constraint, however, arises in the present case. The scale dimension
of $\psi_+$ is clearly $[\psi_+] = 1/(L_\perp\sqrt{L_\parallel})$. It follows from (4.3) that

$\{\psi_-(\tau,x^-,x^\perp),\psi_+^\dagger(\tau,y^-,y^\perp)\} = \frac{1}{4i\sqrt{2}}\,(m + i\gamma^\perp\partial_\perp)\,\gamma^+\,\epsilon(x^--y^-)\,\delta^2(x^\perp-y^\perp)$ (4.6)
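
The simplification of the right hand side of (4.4) quoted above rests on a matrix identity that can be checked numerically. In the Python sketch below the operator $i\gamma^\perp\partial_\perp$ is replaced by $\gamma^1k_1 + \gamma^2k_2$ with a real transverse vector $k_\perp$ (an assumption appropriate for plane waves, under which $-\partial_\perp\partial_\perp \to k_\perp^2$), and the Dirac representation is used purely for convenience. The check is that $K\gamma^-K\gamma^+ = 2(m^2 + k_\perp^2)\Lambda^+$ with $K = m + \gamma^\perp\cdot k_\perp$; all names are illustrative.

```python
import math

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def off_diagonal(sigma):
    """Dirac-representation gamma^i = [[0, sigma], [-sigma, 0]]."""
    g = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g


G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = off_diagonal([[0, 1], [1, 0]])
G2 = off_diagonal([[0, -1j], [1j, 0]])
G3 = off_diagonal([[1, 0], [0, -1]])
S = 1.0 / math.sqrt(2.0)
GP = lin(S, G0, S, G3)             # gamma^+
GM = lin(S, G0, -S, G3)            # gamma^-
LP = lin(0.5, mul(GM, GP), 0, I4)  # Lambda^+


def identity_holds(m, k1, k2, tol=1e-10):
    """Check K gamma^- K gamma^+ == 2 (m^2 + k1^2 + k2^2) Lambda^+ for K = m + gamma.k."""
    k_matrix = lin(1, lin(m, I4, k1, G1), k2, G2)
    lhs = mul(mul(k_matrix, GM), mul(k_matrix, GP))
    rhs = lin(2.0 * (m * m + k1 * k1 + k2 * k2), LP, 0, I4)
    return all(abs(lhs[i][j] - rhs[i][j]) < tol for i in range(4) for j in range(4))


if __name__ == "__main__":
    print(identity_holds(1.2, 0.7, -0.4))
```

With the overall minus sign and the factor $1/\partial_-$ of (4.4) restored, and $\Lambda^+\psi_+ = \psi_+$, this gives the simplified form $2(-m^2 + \partial_\perp\partial_\perp)(1/\partial_-)\psi_+$.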

4.2 LF Spinor in momentum space and its properties 

In order to write the Fourier transform we look for the complete set of linearly independent plane
wave solutions of the free Dirac equation in the front form theory. For the massive field the signs
of $p^+$ and $p^-$ are correlated. Choosing, say, $p^+ > 0$ the independent set of plane wave solutions
are $u(p)e^{-ip\cdot x}$ and $v(p)e^{ip\cdot x}$ where the four-spinors $u(p)$ and $v(p)$ satisfy: $(m - \gamma^\mu p_\mu)u(p) = 0$ and
$(m + \gamma^\mu p_\mu)v(p) = 0$. We will make the phase convention such that $v(p) = C\gamma^{0T}u(p)^*$, the charge
conjugate of $u(p)$.

A very useful form [15, 10] of the free LF four-spinor is given by

$u^{(r)}(p) = N(p)\left[\sqrt{2}\,p^+\Lambda^+ + (m + \gamma^\perp p_\perp)\Lambda^-\right]\tilde u^{(r)}$ (4.7)

where the normalization is chosen as $N(p) = 1/(\sqrt{2}\,p^+m)^{1/2}$, with $m > 0$ and $p^+ > 0$. The
constant spinors $\tilde u^{(r)}$, which are also the spinors in the rest frame $\tilde p = (m/\sqrt{2}, m/\sqrt{2}, 0^\perp)$, satisfy
$\gamma^0\tilde u^{(r)} = \tilde u^{(r)}$, $\Sigma_3\tilde u^{(r)} = r\,\tilde u^{(r)}$ with $r = \pm$. The charge conjugate rest frame spinors satisfy

^Oyr „ while 

= iV(p) j^v^p'^A"^ -f- (m — 7'^p±)A“'j (4.8) 

We note that 75tt^^^(p;m) = r?xfr^(p;— m) and 75ufr^(p;m) = t;fr^(p; — m) indicating the 
mass reversal property of 75 upto a phase factor. Also S3t/fr^(p;m) = ru^'^Yp'^,—p^;m) and 
F3t;fr^(p;m) = — r t;fr^(p"^, — p-*~;m). We do not introduce two spinors and work only with four- 
spinors and do not also employ any explicit matrix representation. 

We recall that the LF spin operator for the massive as well as massless particles is defined
(Appendix B) by $\mathcal{J}_3 = -W^+/P^+$ where $W^\mu$ is the Pauli-Lubanski four-vector. It contains solely
the LF kinematical generators and the following useful identity can be demonstrated [15, 10]

J3(p) = e <-;V)(5 ip'+S2P^) = J 3 _ 1 {p^B.2-p^B,) 


(4.9) 





where $\sqrt{2}B_1 = (K_1 + J_2)$ and $\sqrt{2}B_2 = (K_2 - J_1)$ are the kinematical boost operators on the LF
in the standard notation. Applying it to the spin 1/2 case$^{8}$ we derive ($J_3 = \Sigma_3/2$)

J3(P) = 


J3(p)u(^^(p) = 

J3(p)t;M(p) = 

where r/2 = ±(1/2) are the projections of jf(p) on the 3-axis in the rest frame and we used 
— 7^p^) = (7 '‘'Px)S 3. The four-spinors are shown to satisfy the following orthogonality 
relations: 

u(^^(p)u(^^(p)=Srs, V^"^{p)v^‘'>(p) = -5rs, (p) = 0. (4.11) 

and the following completeness relations follow easily 

uW(p)uW(p) = i^^, ^ ^;fr)(p)^(’•)(p)= (4.12) 

r=+,- 

where ^ We also have the useful relations: (p) = p^u^'^^{p)u^^'^(p) and 






S3 


(r/2)u^’"^(p) 

-(r/2)v^'^(p 


(4.10) 


4.3 Fermion propagator 

The Fourier transform expansion of 'fp{x) over the complete set of linearly independent plane wave 
solutions constructed above may be written as 


i>{x) = 


-jL=y 



(p)e-fr.* + dt(r) (p)„(r) (p)eip.® j (4.13) 


where the 0(p'*‘) is necessarily present. For the dynamical component V’+ = it follows that 


tp+ix) = 




(4.14) 


It is straightforward to verify that the anticommutation relations (4.5) for the independent 
field operator are in fact satisfied if we assume the standard canonical anticommutators, 

with the nonvanishing ones given by: = SrsS^ip"^ - p*'^)S{p-^ — p'-^) and 

{dW(p),dtW(p')} = 5rs5(p+ -p'+)5^ip^ -p'-L). 

The $\Lambda^+$ projections of our LF spinors are by construction very simple, $u^{(r)}_+(p) =
(\sqrt{2}\,p^+/m)^{1/2}\Lambda^+\tilde u^{(r)}$, with $\tilde u^{(r)}$ the constant rest frame spinor; they are eigenstates of $\Sigma_3$ as well. This is very convenient since on the
LF the $\psi_+$ component carries the independent dynamical degrees of freedom while $\psi_-$ may be eliminated,
even in the interacting theory, making use of the constraint equation. The simplified structure of
$\psi_+$ gives rise to appreciable simplifications in the context of LF perturbation theory, compensating
to some extent for the nonlinearity of the interaction found along the longitudinal direction
$x^-$. We have better control [17], say, over recovering the manifest rotational and even Lorentz
covariance in the perturbation theory calculations if we use the LF four-spinor introduced above.
The propagator for the spinor field $\psi_+$ also takes a very simple causal form on the LF, resembling
the one of the scalar field.


®For spin-l/2 case: Jj = Sj/2, Kj = where j = 1, 2, 3. 





The free propagator for the independent component in momentum space is easily derived 
using the above Fourier transform 

(0\T{Mx)tl;lsm\o) = 

(0| [e(r)V'+yi(a:)iA+B(0) -e(-r)V'+B(0)^+A(a:)] 0^ 

= ^ [0(r)e-’* - 0(-r)e'®^] (4.15) 

where $A, B = 1,2,3,4$ label the spinor components. The only relevant differences, compared with
the case of the scalar field, are, apart from the appearance of the projection operator, the absence
of the factor $1/(2q^+)$ in the integrand and the negative sign of the second term in the fermionic
case. They, however, compensate and the standard manipulations to factor out the exponential
give rise to the factor $[\theta(q^+) + \theta(-q^+)]$ which may be interpreted as unity in the distribution theory
sense, parallel to what we find in the derivation of the scalar field propagator on the LF. Hence

< 0TO4.)^<,(0))|0 >= (4-16) 

It may also be derived by functional integral method; we do have to take care of the second class 
constraint in the measure. The fermionic propagator here contains no instantaneous term usually 
encountered when doing the old fashioned perturbation theory and the integrand factor may also 
be expressed as « -h ie)] 7®. We verify that the propagator satisfies the 

equation for the Green’s function corresponding to the equation of motion of '0+. 

The momentum space representations of the currents and the components of the energy- 
momentum tensor are derived straightforwardly and they support the usual interpretation of 
and as the number operators. For example, for the canonical Hamil- 

tonian we find 

(fx-^dx~ : - d±dj_)~tp+ : 

= ^2 j d^pd^kQ(p^)9{k '^) : 

= E/ (4.17) 
where we use = 6rsl% <Pp = and : : indicates the normal ordering. 


4.4 $\Gamma_5$ Symmetry. Chirality transformation on the LF

The $\gamma_5$ transformation [52], $\psi \to \gamma_5\psi$, on the spinor field is associated with the mass reversal in the
Dirac equation. It leaves the Dirac equation form invariant only when the mass is vanishing. On
the LF we can construct a generalized $\Gamma_5$ transformation which restores the form invariance even
for the massive field.

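Consider the covariant vector and axial current densities defined by $j^\mu = \bar\psi\gamma^\mu\psi$ and $j_5^\mu =
\bar\psi\gamma^\mu\gamma_5\psi$ respectively. The corresponding charge densities are defined on the LF by the + components
of the currents

$j^+ = \bar\psi\gamma^+\psi = \sqrt{2}\,\psi_+^\dagger\psi_+$,

$j_5^+ = \bar\psi\gamma^+\gamma_5\psi = \sqrt{2}\,\psi_+^\dagger\Sigma_3\psi_+$. (4.18)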





The momentum space representations of the charges are easily derived 

Q = J (fx^dx- : j+ -Ylf [b^^^^(p)b^^^{p) - dt(’-)(p)dM(p)] 

T 

^ J d^x^dx- : j5+ ;= ^ Jd^pe{p+) (r) [b^^’'^ip)b^^^(p) + dt(’-)(p)d(^) (p)] (4.19) 

The charges $Q$ and $Q_5$ commute with the LF Hamiltonian and are thus constants of motion. The
former counts the fermionic number while the latter counts twice the projection along the 3-axis of
the LF spin operator $\mathcal{J}_3(p)$ discussed above.

From the commutation relations of the field we derive [9]

{^+,<3} = tjj+, 

{■tp-,Q} = tp-, 

{ip+,Qs} = 75t/'+ = A+75V’+, 

= A~^^(io/-‘-dA, + rn)'y'^(j5'tp+) jt'Ys'ip-. (4.20) 

The action of the infinitesimal generators on ip is 

5qiP = {ip,ieQ} = ieip, 

TTL 

Sq^iP = {ip,kQ5} = 1675 fp, (4-21) 

where we use (4.3) and (4.4). It is well known that the infinitesimal transformation with respect to 
Q is associated with the form invariance of the Dirac equation = 0 and its conjugate 

under the global phase transformations. This symmetry gives rise to the on shell conserved Noether 
vector current j^. 

The Dirac equation is form invariant under the $\gamma_5$ (or chiral) transformations only for the
massless theory, when the axial current is also conserved at the classical level. Our discussion on
the LF in the Hamiltonian formulation indicates that the Dirac equation is also form invariant
under the following nonlocal $\Gamma_5$ transformation, defined by $\psi \to \Gamma_5\psi$,

Fs = (4.22) 

This can be demonstrated, say, if we use of the (on shell) identity 

iij‘^dfi-m)'y5 =”75 (4-23) 

The on shell conserved current associated with the Fs symmetry, which holds for both the massive 
and massless fermions, is hence given by 

= ipj^Tsip = j5^ - mtpYjsj'^ T^ip, 

d^J>^ = 0, 

(4-24) 

The chiral charge associated with the F5 symmetry coincides with Qs and the generalized chiral 
transformation is ip e^°‘^^ip. 





4.5 Helicity Operator, LF Majorana and Weyl fermions 

The Fourier transform of the self-charge conjugate Majorana spinor field satisfying, = 

‘’pMci^), follows easily from (4.13) 


ipuix) 


V2 


{'lj){x) +‘lpc 


1 

vW 


^ f d^p-^dp^ 





(p)e-^P-=^ 


(4.25) 


where bM^’'^(p) = (h^’^^(p) + <i^'’^(p))/v^ and the nonvanishing anti-commutator is given by 

The chiral or y^-projections of the LF spinor axe shown to satisfy the following properties 
( f (p; m) and (p; —m)) 


along with 


Qj) =: 7\r(p) + (7 -^Px)A""| 

^ (p) = N{p) m -4 0 for m -> 0 

^ (p) = N(p) I^V^p'^A'^ - (7-^px)A”j 

= JV(p) m -4 0 for m -> 0 


75 


(I +r75) 


(p) 


= iV(p) A -4 0 for m -4 0 




w (p) 


(I + rjs) 


etc., and we note that [J3(p)5 75] = 0. In the massless limits m -> 0, the projections 
{I T 75 )^^^^(p)> (L ± 75)1;^^^ ip) vanish. Also, for example, the : 


limit 

nonvanishing one 


(4.26) 


(4.27) 


(4 28) 

is an eigenstate of 75 and Jzip) with the eigenvalues 1 and 1/2 respectively, while the other one 

- ^>^ nt-)(p) (4.29) 

has the corresponding eigenvalues given by -1 and -1/2. The explicit discussion here shows that 
on the LF the definition of the spin operator for the massive and massless cases gets unified. 

The Helicity operator h is defined by 


- it-P [SsP'+T^-P^tVsI 

h — - ^ 

2 |P| 2|P| 

which is not the same as the LF spin operator. 

For massless fermions it is easily shown that 

h(p)uW(p) = (I)u(r)(p) 

ft(p)uW(p) = -(^)t;W(p) 


(4.30) 


(4.31) 





Experimental observations show that only the negative chirality, $(I - \gamma_5)u^{(-)}(p)/2$, neutrinos exist.
Neutrinos have helicity -1/2, antineutrinos helicity 1/2. There is no charge conjugation invariance
if neutrinos have a definite chirality. The CP transformations of these spinors can be discussed
as usual (Appendix B). The normalization factor in the massless case has to be redefined. A
massive particle does not have a Lorentz invariant helicity; in the rest frame of the particle there is
no preferred direction along which to measure the spin.

4.6 Bilocal operators 

From the anticommutators in Sec. 4.1 we may derive the (free theory) equal-$\tau$ current commutation
relations, for example, = 0. The commutators among the other components

are derived straightforwardly. They involve bilocal operators [53] of the form with the

nonlocality only along the longitudinal direction. In the context of the deep inelastic scattering
limit they are found relevant in the hadron tensor, in the explanation of Bjorken scaling,
and in the introduction of the parton model of Feynman. Similar bilocal operators appear also in
bosonic theories, for example, in the LF quantization [54] of Chern-Simons systems. We recall
(Sec. 1) that on the LF nonlocality in the longitudinal direction does not conflict with the microcausality
principle. The bilocals have also been shown useful recently, for example, in the context [55] of the
dynamics of hadrons in two dimensions and in revealing the string-like structure in QCD$_2$.


5 LF quantization of Gauge theory 

In perturbative QCD we employ, in the interaction representation, the free abelian gauge theory
propagator. It is customary on the LF to adopt the light-cone gauge® $A_- = 0$, which results in
a simplified interaction Hamiltonian. The noncovariant gauge, however, introduces undesirable
features in the theory. The rotational invariance becomes very difficult to track down, making the
comparison with the conventional theory results sometimes extremely difficult. In the frequently
employed old-fashioned perturbation theory computations it is sometimes not easy to see if the
conventional and the front form theories are really in agreement [56]. The LF quantized QCD was
recently studied [17] in covariant gauges in the context of the Dyson-Wick perturbation theory
expansion based on the LF-time ordered Wick products. Here all the relevant propagators become
causal and the rotational invariance is easily recovered when the LF spinor (4.7) introduced in
Sec. 4 is employed. The loop integrals can also be converted [18] to Euclidean space integrals
and dimensional regularization may be used.

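The Lagrangian density for the Abelian gauge theory written in LF coordinates is

    $\mathcal{L} = \frac{1}{2}\left[(F_{+-})^2 - (F_{12})^2 + 2F_{+\perp}F_{-\perp}\right] + B(\partial_+A_- + \partial_-A_+ + \partial_\perp A^\perp) + \frac{\xi}{2}B^2$    (5.1)

where $F_{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu)$. The covariant gauge-fixing is introduced by adding to the Lagrangian
the linear gauge-fixing term $B\,\partial_\mu A^\mu + (\xi/2)B^2$, where $B$ is the Nakanishi-Lautrup auxiliary field
and $\xi$ is a parameter. The canonical momenta are $\pi^+ = 0$, $\pi_B = 0$, $\pi^\perp = F_{-\perp}$, $\pi^- = F_{+-} + B$
and the canonical Hamiltonian density is found to be

    $\mathcal{H}_c = \frac{1}{2}(\pi^-)^2 + \frac{1}{2}(F_{12})^2 - A_+(\partial_-\pi^- + \partial_\perp\pi^\perp - 2\partial_-B) - B(\pi^- + \partial_\perp A^\perp) + \frac{1}{2}(1 - \xi)B^2$    (5.2)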
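
The passage from the Lagrangian density to the canonical Hamiltonian density can be sketched as a Legendre transform. The lines below are a consistency check written for this purpose, not the author's own derivation. They assume $\mathcal{L} = \frac{1}{2}[(F_{+-})^2 - (F_{12})^2 + 2F_{+\perp}F_{-\perp}] + B(\partial_+A_- + \partial_-A_+ + \partial_\perp A^\perp) + \frac{\xi}{2}B^2$, the momenta $\pi^- = F_{+-} + B$ and $\pi^\perp = F_{-\perp}$, and that total derivatives may be dropped (indicated by $\simeq$).

```latex
% Velocity elimination: \partial_+ A_- = \pi^- - B + \partial_- A_+ .
% \partial_+ A_\perp enters only through F_{+\perp} = \partial_+ A_\perp - \partial_\perp A_+ .
\begin{align*}
\mathcal{H}_c &= \pi^-\,\partial_+A_- + \pi^\perp\,\partial_+A_\perp - \mathcal{L} \\
 &= \pi^-(\pi^- - B + \partial_-A_+) + \pi^\perp\,\partial_\perp A_+
    - \tfrac12(\pi^- - B)^2 + \tfrac12 (F_{12})^2 \\
 &\quad - B(\pi^- - B + 2\,\partial_-A_+ + \partial_\perp A^\perp) - \tfrac{\xi}{2}B^2 \\
 &= \tfrac12(\pi^-)^2 + \tfrac12(F_{12})^2
    + (\pi^- - 2B)\,\partial_-A_+ + \pi^\perp\,\partial_\perp A_+
    - B(\pi^- + \partial_\perp A^\perp) + \tfrac12(1-\xi)B^2 \\
 &\simeq \tfrac12(\pi^-)^2 + \tfrac12(F_{12})^2
    - A_+(\partial_-\pi^- + \partial_\perp\pi^\perp - 2\,\partial_-B)
    - B(\pi^- + \partial_\perp A^\perp) + \tfrac12(1-\xi)B^2 .
\end{align*}
```

In this form $A_+$ multiplies the combination $\partial_-\pi^- + \partial_\perp\pi^\perp - 2\partial_-B$, which is why the persistency of $\pi^+ \approx 0$ produces exactly that combination as a secondary constraint.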

Following the Dirac procedure, the primary constraints are $\pi^+ \approx 0$, $\pi_B \approx 0$ and $\eta^\perp \equiv \pi^\perp - \partial_-A_\perp +
\partial_\perp A_- \approx 0$, where $\perp = 1, 2$ and $\approx$ stands for the weak equality relation. We now require the persistency
in $\tau$ of these constraints employing the preliminary Hamiltonian, which is obtained by adding to
the canonical Hamiltonian the primary constraints multiplied by the Lagrange multiplier fields. We

®See the discussion below on the LF quantized two dimensional SM, where this gauge is not convenient to employ
if we are looking for nonperturbative effects in the theory.





assume the standard Poisson brackets for the dynamical variables in the computation for obtaining
Hamilton's equations of motion. We are led to the following two secondary constraints

    $\Phi \equiv \partial_-\pi^- + \partial_\perp\pi^\perp - 2\partial_-B \approx 0,$

    $\Psi \equiv \pi^- + 2\partial_-A_+ + \partial_\perp A^\perp \approx 0.$    (5.3)

The Hamiltonian is next enlarged by including these additional constraints as well. The procedure
is repeated. No more constraints are seen to arise; we now go over from the standard Poisson
brackets to the Dirac brackets, such that inside them we are able to substitute the above constraints
as strong equalities. The equal-$\tau$ Dirac bracket $\{f(x), g(y)\}_D$ which carries this property is
constructed straightforwardly. Hamilton's equations now employ the Dirac brackets and the phase
space constraints $\pi^+ = 0$, $\pi_B = 0$, $\eta^\perp = 0$, $\Phi = 0$, and $\Psi = 0$ then effectively reduce the (extended)
Hamiltonian. In the covariant Feynman gauge with $\xi = 1$ the free Hamiltonian takes the simple
form

    $H_0^{LF} = -\frac{1}{2}\int d^2x^\perp dx^-\, g^{\mu\nu} A_\mu\, \partial^\perp\partial_\perp A_\nu$    (5.4)

The theory is canonically quantized through the correspondence $i\{f(x), g(y)\}_D \to [f(x), g(y)]$, the
commutator among the corresponding operators.

The equal-$\tau$ commutators of the gauge field are found to be

    $[A_\mu(x), A_\nu(y)] = -i g_{\mu\nu} K(x, y)$    (5.5)

where $K(x,y) = -(1/4)\,\epsilon(x^- - y^-)\,\delta^2(x^\perp - y^\perp)$ is nonlocal in the longitudinal coordinate. The
transverse components of the gauge field have the physical LF commutators $[A_\perp(x), A_{\perp'}(y)] =
i\delta_{\perp\perp'}K(x,y)$, while for the $\pm$ components only the mixed commutator is nonvanishing,
$[A_+(x), A_-(y)] = -iK(x,y)$; it has a negative sign, which indicates the presence of unphysical
degrees of freedom in Feynman gauge. For $\xi \neq 1$ the commutator, for example, of $A_\pm$ with $A_\pm$ is
found to be nonvanishing. We note that the dimension of the gauge field is $[A_\mu] = 1/L_\perp$.

From the discussion analogous to that given in Sec. (2.6) for the scalar field it is clear,
from the primary constraints, e.g., $\pi^+ \approx 0$ in the discussion here, that there are also first class
constraints present in the gauge theory. They may be taken care of as in the case of the scalar
theory. In the context of perturbation theory we may possibly ignore the zero-longitudinal-mode
of the components of the gauge field. However, when dealing with nonperturbative effects they
may not be ignored. For example, in the discussion of the (nonperturbative) vacuum structure of
the completely soluble QED$_2$ (SM) theory the zero-momentum mode of $A_-$ plays a crucial role
together with the bosonic condensate variable (Sec. 6).

The Heisenberg equations of motion lead to $\Box A_\mu = 0$ for all the components, and consequently
the Fourier transform of the free gauge field over the complete set of plane wave solutions takes
the following form on the LF

A>^{x) = I + (5.6) 

where $\lambda = -, +, 1, 2$ labels the set of four linearly independent polarization four-vectors.

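In the front form theory the two transverse (physical) polarization vectors are space-like as
usual while^^ the other two are null four-vectors. For a fixed $k^\mu = (k^0, \vec{k})$, where $k^0 = |\vec{k}|$, we may
construct them as follows

    $\epsilon^{(+)} = (1, \vec{k}/k^0)/\sqrt{2}$, $\epsilon^{(-)} = (1, -\vec{k}/k^0)/\sqrt{2}$, $\epsilon^{(1)} = (0, \vec{\epsilon}(k;1))$, $\epsilon^{(2)} = (0, \vec{\epsilon}(k;2))$.    (5.7)

Here the $(0, 1, 2, 3)$ components are specified for convenience while $\vec{\epsilon}(k;1)$, $\vec{\epsilon}(k;2)$ and $\vec{k}/|\vec{k}|$ constitute
the usual orthonormal set of 3-vectors with the associated completeness relation. The polarization
vectors are orthonormal: $g_{\mu\nu}\,\epsilon^{(\lambda)\mu}(k)\,\epsilon^{(\sigma)\nu}(k) = g^{\lambda\sigma}$, and satisfy the completeness relation:
$g_{\lambda\sigma}\,\epsilon^{(\lambda)}_\mu(k)\,\epsilon^{(\sigma)}_\nu(k) = g_{\mu\nu}$.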
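
The orthonormality and completeness statements can be checked numerically. The sketch below is illustrative only: it assumes the four vectors are $(1, \pm\vec{k}/k^0)/\sqrt{2}$ and $(0, \vec{\epsilon}(k;1))$, $(0, \vec{\epsilon}(k;2))$, that the label metric has the only nonzero entries $g^{+-} = g^{-+} = 1$, $g^{11} = g^{22} = -1$, and it builds the two transverse 3-vectors with a cross-product construction chosen here for convenience. All function names are hypothetical.

```python
import math

# Label metric g^{lambda sigma} for labels (+, -, 1, 2); unlisted entries are zero.
# Its inverse g_{lambda sigma} has the same numerical entries.
LABEL_METRIC = {("+", "-"): 1.0, ("-", "+"): 1.0, ("1", "1"): -1.0, ("2", "2"): -1.0}
LABELS = ("+", "-", "1", "2")
# Space-time metric diag(1, -1, -1, -1) in the (0, 1, 2, 3) components.
METRIC = (1.0, -1.0, -1.0, -1.0)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def polarization_vectors(k3):
    """Contravariant polarization four-vectors for a massless momentum with 3-vector k3."""
    k0 = math.sqrt(sum(c * c for c in k3))
    n = tuple(c / k0 for c in k3)
    # Helper axis that is never parallel to n (an arbitrary choice made for this sketch).
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = cross(n, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = tuple(c / norm for c in e1)
    e2 = cross(n, e1)
    s = 1.0 / math.sqrt(2.0)
    return {
        "+": (s,) + tuple(s * c for c in n),
        "-": (s,) + tuple(-s * c for c in n),
        "1": (0.0,) + e1,
        "2": (0.0,) + e2,
    }


def minkowski_dot(a, b):
    """g_{mu nu} a^mu b^nu with the diagonal metric above."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def max_violation(k3):
    """Largest deviation from orthonormality and completeness for the momentum k3."""
    eps = polarization_vectors(k3)
    worst = 0.0
    # Orthonormality: eps^(l) . eps^(s) should equal g^{l s}.
    for l in LABELS:
        for s in LABELS:
            target = LABEL_METRIC.get((l, s), 0.0)
            worst = max(worst, abs(minkowski_dot(eps[l], eps[s]) - target))
    # Completeness: sum over labels of g_{l s} eps^{(l) mu} eps^{(s) nu} should equal g^{mu nu}.
    for mu in range(4):
        for nu in range(4):
            total = sum(g * eps[l][mu] * eps[s][nu] for (l, s), g in LABEL_METRIC.items())
            target = METRIC[mu] if mu == nu else 0.0
            worst = max(worst, abs(total - target))
    return worst
```

Calling `max_violation((1.0, 2.0, 3.0))` should return a value at the level of floating-point rounding; the two null vectors contribute only through their mixed product, which is the numerical counterpart of $g^{+-} = 1$.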

$\epsilon^{(-)}(k)$ is called the dual of $\epsilon^{(+)}(k)$. Such a pair of null vectors is employed also in the well known ML
prescription in the light-cone gauge and in the context of CNPA [20, 21].





The field commutation relations for the gauge field found above are shown to be satisfied if
we assume, parallel to the discussion in the fermionic case, the canonical commutation relations
$[a_{(\lambda)}(k), a^\dagger_{(\sigma)}(k')] = -g_{\lambda\sigma}\,\delta(k^+ - k'^+)\,\delta^2(k^\perp - k'^\perp)$. We note that the operators
$a_{(0)} = (a_{(+)} + a_{(-)})/\sqrt{2}$ and $a_{(3)} = (a_{(+)} - a_{(-)})/\sqrt{2}$ obey the usual canonical commutation relations
except that in the case of $a_{(0)}$ a negative sign is obtained. The discussion of the Gupta-Bleuler
consistency condition then becomes parallel to that in the conventional equal-time treatment of
the theory.
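
The sign statement for $a_{(0)}$ can be checked in a few lines. The sketch assumes the commutator $[a_{(\lambda)}(k), a^\dagger_{(\sigma)}(k')] = -g_{\lambda\sigma}\,\delta^3(k - k')$ with the only nonvanishing label-metric entries $g_{+-} = g_{-+} = 1$, $g_{11} = g_{22} = -1$; the shorthand $\delta^3(k - k')$ for $\delta(k^+ - k'^+)\,\delta^2(k^\perp - k'^\perp)$ is introduced here only for brevity.

```latex
% The diagonal commutators [a_{(\pm)}, a^\dagger_{(\pm)}] vanish because g_{++} = g_{--} = 0.
\begin{align*}
[a_{(0)}(k), a^\dagger_{(0)}(k')]
  &= \tfrac12\left([a_{(+)}, a^\dagger_{(-)}] + [a_{(-)}, a^\dagger_{(+)}]\right)
   = -\tfrac12\,(g_{+-} + g_{-+})\,\delta^3(k-k') = -\delta^3(k-k'), \\
[a_{(3)}(k), a^\dagger_{(3)}(k')]
  &= -\tfrac12\left([a_{(+)}, a^\dagger_{(-)}] + [a_{(-)}, a^\dagger_{(+)}]\right)
   = +\delta^3(k-k'), \\
[a_{(0)}(k), a^\dagger_{(3)}(k')]
  &= \tfrac12\left([a_{(-)}, a^\dagger_{(+)}] - [a_{(+)}, a^\dagger_{(-)}]\right) = 0 .
\end{align*}
```

The combination $a_{(0)}$ therefore carries the negative-norm sign familiar from the time-like photon of the equal-time Gupta-Bleuler treatment, while $a_{(3)}$ behaves like an ordinary oscillator.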

The Feynman gauge free gauge field propagator on the LF can be derived straightforwardly 

    $\langle 0|T(A_\mu(x)A_\nu(0))|0\rangle = \frac{i}{(2\pi)^4}\int d^4k\, e^{-ik\cdot x}\,\frac{-g_{\mu\nu}}{k^2 + i\epsilon}$    (5.8)

The momentum space representations of the components of the energy-momentum tensor are
straightforward to derive as in the fermionic case. The canonical Hamiltonian, for example, gets
contributions from the physical transversely polarized photons as well as from the longitudinally
polarized ones. The Gupta-Bleuler consistency condition is required [17] to be imposed in order
to define the physical Hilbert space.

The computations done [17], employing the covariant gauge on the LF, for the electron self-energy,
electron-muon scattering, and Compton scattering demonstrate complete agreement
with the results known in the conventional equal-time theory. We find that on the LF the tree
level seagull term dominates the (classical) Thomson formula for the scattering at vanishingly small
photon energies. It is suggestive that on the LF the (conventional theory) semi-classical approximation
may reveal itself already at the tree level (after having removed the constraints). We will
consider the LF quantized QCD after the study in the front form theory of the nonperturbative
vacuum structures in some two dimensional completely solvable gauge theories.


6 Vacuum Structures in Schwinger and Chiral Schwinger
Models

It is pertinent to study two dimensional gauge theories on the LF. Models like the SM and CSM
can be solved completely. They may give clues, for example, on the accessibility or not, in the fully
interacting theory, of certain gauge-fixing conditions found practical in the context of perturbation
theory. The study [15] of the SM, for example, shows that the light-cone gauge, $A_- = 0$, is not
convenient on the LF; it would subtract out the gauge invariant information from the theory itself,
which is needed for describing the nonperturbative vacuum structure in the theory.

The models mentioned above are known to have non-trivial vacuum structure, a nonperturbative
effect, from the studies [57] in the conventional framework. Their study would indicate
how to look for such and other nonperturbative effects in the LF quantized QCD in 3 + 1
dimensions.

The massless QED$_2$ or SM is described by

L = ( 6 . 1 ) 

Its exact solvability [38] derives from the remarkable property of one-dimensional fermion systems,
viz, that they can equivalently be described in terms of canonical one-dimensional boson fields.
Some of the correspondences in the abelian bosonization are $\bar\psi\psi = K:\cos 2\sqrt{\pi}\phi:$, $\bar\psi\gamma_5\psi =
K:\sin 2\sqrt{\pi}\phi:$, etc., where $\phi$ is a
bosonic scalar field and $K$ is a constant. The fermionic condensate $\langle\bar\psi\psi\rangle_0$, for example, may
then be expressed in terms of the value of the bosonic condensate. The bosonized theory can
also be constructed with the use of the functional integral method. The original fermionic and
the bosonized theories are equivalent in the sense that they have the same current commutation
relations and the energy-momentum tensor is the same when expressed in terms of the currents.





For studying nonperturbative vacuum structure the bosonized theory is convenient to use. The
bosonized version of QED$_2$ is found to be

£ = ( 6 . 2 ) 

where $g = e/\sqrt{\pi}$. It carries in it all the symmetries of the original fermionic model including
the information on the dynamical mass generation [38] for the gauge field. Under the U(1) gauge
field transformation the scalar field is invariant (or neutral) while under the chiral transformations,
$U_5(1)$, in view of the correspondences above, the field suffers a translation by a constant.

Following the procedure of Secs. 2 and 3 we make the separation of the condensate variable
in the scalar field: $\phi(\tau, x^-) = \omega(\tau) + \varphi(\tau, x^-)$. The chiral transformation is now defined by:
$\omega \to \omega + \text{const.}$, $\varphi \to \varphi$, and $A_\mu \to A_\mu$, so that the boundary conditions at infinity on the
quantum fluctuation field $\varphi$ are kept unaltered under these transformations and the mathematical
framework may be considered well posed. The bosonized Lagrangian written in the LF coordinates
reads

    $L = \int dx^- \left[\dot\varphi\varphi' + g(A_+\varphi' - A_-\dot\varphi) + \frac{1}{2}(\dot A_- - A_+')^2\right] - g\,\dot\omega\,h(\tau)$    (6.3)

where $h(\tau) = \int dx^- A_-(\tau, x^-)$ and an overdot (a prime) indicates the partial derivative with respect
to $\tau$ ($x^-$). We work in the continuum and require (on physical considerations) that the relevant
fields satisfy the necessary conditions such that their Fourier transforms with respect to the spatial
longitudinal coordinate $x^-$ exist.

The last term in the Lagrangian density shows that the light-cone gauge, $A_- = 0$, employed
often in perturbation theory computations, may not be appropriate to use in the fully interacting
theory^^, if we are seeking to study also the nonperturbative effects in the theory. Also the zero-momentum
mode of $A_-$ is a gauge invariant quantity under the boundary conditions assumed.
We may, of course, impose different boundary conditions on the fields or add new ingredients in
the theory so as to compensate for the elimination of the physical dynamical variable $h(\tau)$. A
convenient alternative is the local gauge-fixing condition $\partial_-A_- = 0$, which is accessible on the
phase space. We remove only the nonzero modes of $A_-$.

Following the Dirac method to eliminate the constraints in the front form theory, only three
linearly independent operators survive: the condensate $\omega$; $h(\tau)$, the zero-momentum mode of $A_-$,
which is canonically conjugate to $\omega$ as well; and $\varphi$, which satisfies the LF commutator while it commutes
with the others. The LF Hamiltonian contains in it only the field $\varphi$. The Hilbert space can thus be described
in two different fashions. Selecting $\varphi$ and $h$ as forming the complete set of mutually commuting
operators leads to the chiral vacua while selecting $\varphi$ together with $\omega$ leads to the description built
on the condensate or $\theta$-vacua. In QED$_2$ the $\omega$ is not a background field; rather it is shown [15]
to be an operator and its eigenvalues, with continuous spectrum, label the condensate vacua of the
theory. The cluster decomposition property requirement [23] indicates the preference in favor of
the condensate vacua.

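The other related gauge theory model is the chiral QED$_2$ or CSM described by

    $\mathcal{L} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \bar\psi_R\, i\gamma^\mu\partial_\mu\psi_R + \bar\psi_L\gamma^\mu(i\partial_\mu + 2e\sqrt{\pi}A_\mu)\psi_L$,    (6.4)

where^^ $\psi = \psi_R + \psi_L$ is a two-component spinor field and $A_\mu$ is the abelian gauge field.
The classical Lagrangian is invariant under the local U(1) gauge
transformations $A_\mu \to A_\mu + \partial_\mu\alpha/(2\sqrt{\pi}e)$, $\psi \to [P_R + e^{i\alpha}P_L]\psi$ and under the global $U(1)_5$ chiral
transformations $\psi \to \exp(i\gamma_5\alpha)\psi$.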

^^Similar considerations are clearly pertinent to 3 + 1 dimensional QCD as well.

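^^In two dimensions the $\pm$ projections of the spinor coincide with the chiral or $\gamma_5$ projections. We define $\gamma^0 = \sigma_1$,
$\gamma^1 = i\sigma_2$, $\gamma_5 = \gamma^0\gamma^1 = -\sigma_3$, $\Lambda^- = \gamma^0\gamma^-/\sqrt{2} = (1 - \gamma_5)/2 = P_L$, $\Lambda^+ = \gamma^0\gamma^+/\sqrt{2} = (1 + \gamma_5)/2 = P_R$,
$x^\mu$: $(x^+ \equiv \tau, x^- \equiv x)$ with $\sqrt{2}x^\pm = \sqrt{2}x_\mp = (x^0 \pm x^1)$, $A^\pm = A_\mp = (A^0 \pm A^1)/\sqrt{2}$, $\psi_{L,R} = P_{L,R}\psi$.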
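
The statement that the $\pm$ projections coincide with the chiral projections in two dimensions can be checked with explicit 2x2 matrices. The sketch assumes the representation $\gamma^0 = \sigma_1$, $\gamma^1 = i\sigma_2$, $\gamma_5 = \gamma^0\gamma^1$ and $\gamma^\pm = (\gamma^0 \pm \gamma^1)/\sqrt{2}$, as read from the (partly illegible) footnote; the helper names are hypothetical.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
GAMMA0 = [[0, 1], [1, 0]]    # sigma_1
GAMMA1 = [[0, 1], [-1, 0]]   # i * sigma_2


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


GAMMA5 = matmul(GAMMA0, GAMMA1)  # evaluates to [[-1, 0], [0, 1]], i.e. -sigma_3


def light_front_projector(sign):
    """Lambda^{+/-} = gamma^0 gamma^{+/-} / sqrt(2), with gamma^{+/-} = (gamma^0 +/- gamma^1)/sqrt(2)."""
    root2 = math.sqrt(2.0)
    gamma_pm = [[(GAMMA0[i][j] + sign * GAMMA1[i][j]) / root2 for j in range(2)] for i in range(2)]
    return [[entry / root2 for entry in row] for row in matmul(GAMMA0, gamma_pm)]


def chiral_projector(sign):
    """(1 +/- gamma_5) / 2."""
    return [[(IDENTITY[i][j] + sign * GAMMA5[i][j]) / 2 for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))
```

With these definitions `close(light_front_projector(+1), chiral_projector(+1))` and the corresponding statement for the minus sign both hold, which is the identification $\Lambda^+ = P_R$, $\Lambda^- = P_L$.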





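The bosonized version is convenient to study the vacuum structure; it is shown to be given by

    $S = \int d^2x \left[-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + eA_\nu(\eta^{\mu\nu} - \epsilon^{\mu\nu})\partial_\mu\phi + \frac{1}{2}ae^2A_\mu A^\mu\right]$    (6.5)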

Here the explicit mass term for the gauge field parametrized by the constant parameter $a$ represents
a regularization ambiguity [58] and the breakdown of U(1) gauge symmetry. The model has received
much attention since Jackiw and Rajaraman [58] pointed out that, despite the gauge anomaly, the
theory can be shown to be unitary and consistently quantized. In the LF coordinates it reads as

    $S = \int d^2x \left[\dot\varphi\varphi' + \frac{1}{2}(\dot A_- - A_+')^2 + ae^2\left[A_+ + \frac{2}{ae}(\dot\omega + \dot\varphi)\right]A_-\right]$.    (6.6)

We note now that $A_+$ appears in the action as an auxiliary field, without a kinetic term. It is
clear that the condensate variable may thus be subtracted out from the theory using the frequently
adopted procedure of field redefinition [16] on it: $A_+ \to A_+ - 2\dot\omega/(ae)$, obtaining thereby

    $\mathcal{L}_{CSM} = \dot\varphi\varphi' + \frac{1}{2}(\dot A_- - A_+')^2 + 2e\dot\varphi A_- + ae^2A_+A_-$,    (6.7)

which signals the emergence of a different structure of the Hilbert space compared to that of the
SM.

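The Lagrange equations in the CSM follow to be

    $\partial_+\partial_-\varphi = -e\,\partial_+A_-$,
    $\partial_+\partial_+A_- - \partial_+\partial_-A_+ = ae^2A_+ + 2e\,\partial_+\varphi$,
    $\partial_-\partial_-A_+ - \partial_+\partial_-A_- = ae^2A_-$,    (6.8)

and for $a \neq 1$ they lead to:

    $\left[\Box + \frac{e^2a^2}{a-1}\right]E = 0$,    $\partial_+G(\tau, x) = 0$    (6.9)

where $E = (\partial_+A_- - \partial_-A_+)$ and $G = (E - ae\varphi)$. Both the massive and massless scalar excitations
are present in the theory and the tachyons would be absent in the spectrum if $a > 1$, the case
considered in this paper. We will confirm in the Hamiltonian framework below that the $E$ and $G$
represent, in fact, the two independent field operators on the LF phase space.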
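
The origin of the massive and massless excitations can be traced in a few lines. The sketch below is a reconstruction, not a quotation: it assumes the Lagrange equations in the first-order form $\partial_+\partial_-\varphi = -e\,\partial_+A_-$, $\partial_+E = ae^2A_+ + 2e\,\partial_+\varphi$, $\partial_-E = -ae^2A_-$, with $E = \partial_+A_- - \partial_-A_+$, $G = E - ae\varphi$ and $\Box = 2\partial_+\partial_-$ (the normalization $\sqrt{2}x^\pm = x^0 \pm x^1$).

```latex
\begin{align*}
\partial_-\partial_+E &= ae^2\,\partial_-A_+ + 2e\,\partial_+\partial_-\varphi
   = ae^2(\partial_+A_- - E) - 2e^2\,\partial_+A_-
   && (\partial_-A_+ = \partial_+A_- - E) \\
 &= (a-2)\,e^2\,\partial_+A_- - ae^2E
   = -\frac{a-2}{a}\,\partial_+\partial_-E - ae^2E
   && (A_- = -\partial_-E/(ae^2)) \\
\Rightarrow\quad \frac{2(a-1)}{a}\,\partial_+\partial_-E &= -ae^2E
 \quad\Longrightarrow\quad
 \left[\,2\partial_+\partial_- + \frac{a^2e^2}{a-1}\right]E = 0 , \\
\partial_+\partial_-G &= \partial_+\partial_-E - ae\,\partial_+\partial_-\varphi
   = -ae^2\,\partial_+A_- + ae^2\,\partial_+A_- = 0 .
\end{align*}
```

The squared mass $a^2e^2/(a-1)$ is positive only for $a > 1$, which is the no-tachyon condition, and $\partial_+G = 0$ follows from the last line once $\partial_-$ is inverted under the assumed boundary conditions.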

The Dirac procedure as applied to the very simple action (6.7) of the CSM is straightforward.
The canonical momenta are $\pi^+ \approx 0$, $\pi^- \equiv E = \dot A_- - A_+'$, $\pi_\varphi = \varphi' + 2eA_-$, which result in two
primary weak constraints $\pi^+ \approx 0$ and $\Omega_1 \equiv (\pi_\varphi - \varphi' - 2eA_-) \approx 0$. A secondary constraint
$\Omega_2 \equiv \partial_-E + ae^2A_- \approx 0$ is shown to emerge when we require the $\tau$ independence (persistency) of
$\pi^+ \approx 0$ employing the preliminary Hamiltonian

    $H' = H_c^{lf} + \int dx\, u_+\pi^+ + \int dx\, u_1\Omega_1$,    (6.10)

where $u_+$ and $u_1$ are the Lagrange multiplier fields and $H_c^{lf}$ is the canonical Hamiltonian

    $H_c^{lf} = \int dx \left[\frac{1}{2}E^2 + EA_+' - ae^2A_+A_-\right]$,    (6.11)

and we assume initially the standard equal-$\tau$ Poisson brackets: $\{E^\mu(\tau, x^-), A_\nu(\tau, y^-)\} =
-\delta^\mu_\nu\,\delta(x^- - y^-)$, $\{\pi_\varphi(\tau, x^-), \varphi(\tau, y^-)\} = -\delta(x^- - y^-)$ etc. The persistency requirement for $\Omega_1$
results in an equation for determining $u_1$. The procedure is repeated with the following extended
Hamiltonian which includes in it also the secondary constraint

    $H_e^{lf} = H_c^{lf} + \int dx\, u_+\pi^+ + \int dx\, u_1\Omega_1 + \int dx\, u_2\Omega_2$.    (6.12)



No more secondary constraints are seen to arise; we are left with the persistency conditions which
determine the multiplier fields $u_1$ and $u_2$ while $u_+$ remains undetermined. We also find^^ $(C)_{ij} =
\{\Omega_i, \Omega_j\} = D_{ij}\,(-2\partial_x\delta(x - y))$ where $i, j = 1, 2$ and $D_{11} = 1$, $D_{22} = ae^2$, $D_{12} = D_{21} = -e$,
and that $\pi^+$ has vanishing brackets with $\Omega_{1,2}$. The $\pi^+ \approx 0$ is a first class weak constraint while $\Omega_1$
and $\Omega_2$, which do not depend on $A_+$ or $\pi^+$, are second class ones.

We go over from the Poisson bracket to the Dirac bracket $\{\,,\}_D$ constructed in relation to the
pair $\Omega_1 \approx 0$ and $\Omega_2 \approx 0$:

    $\{f(x), g(y)\}_D = \{f(x), g(y)\} - \int\!\!\int du\,dv\,\{f(x), \Omega_i(u)\}\,(C^{-1}(u,v))_{ij}\,\{\Omega_j(v), g(y)\}$.    (6.13)

Here $C^{-1}$ is the inverse of $C$ and we find $(C^{-1}(x,y))_{ij} = B_{ij}K(x,y)$ with $B_{11} = a/(a - 1)$, $B_{22} =
1/[(a - 1)e^2]$, $B_{12} = B_{21} = 1/[(a - 1)e]$, and $K(x,y) = -\epsilon(x - y)/4$. Some of the Dirac brackets
are $\{\varphi, \varphi\}_D = B_{11}K(x,y)$; $\{\varphi, E\}_D = eB_{11}K(x,y)$; $\{E, E\}_D = ae^2B_{11}K(x,y)$; $\{\varphi, A_-\}_D =
-B_{12}\,\delta(x - y)/2$; $\{A_-, E\}_D = B_{11}\,\delta(x - y)/2$; $\ldots = B_{12}\,\partial_x\delta(x - y)/2$ and the only
nonvanishing one involving $A_+$ or $\pi^+$ is $\{A_+, \pi^+\}_D = \delta(x - y)$.
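
The relation between the constraint matrix and its inverse can be verified with exact arithmetic. The sketch below assumes $D_{11} = 1$, $D_{12} = D_{21} = -e$, $D_{22} = ae^2$ (the last value is illegible in the source and is taken here as an assumption, which the check then supports) together with $B_{11} = a/(a-1)$, $B_{12} = B_{21} = 1/[(a-1)e]$, $B_{22} = 1/[(a-1)e^2]$. It tests only the 2x2 matrix part; the coordinate part relies on $-2\partial_x\delta(x-z)$ convolved with $K(z,y) = -\epsilon(z-y)/4$ giving $\delta(x-y)$. Function names are hypothetical.

```python
from fractions import Fraction


def constraint_matrices(a, e):
    """Return (D, B) as 2x2 lists of exact fractions, for a != 1 and e != 0."""
    a, e = Fraction(a), Fraction(e)
    d = [[Fraction(1), -e], [-e, a * e * e]]
    b = [[a / (a - 1), 1 / ((a - 1) * e)],
         [1 / ((a - 1) * e), 1 / ((a - 1) * e * e)]]
    return d, b


def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def is_inverse_pair(a, e):
    """True when D B = B D = identity for the given parameter values."""
    d, b = constraint_matrices(a, e)
    identity = [[1, 0], [0, 1]]
    return matmul2(d, b) == identity and matmul2(b, d) == identity
```

For instance `is_inverse_pair(3, 2)` evaluates the product for $a = 3$, $e = 2$. The determinant of $D$ is $(a-1)e^2$, so the construction fails at $a = 1$, in line with the restriction $a \neq 1$.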

The equations of motion employ now the Dirac brackets and inside them, in view of their very
construction, we may set $\Omega_1 = 0$ and $\Omega_2 = 0$ as strong relations. The Hamiltonian is therefore
effectively given by $H_e$ with the terms involving the multipliers $u_1$ and $u_2$ dropped. The multiplier
$u_+$ is not determined since the constraint $\pi^+ \approx 0$ continues to be first class even when the above
Dirac bracket is employed. The variables $\pi_\varphi$ and $A_-$ are then removed from the theory leaving
behind $\varphi$, $E$, $A_+$, and $\pi^+$ as the remaining independent variables. The canonical Hamiltonian
density reduces to $\mathcal{H}_c^{lf} = E^2/2 + \partial_-(A_+E)$ while $\dot A_+ = \{A_+, H_e^{lf}\}_D = u_+$. The surface term in
the canonical LF Hamiltonian may be ignored if, say, $E (= F_{+-})$ vanishes at infinity. The variables
$\pi^+$ and $A_+$ are then seen to describe a decoupled (from $\varphi$ and $E$) free theory and we may hence
drop these variables as well. The effective LF Hamiltonian thus takes the simple form

    $H^{lf}_{CSM} = \frac{1}{2}\int dx\, E^2$,    (6.14)

which is to be contrasted with the one found in the conventional treatment [57]. $E$ and $G$ (or
$E$ and $\varphi$) are now the independent variables on the phase space and the Lagrange equations are
verified to be recovered for them, which assures us of the selfconsistency [25]. We stress that in
our discussion we do not employ any gauge-fixing. The same result for the Hamiltonian could
be alternatively obtained^^, however, if we did introduce the gauge-fixing constraint $A_+ \approx 0$ and
made a further modification on $\{\,,\}_D$ in order to implement $A_+ \approx 0$, $\pi^+ \approx 0$ as well. That it is
accessible on the phase space to take care of the remaining first class constraint, but not in the
bosonized Lagrangian, follows from Hamilton's equations of motion. We recall [15] that in the SM
$\varphi$, $\omega$, and $\pi_\omega = (e/\sqrt{\pi})\int dx\, A_-$ were shown to be the independent operators and that the matter
field $\varphi$ appeared instead in the LF Hamiltonian. The canonical quantization is performed via the
correspondence $i\{f, g\}_D \to [f, g]$ and we find the following equal-$\tau$ commutators

    $[E(x), E(y)] = iK(x,y)\,a^2e^2/(a - 1)$,

    $[G(x), E(y)] = 0$,

    $[G(x), G(y)] = ia^2e^2K(x,y)$.    (6.15)

For $a > 1$, when the tachyons are absent as seen from (6.9), these commutators are also physical
and the independent field operators $E$ and $G$ generate the Hilbert space with a tensor product
structure of the Fock spaces $F_E$ and $F_G$ of these fields with the positive definite metric.
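
The three commutators can be recovered from the Dirac brackets by simple algebra. The check below assumes $\{\varphi, \varphi\}_D = B_{11}K$, $\{\varphi, E\}_D = eB_{11}K$, $\{E, E\}_D = ae^2B_{11}K$ with $B_{11} = a/(a-1)$, an antisymmetric kernel $K(x,y) = -K(y,x)$, and $G = E - ae\varphi$; it is a consistency sketch rather than the author's derivation.

```latex
% Antisymmetry of K gives \{E,\varphi\}_D(x,y) = -\{\varphi,E\}_D(y,x) = e B_{11} K(x,y).
\begin{align*}
\{G(x), E(y)\}_D &= \{E,E\}_D - ae\,\{\varphi,E\}_D
   = (ae^2 - ae\cdot e)\,B_{11}K(x,y) = 0, \\
\{G(x), G(y)\}_D &= \{E,E\}_D - ae\,\{E,\varphi\}_D - ae\,\{\varphi,E\}_D + a^2e^2\,\{\varphi,\varphi\}_D \\
   &= (ae^2 - 2ae^2 + a^2e^2)\,B_{11}K(x,y)
    = ae^2(a-1)\,\frac{a}{a-1}\,K(x,y) = a^2e^2K(x,y), \\
\{E(x), E(y)\}_D &= ae^2B_{11}K(x,y) = \frac{a^2e^2}{a-1}\,K(x,y).
\end{align*}
```

Multiplying each bracket by $i$ gives the quantum commutators; the factor $1/(a-1)$ in the $E$ commutator is what makes its sign physical only for $a > 1$.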

We make the convention that the first variable in an equal-$\tau$ bracket refers to the longitudinal coordinate
$x^- \equiv x$ while the second one to $y^- \equiv y$, while $\tau$ is suppressed.

A similar discussion is encountered also in the LF quantized Chern-Simons-Higgs system [54].





The commutators obtained can be realized in the momentum space through the following 
Fourier transforms 

if the operators $(d, g, d^\dagger, g^\dagger)$ satisfy the well known canonical commutation relations of two independent
harmonic oscillators; this is the well known set of Schwinger's bosonic oscillators, often employed
in the angular momentum theory. The expression for the Hamiltonian becomes

Hc^sm = ( 6 - 17 ) 

where we have dropped the infinite zero-point energy term and note that [22] $[d^\dagger(k,\tau), d(l,\tau)] =
-\delta(k - l)$, $d^\dagger(k,\tau)d(k,\tau) = \delta(0)N_d(k,\tau)$ etc., with similar expressions for the independent $g$-oscillators.
We verify that $[N_d(k,\tau), N_d(l,\tau)] = 0$, $[N_d(k,\tau), N_g(l,\tau)] = 0$, $[N_d(k,\tau), d^\dagger(k,\tau)] =
d^\dagger(k,\tau)$ etc.

The Fock space can hence be built on a basis of eigenstates of the hermitian number operators
$N_d$ and $N_g$. The ground state of CSM is degenerate and described by $|0\rangle = |E = 0\rangle \otimes |G\rangle$ and it
carries vanishing LF energy in agreement with the conventional theory discussion [57]. For a fixed $k$
these states, $|E = 0\rangle \otimes \{g^\dagger(k,\tau)^n/\sqrt{n!}\}|0\rangle$, are labelled by the integers $n = 0, 1, 2, \cdots$. The $\theta$-vacua
are absent in the CSM. However, we recall [15] that in the SM the degenerate chiral vacua are also
labelled by such integers. We remark also that on the LF we work in the Minkowski space and that
in our discussion we do not make use of the Euclidean space theory action, where the (classical)
vacuum configurations of the Euclidean theory gauge field, belonging to the distinct topological
sectors, are useful, for example, in the functional integral quantization of the gauge theory.

On the LF both the bosonized SM and CSM are described in terms of a minimum number of
dynamical variables, which survive after the elimination of the phase space constraints. We recall
that the introduction of the bosonic condensate variable $\omega(\tau)$ (or in general $\omega(\tau, x^\perp)$) corresponds
to the gauge-fixing required in order to deal with the first class constraint $\int(\pi - \partial_-\phi) \approx 0$. On
the other hand we have the gauge invariant zero-momentum mode $h(\tau)$ of the gauge field
apart from the quantum fluctuation field $\varphi$. They are in a sense the minimal set of operators
which survive in the front form theory. With their help the vacuum structures of both the SM and
CSM are described in a very economical and transparent fashion on the LF, in agreement with the
conventional theory conclusions. In the latter, however, we have to go through quite an elaborate
and extensive discussion [57]. Finally, if we did adopt the light-cone gauge we would have to compensate
for the loss of the gauge invariant information by some other ingredient, say, by imposing more
complicated boundary conditions on the fields involved or by introducing new fields.

7 QCD in Covariant gauges 

We describe briefly the recent study [17] done on the front form QCD in covariant gauges. The 
Lagrangian density corresponding to the quantum action of QCD is described in standard notation 
by 

Cqcd = (7-1) 

4 ^ 

Here $\psi^j$ is the quark field with color index $j = 1..N_c$ for an $SU(N_c)$ color group, $A^a_\mu$ the gluon field,
$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc}A^b_\mu A^c_\nu$ the field strength, $(D_\mu\psi)^j = (\partial_\mu\delta^{jk} - igA^a_\mu(\lambda^a/2)^{jk})\psi^k$,
$a = 1..(N_c^2 - 1)$ the gauge group index etc. The covariant gauge-fixing
is introduced by adding to the Lagrangian the linear gauge-fixing term $(B^a\partial_\mu A^{a\mu} + \frac{\xi}{2}B^aB^a)$,
where $B^a$ is the Nakanishi-Lautrup auxiliary field and $\xi$ is a parameter. The $\chi_1^a$ and $\chi_2^a$ are the
(hermitian) anticommuting Faddeev-Popov ghost fields, and the action is invariant under the BRS
transformation.

The quark field term in LF coordinates reads 


-h 'ipL s_ — 


(7.2) 


where $D_\pm = (\partial_\pm - igA^a_\pm(\lambda^a/2))$. It shows that the minus components $\psi_-$ are in fact
nondynamical (Lagrange multiplier) fields without kinetic terms. The variation of the action
with respect to $\psi_-^\dagger$ and $\psi_-$ leads to the following gauge covariant constraint equation


(7.3) 

and its conjugate. The $\psi_-$ components may thus be eliminated in favor of the dynamical
components


'ip^^{x) = 




U-\x\A.)^U{x\A-) 


-ijk 




(7.4) 


Here, for a fixed $\tau$, $U \equiv U(x|A_-)$ is an $N_c \times N_c$ gauge matrix in the fundamental representation
of $SU(N_c)$ and it satisfies

    $\partial_-U(x|A_-) = -ig\,U(x|A_-)A_-(x)$    (7.5)

with $A_- = A^a_-\lambda^a/2$. It has the formal solution


    $U(x^-, x^\perp|A_-) = U(x_0^-, x^\perp|A_-)\,\tilde{P}\exp\left\{-ig\int_{x_0^-}^{x^-} dy^-\, A_-(y^-, x^\perp)\right\}$    (7.6)

where $\tilde{P}$ indicates the anti-path-ordering along the longitudinal direction $x^-$. $U$ has a series
expansion in the powers of the coupling constant.

The Hamiltonian density in Feynman gauge is 

'Hint 

= - \9’^''A\ A% - gV2^ij°A+'^i/;+^ 

- [5'^'(i7^5x - m) + - d,A%)A'»^A'^'' 

2 

+ + p/“'"(a''x“)/^% (7-7) 

where $\psi_-$ is given above, we have set $\sqrt{2}\chi = (\chi_1 + i\chi_2)$, $\sqrt{2}\bar\chi = (\chi_1 - i\chi_2)$, and the
cubic and higher order terms belong to $\mathcal{H}_{int}$, which is also understood to be normal ordered. It is
worth remarking that despite the presence of the longitudinal operators $a_{(\pm)}$ and $a^\dagger_{(\pm)}$ in the fields
$A_\mu$, there are no non-zero matrix elements involving these quanta as external lines in view of the
commutation relations of these operators as discussed in Sec. 5.

The perturbation theory expansion in the interaction representation, where we time order with
respect to the LF time $\tau$, is built following the Dyson-Wick procedure. We will illustrate it
in our context through some explicit computations, for simplicity, in QED where $U(x|A_-) =
\exp\{-ie\int^{x^-} du^-\, A_-(\tau, u^-, x^\perp)\}$ and $D_\mu = (\partial_\mu - ieA_\mu)$. We observe that a seagull term of the
order $e^2$ is present in the interaction Hamiltonian at the tree level, like that found also in the scalar
field QED.





Towards an illustration consider the computation of Electron Self-Energy. The contribution 
from the longitudinal components arises from 




/ OO 

-dy2~e{x2~ - yi~) 

rX 2 ~ 

{ du2~A-{u2)}{m-ij92^)‘ip+{y2)^\.(xi)i;+{xi)A+(xi): ( 7 . 8 ) 


leading to 


, f ^ u(’')(p)[7 (m+ ^^)7 +]u^*^Cp) 

J ^ [(jP- 9 )^ + -m^ + it) 


{-9-+) 


( 7 . 9 ) 


The graph with the A+ and interchanged gives rise to a similar expression with -f 
while 7^ The matrix elements following from the four graphs corresponding to the exchange 

of the (photon) fields $A_1$ and $A_2$ are also written down by simple inspection. As in the earlier case
the expressions get simplified in virtue of (10) and acquire the covariant form encountered in the
conventional covariant perturbation theory. The complete matrix element is found to be


2 f ^ u(»-)(p)[7^(m+ 

y [(p-g)2 + ie](g2-m2 + ie) 


(~9iiv) 


( 7 . 10 ) 


where = ((m^ + and the integration measure is d^q = (Pq^dq'^dq~. 

Considering that the integrand has the pole at « 0 we may regard the expression obtained 

[17] on the LF to be effectively identical to the one obtained in the conventional covariant theory 
framework. The discussion parallel to that given here may be followed also in the context of the 
light-cone gauge. The latter, however, demands the further introduction [59] of a light-like vector
$n_\mu = (n^0, \vec{n})$ and its dual $\tilde{n}_\mu = (n^0, -\vec{n})$ in order to evaluate the corresponding Feynman integrals
in a consistent manner.


8 Conclusions 

Collected below are some of the interesting conclusions we seem to reach. 

• The LF hyperplane is equally valid and appropriate as the conventional equal-time one for 
the field theory quantization. 

• The appearance of the nonlocality along the longitudinal direction in the front form quantized
theory is not unexpected; it does not conflict with the microcausality (or cluster
decomposition) principle.

• The covariant phase space and Fourier expansion considerations based on the description of 
the relativistic theory using light-cone coordinates lead to the LF commutator for the free 
scalar field, which is nonlocal in the longitudinal direction. 

• The hyperplanes $x^\pm = 0$ define the characteristic surfaces of a hyperbolic partial differential
equation. From the mathematical theory of classical partial differential equations [24] it is
known that the Cauchy initial value problem would require us to specify the data on both
the hyperplanes. From our studies we conclude [16] that it is sufficient in the front form
theory to choose one of the two LF hyperplanes for canonically quantizing the theory.

- In the quantized theory the equal-$\tau$ commutators of the field operators, at a fixed initial
LF-time, form now a part of the initial data instead and we deal with operator differential
equations.

- The information on the commutators on the other characteristic hyperplane seems already 
to be contained [15] in the quantized theory; it may not, in general, be required to specify it 
separately. 





• The constrained phase space dynamics in the LF theory with one more kinematical generator
and the inherent symmetry with regard to $x^+ \leftrightarrow x^-$ result in a reduced number of independent field
operators. The discussion of the Hilbert space becomes more transparent compared to that in
the conventional treatment. The lack of manifest covariance, which appears problematic, can
be handled^^ by employing, for example, the LF four-spinor [15] and the Fourier transform
of the spinor field as defined [17] in Sec. 4.

• On the LF the $\gamma_5$ symmetry of the free massless Dirac equation can be generalized to a nonlocal
(chiral) $\Gamma_5$ symmetry valid also in the massive case. The Weyl and Majorana spinors and
the helicity operator may be defined on the LF in straightforward fashion.

• The zero-longitudinal-momentum modes of the fields are important for describing the nonperturbative
effects on the LF. In the scalar and gauge theories they are dynamical variables
in the framework of the standard Dirac procedure. The separation $\phi(\tau, x^-, x^\perp) =
\omega(\tau, x^\perp) + \varphi(\tau, x^-, x^\perp)$ introduced in Secs. 2.6, 5 corresponds to the gauge-fixing conditions
[25] required to be introduced in the theory for handling first class constraints.

- In the case of the scalar theory we obtain constraint equations which enable us to describe 
SSB and the (tree level) Higgs mechanism. Associated to the local theory in the conventional 
coordinates we find a nonlocal LF Hamiltonian. 

- The gauge field zero modes play a crucial role in the description of the nonperturbative 
vacuum structures in the LF quantized SM and CSM. They also indicate that the (popular) 
light-cone gauge may not be accessible in the front form theory if we are concerned with the 
study of nonperturbative effects. 

• The physical content following from the front form theory is the same, even though arrived 
at through different description on the LF, when compared with the one in the instant form 
case. 

• Not all the constraints in the LF theory need to be solved first before considering its renor- 
malization; it is sometimes convenient to obtain some of them as renormalized constraint 
equations [18] instead. 

• In the conventional treatment we may be required to introduce external constraints in the 
quantized theory based on physical considerations, say, while describing the SSB. The anal- 
ogous relevant constraints in the front form theory appear to be already contained in the 
quantized theory. 

• On the LF the quantized theory of chiral boson appears straightforward (Sec. 3.4). The field 
commutator does not conflict with the microcausality principle. 

• A theoretical demonstration of the well accepted notion that a classical model field theory 
must be upgraded first through its quantization before we confront it with the experimental 
data, seems to emerge. 

• The LF quantized QCD employing covariant gauges [17] looks promising. All of the propa- 
gators become causal and the covariance of the theory is tractable. The semiclassical theory 
is found revealed at the tree level. The algebra of bilocals in the LF quantized theory may 
help reveal the string-like structure as seems to be found [55], for example, in QCD$_2$.

• The recently proposed BRS-BFT [60] quantization procedure is extended straightforwardly 
on the LF (Appendix C). 

• It is well known that topological considerations are often required in the field theory quan- 
tization employing the functional integral method, where the Euclidean theory action is 
employed. The corresponding ingredients seem to arise in the canonically quantized front
form theory as well but with different interpretation. This is suggested, for example, from
the studies of the SM, CSM, and the study of the kink solutions.

See also [54].

• In connection with the relativistic bound state problem, not touched upon in this article, the 
LF Tamm-Dancoff method [9, 56] and Bethe-Salpeter dynamics on the covariant null plane 
[19, 20, 21] seem to be promising alternatives to lattice gauge theory approach. 


Acknowledgements 

The author acknowledges with thanks the helpful comments from Stan Brodsky, Richard Blankenbecler
and Sidney Drell. The hospitality offered to him at SLAC and a financial grant from the
Prociencia program of the UERJ, Rio de Janeiro, Brasil, are gratefully acknowledged.


Appendix A: Poincaré Generators on the LF

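The Poincaré generators in the coordinate system $(x^0,x^1,x^2,x^3)$ satisfy $[M_{\mu\nu},P_\sigma] = -i(P_\mu g_{\nu\sigma} - P_\nu g_{\mu\sigma})$ and $[M_{\mu\nu},M_{\rho\sigma}] = i(M_{\mu\rho}g_{\nu\sigma} + M_{\nu\sigma}g_{\mu\rho} - M_{\nu\rho}g_{\mu\sigma} - M_{\mu\sigma}g_{\nu\rho})$, where the metric is $g_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$, $\mu = (0,1,2,3)$, and we take $\epsilon_{0123} = \epsilon_{-+12} = 1$. If we define $J_i = -(1/2)\epsilon_{ikl}M^{kl}$ and $K_i = M_{0i}$, where $i,j,k,l = 1,2,3$, we find $[J_i,F_j] = i\epsilon_{ijk}F_k$ for $F_l = J_l, P_l$ or $K_l$, while $[K_i,K_j] = -i\epsilon_{ijk}J_k$, $[K_i,P_l] = -iP_0 g_{il}$, $[K_i,P_0] = iP_i$ and $[J_i,P_0] = 0$.

The LF generators are $P_+$, $P_-$, $P_1$, $P_2$, $M_{12} = -J_3$, $M_{+-} = -K_3$, $M_{1-} = -(K_1 + J_2)/\sqrt{2} \equiv -B_1$, $M_{2-} = -(K_2 - J_1)/\sqrt{2} \equiv -B_2$, $M_{1+} = -(K_1 - J_2)/\sqrt{2} \equiv -S_1$ and $M_{2+} = -(K_2 + J_1)/\sqrt{2} \equiv -S_2$. We find $[B_1,B_2] = 0$, $[B_a,J_3] = -i\epsilon_{ab}B_b$, $[B_a,K_3] = iB_a$, $[J_3,K_3] = 0$, $[S_1,S_2] = 0$, $[S_a,J_3] = -i\epsilon_{ab}S_b$, $[S_a,K_3] = -iS_a$, where $a,b = 1,2$ and $\epsilon_{12} = -\epsilon_{21} = 1$. Also $[B_1,P_1] = [B_2,P_2] = iP^+$, $[B_1,P_2] = [B_2,P_1] = 0$, $[B_a,P^-] = iP_a$, $[B_a,P^+] = 0$, $[S_1,P_1] = [S_2,P_2] = iP^-$, $[S_1,P_2] = [S_2,P_1] = 0$, $[S_a,P^+] = iP_a$, $[S_a,P^-] = 0$, $[B_1,S_2] = -[B_2,S_1] = -iJ_3$, $[B_1,S_1] = [B_2,S_2] = -iK_3$. For $P_\mu = i\partial_\mu$ and $M_{\mu\nu} \to L_{\mu\nu} = i(x_\mu\partial_\nu - x_\nu\partial_\mu)$ we find $B_a = (x^+P^a - x^aP^+)$, $S_a = (x^-P^a - x^aP^-)$, $K_3 = (x^-P^+ - x^+P^-)$ and $J_3 = (x^1P^2 - x^2P^1)$. Under the conventional parity operation $\mathcal{P}$: ($x^\pm \leftrightarrow x^\mp$, $x^{1,2} \to -x^{1,2}$) and ($p^\pm \leftrightarrow p^\mp$, $p^{1,2} \to -p^{1,2}$), we find $\vec J \to \vec J$, $\vec K \to -\vec K$, $B_a \to -S_a$ etc. The six generators $P_l$, $M_{kl}$ leave the $x^0 = 0$ hyperplane invariant and are called kinematical, while the remaining $P_0$, $M_{0k}$ are the dynamical ones. On the LF there are seven kinematical generators: $P^+$, $P^1$, $P^2$, $B_1$, $B_2$, $J_3$ and $K_3$, which leave the LF hyperplane, $x^0 + x^3 = 0$, invariant, and the three dynamical ones $S_1$, $S_2$ and $P^-$ form a mutually commuting set. The $K_3$, which was dynamical, now becomes kinematical; it generates scale transformations of the LF components of $x^\mu$, $P^\mu$ and $M^{\mu\nu}$. We note that each of the sets $\{B_1,B_2,J_3\}$ and $\{S_1,S_2,J_3\}$ generates an $E_2 \simeq SO(2)\otimes T_2$ algebra; this will be shown below to be relevant for defining the spin for a massless particle. Including $K_3$ in each set we find two subalgebras each with four elements.

Some useful identities are
$e^{i\omega K_3}\,P^\pm\,e^{-i\omega K_3} = e^{\pm\omega}P^\pm$, $e^{i\omega K_3}\,P^\perp\,e^{-i\omega K_3} = P^\perp$,
$e^{i\bar v\cdot\bar B}\,P^-\,e^{-i\bar v\cdot\bar B} = P^- + \bar v\cdot\bar P + \frac{1}{2}\bar v^2 P^+$, $e^{i\bar v\cdot\bar B}\,P^+\,e^{-i\bar v\cdot\bar B} = P^+$, $e^{i\bar v\cdot\bar B}\,P^\perp\,e^{-i\bar v\cdot\bar B} = P^\perp + v^\perp P^+$,
$e^{i\bar u\cdot\bar S}\,P^+\,e^{-i\bar u\cdot\bar S} = P^+ + \bar u\cdot\bar P + \frac{1}{2}\bar u^2 P^-$, $e^{i\bar u\cdot\bar S}\,P^-\,e^{-i\bar u\cdot\bar S} = P^-$, $e^{i\bar u\cdot\bar S}\,P^\perp\,e^{-i\bar u\cdot\bar S} = P^\perp + u^\perp P^-$,
where $P^\perp \equiv \bar P = (P^1,P^2)$, $v^\perp \equiv \bar v = (v_1,v_2)$ and $(v^\perp\cdot P^\perp) \equiv (\bar v\cdot\bar P) = v_1P^1 + v_2P^2$ etc. Analogous expressions with $P^\mu$ replaced by $X^\mu$ can be obtained if we use $[P^\mu,X_\nu] \equiv [i\partial^\mu,x_\nu] = i\delta^\mu_\nu$.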
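The commutators among the LF boost and rotation generators quoted in this appendix can be spot-checked numerically. The sketch below is only an illustration under an assumption that is not taken from the text: it uses the two-dimensional spinor representation $J_i = \sigma_i/2$, $K_i = i\sigma_i/2$, which satisfies $[J_i,J_j] = i\epsilon_{ijk}J_k$, $[J_i,K_j] = i\epsilon_{ijk}K_k$, $[K_i,K_j] = -i\epsilon_{ijk}J_k$. All helper names are hypothetical.

```python
# Sketch: check LF Lorentz-generator commutators in a 2x2 spinor representation.
# Assumption (ours, not the text's): J_i = sigma_i/2 and K_i = i*sigma_i/2.
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, b, s=1):
    """Return the matrix a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return lin(matmul(a, b), matmul(b, a), -1)

def same(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
J = [scale(0.5, s) for s in SIGMA]    # rotation generators
K = [scale(0.5j, s) for s in SIGMA]   # boost generators
R2 = math.sqrt(2.0)

B1 = scale(1 / R2, lin(K[0], J[1]))       # B_1 = (K_1 + J_2)/sqrt(2)
B2 = scale(1 / R2, lin(K[1], J[0], -1))   # B_2 = (K_2 - J_1)/sqrt(2)
S1 = scale(1 / R2, lin(K[0], J[1], -1))   # S_1 = (K_1 - J_2)/sqrt(2)
S2 = scale(1 / R2, lin(K[1], J[0]))       # S_2 = (K_2 + J_1)/sqrt(2)
J3, K3 = J[2], K[2]
ZERO = [[0, 0], [0, 0]]

CHECKS = {
    "[B1,B2] = 0": same(comm(B1, B2), ZERO),
    "[S1,S2] = 0": same(comm(S1, S2), ZERO),
    "[J3,K3] = 0": same(comm(J3, K3), ZERO),
    "[B1,J3] = -i B2": same(comm(B1, J3), scale(-1j, B2)),
    "[B2,J3] = +i B1": same(comm(B2, J3), scale(1j, B1)),
    "[Ba,K3] = +i Ba": same(comm(B1, K3), scale(1j, B1))
                       and same(comm(B2, K3), scale(1j, B2)),
    "[Sa,K3] = -i Sa": same(comm(S1, K3), scale(-1j, S1))
                       and same(comm(S2, K3), scale(-1j, S2)),
    "[B1,S1] = [B2,S2] = -i K3": same(comm(B1, S1), scale(-1j, K3))
                                 and same(comm(B2, S2), scale(-1j, K3)),
    "[B1,S2] = -[B2,S1] = -i J3": same(comm(B1, S2), scale(-1j, J3))
                                  and same(comm(B2, S1), scale(1j, J3)),
}

if __name__ == "__main__":
    for name, ok in CHECKS.items():
        print(name, ok)
```

A finite-dimensional representation can only confirm the homogeneous Lorentz part of the algebra; the commutators involving $P^\mu$ are not tested by this sketch.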

Appendix B: LF Spin Operator. Hadrons in LF Fock Basis

The Casimir generators of the Poincaré group are: $P^2 \equiv P^\mu P_\mu$ and $W^2$, where $W_\mu = (-1/2)\epsilon_{\lambda\rho\nu\mu}M^{\lambda\rho}P^\nu$ defines the Pauli-Lubanski pseudovector. It follows from $[W_\mu,W_\nu] = i\epsilon_{\mu\nu\lambda\rho}W^\lambda P^\rho$, $[W_\mu,P_\rho] = 0$ and $W\cdot P = 0$ that in a representation characterized by particular eigenvalues of the two Casimir operators we may simultaneously diagonalize $P^\mu$ along with just one component of $W^\mu$. We have $W^+ = -[J_3P^+ + B_1P^2 - B_2P^1]$, $W^- = J_3P^- + S_1P^2 - S_2P^1$, $W^1 = K_3P^2 + B_2P^- - S_2P^+$, and $W^2 = -[K_3P^1 + B_1P^- - S_1P^+]$, and it shows that $W^+$ has a special place since it contains only the kinematical generators [15]. On the LF we define $\mathcal{J}_3 = -W^+/P^+$ as the spin operator. It may be shown to commute with $P_\mu$, $B_1$, $B_2$, $J_3$, and $K_3$. For $m \neq 0$ we may use the parametrizations $p^\mu$: ($p^- = (m^2 + p^{\perp 2})/(2p^+)$, $p^+ = (m/\sqrt{2})e^{\omega}$, $p^1 = -v_1p^+$, $p^2 = -v_2p^+$) and $\tilde p^\mu$: $(1,1,0,0)(m/\sqrt{2})$ in the rest frame. We have $P^2(p) = m^2 I$ and $W(p)^2 = W(\tilde p)^2 = -m^2[J_1^2 + J_2^2 + J_3^2] = -m^2 s(s+1)I$, where $s$ assumes half-integer values. Starting from the rest state $|\tilde p; m,s,\lambda,..\rangle$ with $J_3|\tilde p; m,s,\lambda,..\rangle = \lambda|\tilde p; m,s,\lambda,..\rangle$ we may build an arbitrary eigenstate of $P^+$, $P^\perp$, $\mathcal{J}_3$ (and $P^-$) on the LF by

$|p^+,p^\perp; m,s,\lambda,..\rangle = e^{i(\bar v\cdot\bar B)}\,e^{-i\omega K_3}\,|\tilde p; m,s,\lambda,..\rangle$

If we make use of the following identity [10]

$\mathcal{J}_3(p) = J_3 + v_1B_2 - v_2B_1 = e^{i(\bar v\cdot\bar B)}\,e^{-i\omega K_3}\,J_3\,e^{i\omega K_3}\,e^{-i(\bar v\cdot\bar B)}$

we find $\mathcal{J}_3|p^+,p^\perp; m,s,\lambda,..\rangle = \lambda|p^+,p^\perp; m,s,\lambda,..\rangle$. Introducing also the operators $\mathcal{J}_a = -(\mathcal{J}_3P^a + W^a)/\sqrt{P^\mu P_\mu}$, $a = 1,2$, which do, however, contain dynamical generators, we verify that $[\mathcal{J}_i,\mathcal{J}_j] = i\epsilon_{ijk}\mathcal{J}_k$.

$^{16}$ See P.P. Srivastava, Lightfront quantization of field theory, in Topics in Theoretical Physics, Festschrift for Paulo Leal Ferreira, eds. V.C. Aguilera-Navarro et al., pp. 206-217, IFT-São Paulo, SP, Brasil (1995); hep-th/9610044; 9610149.

For the $m = 0$ case when $p^+ \neq 0$ a convenient parametrization is $p^\mu$: ($p^- = p^+v^{\perp 2}/2$, $p^+$, $p^1 = -v_1p^+$, $p^2 = -v_2p^+$) and $\tilde p$: $(0,p^+,0^\perp)$. We have $W^2(\tilde p) = -(S_1^2 + S_2^2)p^{+2}$ and $[W_1,W_2](\tilde p) = 0$, $[W^+,W_1](\tilde p) = -ip^+W_2(\tilde p)$, $[W^+,W_2](\tilde p) = ip^+W_1(\tilde p)$, showing that $W_1$, $W_2$ and $W^+$ generate the algebra $SO(2)\otimes T_2$. The eigenvalues of $W^2$ are hence not quantized and they vary continuously. This is contrary to experience, so we impose that the physical states satisfy in addition $W_{1,2}|\tilde p; m=0,..\rangle = 0$. Hence $W_\mu = -\lambda P_\mu$ and the invariant parameter $\lambda$ is taken to define the spin of the massless particle. From $-W^+(\tilde p)/\tilde p^+ = J_3$ we conclude that $\lambda$ assumes half-integer values as well. We note that $W^\mu W_\mu = \lambda^2 P^\mu P_\mu = 0$ and that on the LF the definition of the spin operator appears unified for massless and massive particles. A parallel discussion based on $p^- \neq 0$ may also be given.

As an illustration consider the three particle state on the LF with the total eigenvalues $p^+$, $\lambda$ and $p^\perp$. In the standard frame with $p^\perp = 0$ it may be written as ($|x_1p^+,k_1^\perp;\lambda_1\rangle\,|x_2p^+,k_2^\perp;\lambda_2\rangle\,|x_3p^+,k_3^\perp;\lambda_3\rangle$) with $\sum_{i=1}^{3} x_i = 1$, $\sum_{i=1}^{3} k_i^\perp = 0$, and $\lambda = \sum_{i=1}^{3}\lambda_i$. Applying $e^{-i(\bar p\cdot\bar B)/p^+}$ on it we obtain ($|x_1p^+,k_1^\perp + x_1p^\perp;\lambda_1\rangle\,|x_2p^+,k_2^\perp + x_2p^\perp;\lambda_2\rangle\,|x_3p^+,k_3^\perp + x_3p^\perp;\lambda_3\rangle$), now with $p^\perp \neq 0$. The $x_i$ and $k_i^\perp$ indicate relative (invariant) parameters$^{17}$ and do not depend upon the reference frame. The $x_i$ is the fraction of the total longitudinal momentum carried by the $i$th particle while $k_i^\perp$ is its transverse momentum. The state of a pion with momentum $(p^+,p^\perp)$, for example, may be expressed as an expansion over the LF Fock states constituted by the different number of partons

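$|\pi: p^+,p^\perp\rangle = \sum_{n,\lambda}\int \bar\Pi_i\,\frac{dx_i\,d^2k_i^\perp}{\sqrt{x_i}\,16\pi^3}\;|n: x_ip^+,\,x_ip^\perp + k_i^\perp,\,\lambda_i\rangle\;\psi_{n/\pi}(x_1,k_1^\perp,\lambda_1;\,x_2,\ldots)$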

where [8] the summation is over all the Fock states $n$ and spin projections $\lambda_i$, with $\bar\Pi_i\,dx_i = \Pi_i\,dx_i\;\delta(\sum x_i - 1)$, and $\bar\Pi_i\,d^2k_i^\perp = \Pi_i\,d^2k_i^\perp\;\delta^2(\sum k_i^\perp)$. The wave function of the parton $\psi_{n/\pi}(x,k^\perp)$
indicates the probability amplitude for finding inside the pion the partons in the Fock state $n$
carrying the 3-momenta $(x_ip^+,\,x_ip^\perp + k_i^\perp)$.
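The frame independence of the relative variables $x_i$, $k_i^\perp$ can be illustrated numerically. The sketch below assumes on-shell constituents with $2p_i^+p_i^- = m_i^2 + p_i^{\perp 2}$, together with $p_i^+ = x_ip^+$ and $p_i^\perp = x_ip^\perp + k_i^\perp$, and compares the total invariant mass squared $2P^+P^- - P^{\perp 2}$ with the expression $\sum_i (m_i^2 + k_i^{\perp 2})/x_i$ built from the relative variables only. The function names and the sample numbers are hypothetical.

```python
# Sketch: the invariant mass of a multi-parton LF state depends only on (x_i, k_i_perp).
import math

def mass_sq_from_momenta(xs, ks, ms, p_plus, p_perp):
    """P^2 = 2 P^+ P^- - (P_perp)^2 from on-shell constituents with
    p_i^+ = x_i p^+ and p_i_perp = x_i p_perp + k_i_perp."""
    total_minus = 0.0
    for x, (kx, ky), m in zip(xs, ks, ms):
        px = x * p_perp[0] + kx
        py = x * p_perp[1] + ky
        # mass shell of constituent i: 2 p_i^+ p_i^- = m_i^2 + p_i_perp^2
        total_minus += (m * m + px * px + py * py) / (2.0 * x * p_plus)
    return 2.0 * p_plus * total_minus - (p_perp[0] ** 2 + p_perp[1] ** 2)

def mass_sq_from_relative(xs, ks, ms):
    """Sum over constituents of (m_i^2 + k_i_perp^2) / x_i."""
    return sum((m * m + kx * kx + ky * ky) / x
               for x, (kx, ky), m in zip(xs, ks, ms))

# Hypothetical three-parton configuration: fractions sum to 1, k_perp sum to 0.
XS = [0.5, 0.3, 0.2]
KS = [(0.1, -0.2), (0.3, 0.1), (-0.4, 0.1)]
MS = [0.3, 0.3, 0.5]

if __name__ == "__main__":
    print(mass_sq_from_relative(XS, KS, MS))
    print(mass_sq_from_momenta(XS, KS, MS, 1.0, (0.0, 0.0)))
    print(mass_sq_from_momenta(XS, KS, MS, 7.5, (2.0, -3.0)))
```

The three printed numbers should agree up to rounding, since the terms linear in $p^\perp$ cancel by $\sum_i k_i^\perp = 0$ and the quadratic ones by $\sum_i x_i = 1$.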

The discrete symmetry transformations may also be defined on the LF Fock states [8, 15]. For example, under the conventional parity $\mathcal{P}$ the spin operator $\mathcal{J}_3$ is not invariant. We may rectify this by defining the LF parity operation by $\mathcal{P}^{lf} = e^{-i\pi J_1}\mathcal{P}$. We find then $B_1 \to -B_1$, $B_2 \to B_2$, $P^\pm \to P^\pm$, $P^1 \to -P^1$, $P^2 \to P^2$ etc., such that $\mathcal{P}^{lf}|p^+,p^\perp; m,s,\lambda,..\rangle \simeq |p^+,-p^1,p^2; m,s,-\lambda,..\rangle$. Similar considerations apply for charge conjugation and time inversion. For example, it is straightforward to construct [15] the free LF Dirac spinor $\chi(p) = [\sqrt{2}p^+\Lambda^+ + (m - \gamma^a p^a)\Lambda^-]\tilde\chi/\sqrt{\sqrt{2}p^+m}$, which is also an eigenstate of $\mathcal{J}_3$ with eigenvalues $\pm 1/2$. Here $\Lambda^\pm = \gamma^\mp\gamma^\pm/2 = (\Lambda^\pm)^\dagger$, $(\Lambda^\pm)^2 = \Lambda^\pm$, and $\chi(\tilde p) \equiv \tilde\chi$ with $\gamma^0\tilde\chi = \tilde\chi$. The conventional (equal-time) spinor can also be constructed by the procedure analogous to that followed for the LF spinor and it has the well known form $\chi_{con}(p) = (m + \gamma\cdot p)\tilde\chi/\sqrt{2m(p^0 + m)}$. Under the conventional parity operation $\mathcal{P}$: $\chi'(p') = c\gamma^0\chi(p)$ (since we must require $\gamma^\mu = L^\mu{}_\nu S(L)\gamma^\nu S^{-1}(L)$, etc.). We find $\chi'(p) = c[\sqrt{2}p^-\Lambda^- + (m - \gamma^a p^a)\Lambda^+]\tilde\chi/\sqrt{\sqrt{2}p^-m}$. For $p \neq \tilde p$ it is not proportional to $\chi(p)$, in contrast to the result in the case of the usual spinor where $\gamma^0\chi_{con}(p^0,-\vec p) = \chi_{con}(p)$ for $E > 0$ (and $\gamma^0\eta_{con}(p^0,-\vec p) = -\eta_{con}(p)$ for $E < 0$). However, applying the parity operator twice we do show $\chi''(p) = c^2\chi(p)$, hence leading to the usual result $c^2 = \pm 1$. The LF parity operator over the spin 1/2 Dirac spinor is $\mathcal{P}^{lf} = c\,(2J_1)\gamma^0$ and the corresponding transform of $\chi$ is shown to be an eigenstate of $\mathcal{J}_3$.

$^{17}$ We note $p_i^+ = x_ip^+$, $p_i^\perp = x_ip^\perp + k_i^\perp$, and $(p\cdot p) = (2p^+p^- - p^\perp p^\perp) = \sum_i[(m_i^2 + k_i^\perp k_i^\perp)/x_i]$, where $(p_i\cdot p_i) = m_i^2$ and $\sum_i p_i^\mu = p^\mu$.
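The projector properties of $\Lambda^\pm = \gamma^\mp\gamma^\pm/2$ used for the LF spinor can be checked in any explicit representation. The sketch below assumes the Dirac representation of $\gamma^0$ and $\gamma^3$ and the convention $\gamma^\pm = (\gamma^0 \pm \gamma^3)/\sqrt{2}$; both are assumptions made here for illustration (the text itself uses no explicit representation), and the helper names are hypothetical.

```python
# Sketch: verify that Lambda^(+/-) = gamma^(-/+) gamma^(+/-) / 2 are Hermitian projectors.
# Assumptions (ours): Dirac representation, gamma^(+/-) = (gamma^0 +/- gamma^3)/sqrt(2).
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, b, s=1):
    """Return the matrix a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def same(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Z4 = [[0] * 4 for _ in range(4)]

R2 = math.sqrt(2.0)
GP = scale(1 / R2, lin(G0, G3))        # gamma^+
GM = scale(1 / R2, lin(G0, G3, -1))    # gamma^-
LP = scale(0.5, matmul(GM, GP))        # Lambda^+ = gamma^- gamma^+ / 2
LM = scale(0.5, matmul(GP, GM))        # Lambda^- = gamma^+ gamma^- / 2

CHECKS = {
    "(gamma^+)^2 = 0": same(matmul(GP, GP), Z4),
    "(gamma^-)^2 = 0": same(matmul(GM, GM), Z4),
    "Lambda^+ idempotent": same(matmul(LP, LP), LP),
    "Lambda^- idempotent": same(matmul(LM, LM), LM),
    "Lambda^+ Lambda^- = 0": same(matmul(LP, LM), Z4),
    "Lambda^+ + Lambda^- = 1": same(lin(LP, LM), I4),
    "Lambda^+ Hermitian": same(dagger(LP), LP),
    "Lambda^- Hermitian": same(dagger(LM), LM),
}

if __name__ == "__main__":
    for name, ok in CHECKS.items():
        print(name, ok)
```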

Appendix C: BRS-BFT Quantization on the LF of the CSM 

We apply here the recently proposed BFT procedure [60], which is elegant and avoids the
computation of Dirac brackets. It would thus get tested [61] on the LF as well and it also allows
us to construct (new) effective Lagrangian theories.

We convert the two second class constraints of the bosonized CSM with a > 1 into first class 
constraints according to the BFT formalism. We obtain then the first class Hamiltonian from the 
canonical Hamiltonian and recover the DB using Poisson brackets in the extended phase space. 
The corresponding first class Lagrangian is then found by performing the momentum integrations 
in the generating functional. 

(a) Conversion to First Class Constrained Dynamical System 

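The bosonized CSM model (for $a > 1$) is described by the action

$S_{CSM} = \int d^2x\,\left[-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + eA_\nu(\eta^{\mu\nu} - \epsilon^{\mu\nu})\partial_\mu\phi + \frac{1}{2}ae^2A_\mu A^\mu\right]$, (C.1)

where $a$ is a regularization ambiguity which enters when we calculate the fermionic determinant
in the fermionic CSM. The action in the LF coordinates takes the form

$S_{CSM} = \int d^2x\,\left[\frac{1}{2}(\partial_+A_- - \partial_-A_+)^2 + \partial_-\phi\,\partial_+\phi + 2eA_-\partial_+\phi + ae^2A_+A_-\right]$, (C.2)

We now make the separation, in the scalar field (a generalized function): $\phi(\tau,x^-) = \omega(\tau) + \varphi(\tau,x^-)$. The Lagrangian density then becomes

$\mathcal{L} = \frac{1}{2}(\partial_+A_- - \partial_-A_+)^2 + \partial_-\varphi\,\partial_+\varphi + ae^2\left[A_+ + \frac{2}{ae}(\partial_+\varphi + \partial_+\omega)\right]A_-$, (C.3)

We note that the dynamical fields are $A_-$ and $\varphi$ while $A_+$ has no kinetic term. On making a
redefinition of the (auxiliary) field $A_+$ we can recast the action on the LF in the following form

$S_{CSM} = \int d^2x\,\left[\dot\varphi\,\varphi' + \frac{1}{2}(\dot A_- - A_+')^2 - 2e\dot A_-\varphi + ae^2A_+A_-\right]$, (C.4)

The canonical momenta are given by

$\pi^+ = 0$,
$\pi^- = \dot A_- - A_+' - 2e\varphi$,
$\pi_\varphi = \varphi'$. (C.5)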





We now follow Dirac's standard procedure in order to build a Hamiltonian framework on the
LF. The definition of the canonical momenta leads to two primary constraints

$\pi^+ \approx 0$, (C.6)

$\Omega_1 \equiv (\pi_\varphi - \varphi') \approx 0$ (C.7)

and we derive one secondary constraint

$\Omega_2 \equiv \partial_-\pi^- + 2e\varphi' + ae^2A_- \approx 0$. (C.8)

This one follows when we require the $\tau$ independence (i.e., the persistency) of the primary constraint $\pi^+ \approx 0$ with respect to the preliminary Hamiltonian

$H' = H_c^{lf} + \int dx\,u_+\pi^+ + \int dx\,u_1\Omega_1$, (C.9)

where $H_c^{lf}$ is the canonical Hamiltonian

$H_c^{lf} = \int dx\,\left[\frac{1}{2}(\pi^- + 2e\varphi)^2 + (\pi^- + 2e\varphi)A_+' - ae^2A_+A_-\right]$ (C.10)

and we employ the standard equal-$\tau$ Poisson brackets. The $u_+$ and $u_1$ denote the Lagrange multiplier fields. The persistency requirement for $\Omega_1$ gives conditions to determine $u_1$. The Hamiltonian
is next extended to include also the secondary constraint

$H_e = H_c^{lf} + \int dx\,u_+\pi^+ + \int dx\,u_1\Omega_1 + \int dx\,u_2\Omega_2$ (C.11)

and the procedure is now repeated with respect to the extended Hamiltonian. For the case $a > 1$, no more secondary constraints are seen to arise and we are left only with the persistency
conditions which determine the multipliers $u_1$ and $u_2$ while $u_+$ is left undetermined. We also find$^{18}$
$\{\Omega_i,\Omega_j\} = D_{ij}\,(-2\partial_x\delta(x - y))$ where $i,j = 1,2$ and $D_{11} = 1$, $D_{22} = ae^2$, $D_{12} = D_{21} = -e$,
and $\pi^+$ is shown to have vanishing brackets with $\Omega_{1,2}$. The $\pi^+ \approx 0$ constitutes a first class
constraint on the phase space; it generates local transformations of $A_+$ which leave $H_c^{lf}$ invariant,
$\{\pi^+, H_c^{lf}\} = \Omega_2 \approx 0$. The $\Omega_1$, $\Omega_2$ constitute a set of second class constraints and do not involve $A_+$
or $\pi^+$. It is very convenient, though not necessary, to add to the set of constraints on the phase
space the (accessible) gauge fixing constraint $A_+ \approx 0$. It is evident from the action above that such a gauge freedom
is not available at the Lagrangian level. We will also implement (i.e., turn into strong equalities)
the (trivial) pair of weak constraints $A_+ \approx 0$, $\pi^+ \approx 0$ by defining the Dirac brackets with respect
to them. It is easy to see that for the other remaining dynamical variables the corresponding Dirac
brackets coincide with the standard Poisson brackets. The variables $A_+$, $\pi^+$ are thus removed
from the discussion, leaving behind a constrained dynamical system with the two second class
constraints $\Omega_1$, $\Omega_2$ and the light-front Hamiltonian

$H^{lf} = \frac{1}{2}\int dx\,(\pi^- + 2e\varphi)^2 + \int dx\,u_1\Omega_1 + \int dx\,u_2\Omega_2$ (C.12)

which will be now handled by the BFT procedure.

We introduce the following linear combinations $T_i$, $i = 1,2$, of the above constraints

$T_1 = c_1\left(\Omega_1 + \frac{1}{M}\Omega_2\right)$
$T_2 = c_2\left(\Omega_1 - \frac{1}{M}\Omega_2\right)$ (C.13)

where $c_1 = 1/\sqrt{2(1 - e/M)}$, $c_2 = 1/\sqrt{2(1 + e/M)}$, $M^2 = ae^2$, and $a > 1$. They satisfy

$\{T_i,T_j\} = \delta_{ij}\,(-2\partial_x\delta(x - y))$ (C.14)

and thus diagonalize the constraint algebra.

$^{18}$ We make the convention that the first variable in an equal-$\tau$ bracket refers to the longitudinal coordinate
$x^- \equiv x$ while the second one to $y^- \equiv y$.
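The diagonalization can be checked numerically at the level of the coefficient matrix. The sketch below assumes that the brackets of the second class constraints have the form $D_{ij}\,(-2\partial_x\delta(x-y))$ with $D_{11} = 1$, $D_{22} = ae^2$, $D_{12} = D_{21} = -e$, and that $c_1 = 1/\sqrt{2(1 - e/M)}$, $c_2 = 1/\sqrt{2(1 + e/M)}$ with $M^2 = ae^2$; it then verifies that the combinations $T_1$, $T_2$ carry the unit coefficient matrix. The function name and the sample values of $a$ and $e$ are hypothetical.

```python
# Sketch: check that T_1, T_2 diagonalize the coefficient matrix D of the constraint algebra.
import math

def transformed_constraint_matrix(a, e):
    """Return V D V^T, where T_i = sum_j V[i][j] * Omega_j.

    Assumes a > 1; M is taken as the positive root of M^2 = a e^2.
    """
    big_m = math.sqrt(a * e * e)
    d = [[1.0, -e], [-e, a * e * e]]
    c1 = 1.0 / math.sqrt(2.0 * (1.0 - e / big_m))
    c2 = 1.0 / math.sqrt(2.0 * (1.0 + e / big_m))
    v = [[c1, c1 / big_m], [c2, -c2 / big_m]]
    return [[sum(v[i][k] * d[k][l] * v[j][l] for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

if __name__ == "__main__":
    for row in transformed_constraint_matrix(4.0, 3.0):
        print(row)
```

The condition $a > 1$ guarantees $|e/M| < 1$, so both square roots are real.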

We now introduce new auxiliary fields $\Phi^i$ in order to convert the second class constraints $T_i$ into
first class ones in the extended phase space. Following BFT [60] we require these fields to satisfy

$\{A_\mu\,(\mathrm{or}\ \pi^\mu),\,\Phi^i\} = 0$, $\{\varphi\,(\mathrm{or}\ \pi_\varphi),\,\Phi^i\} = 0$,
$\{\Phi^i(x),\Phi^j(y)\} = \omega^{ij}(x,y) = -\omega^{ji}(y,x)$, (C.15)

where $\omega^{ij}$ is a constant and antisymmetric matrix. The strongly involutive modified constraints
$\tilde T_i$ satisfying the abelian algebra

$\{\tilde T_i,\tilde T_j\} = 0$ (C.16)

as well as the boundary conditions, $\tilde T_i|_{\Phi^i=0} = T_i$, are then postulated to take the form of the
following expansion

$\tilde T_i(A_\mu,\pi^\mu,\varphi,\pi_\varphi;\Phi^j) = T_i + \sum_{n=1}^{\infty}\tilde T_i^{(n)}$, $\tilde T_i^{(n)} \sim (\Phi^j)^n$. (C.17)

The first order correction terms in this infinite series are written as

$\tilde T_i^{(1)}(x) = \int dy\,X_{ij}(x,y)\,\Phi^j(y)$. (C.18)

The first class constraint algebra of $\tilde T_i$ then leads to the following condition:

$\{T_i,T_j\} + \{\tilde T_i^{(1)},\tilde T_j^{(1)}\} = 0$ (C.19)

or

$(-2\partial_x\delta(x - y))\,\delta_{ij} + \int dw\,dz\,X_{ik}(x,w)\,\omega^{kl}(w,z)\,X_{jl}(y,z) = 0$. (C.20)

There is clearly some arbitrariness in the appropriate choice of $\omega^{ij}$ and $X_{ij}$ which corresponds
to the canonical transformation in the extended phase space. We can take without any loss of
generality the simple solutions,

$\omega^{ij}(x,y) = -\delta^{ij}\,\epsilon(x - y)$
$X_{ij}(x,y) = \delta_{ij}\,\partial_x\delta(x - y)$. (C.21)


Their inverses are easily shown to be

$\omega^{-1}_{ij}(x,y) = -\frac{1}{2}\,\delta_{ij}\,\partial_x\delta(x - y)$
$(X^{-1})^{ij}(x,y) = \frac{1}{2}\,\delta^{ij}\,\epsilon(x - y)$. (C.22)


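With the above choice, we find up to the first order

$\tilde T_i = T_i + \tilde T_i^{(1)} = T_i + \partial\Phi^i$, (C.23)

and a strongly first class constraint algebra

$\{T_i + \tilde T_i^{(1)},\,T_j + \tilde T_j^{(1)}\} = 0$. (C.24)

The higher order correction terms (suppressing the integration operation)

$\tilde T_i^{(n+1)} = -\frac{1}{n+2}\,\Phi^l\,\omega^{-1}_{lk}\,(X^{-1})^{kj}\,B_{ji}^{(n)}$ $(n \geq 1)$ (C.25)

with

$B_{ji}^{(n)} \equiv \sum_{m=0}^{n}\{\tilde T_j^{(n-m)},\tilde T_i^{(m)}\}_{(A,\pi,\varphi,\pi_\varphi)} + \sum_{m=0}^{n-2}\{\tilde T_j^{(n-m)},\tilde T_i^{(m+2)}\}_{(\Phi)}$ (C.26)

automatically vanish as a consequence of the proper choice of $\omega^{ij}$ made above. The Poisson brackets
are to be computed here using the standard canonical definition for $A_\mu$ and $\varphi$ as postulated above.
We have now only the first class constraints in the extended phase space and in view of the proper
choice only $\tilde T_i^{(1)}$ contributes in the infinite series above.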

(b) First Class Hamiltonian and Dirac Brackets

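We next introduce modified ("gauge invariant") dynamical variables $\tilde F \equiv (\tilde A_\mu,\tilde\pi^\mu,\tilde\varphi,\tilde\pi_\varphi)$ corresponding to $F \equiv (A_\mu,\pi^\mu,\varphi,\pi_\varphi)$ over the phase space by requiring the following strong
involution condition for $\tilde F$ with the first class constraints in our extended phase space, viz.,

$\{\tilde T_i,\tilde F\} = 0$ (C.27)

with

$\tilde F(A_\mu,\pi^\mu,\varphi,\pi_\varphi;\Phi^j) = F + \sum_{n=1}^{\infty}\tilde F^{(n)}$, $\tilde F^{(n)} \sim (\Phi^j)^n$ (C.28)

and which satisfy the boundary conditions, $\tilde F|_{\Phi^i=0} = F$.

The first order correction terms are easily shown to be given by

$\tilde F^{(1)}(x) = -\int du\,dv\,dz\;\Phi^j(u)\,\omega^{-1}_{jk}(u,v)\,(X^{-1})^{kl}(v,z)\,\{T_l(z),F(x)\}_{(A,\pi,\varphi,\pi_\varphi)}$. (C.29)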

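We find

$\tilde A_-^{(1)} = \frac{1}{2M}\,\partial(c_1\Phi^1 - c_2\Phi^2)$,
$\tilde\pi^{-(1)} = \frac{M}{2}\,(c_1\Phi^1 - c_2\Phi^2)$,
$\tilde\varphi^{(1)} = -\frac{1}{2}\,(c_1\Phi^1 + c_2\Phi^2)$,
$\tilde\pi_\varphi^{(1)} = \frac{1}{2}\,\partial\left[c_1\left(1 - \frac{2e}{M}\right)\Phi^1 + c_2\left(1 + \frac{2e}{M}\right)\Phi^2\right]$ (C.30)

where only the combinations $(c_1\Phi^1 \pm c_2\Phi^2)$ of the auxiliary fields are seen to occur. Furthermore,
since the modified variables $\tilde F = F + \tilde F^{(1)} + \cdots$, up to the first order corrections, are found to be
strongly involutive as a consequence of the proper choice made above, the higher order correction
terms

$\tilde F^{(n+1)} = -\frac{1}{n+1}\,\Phi^j\,\omega^{-1}_{jk}\,(X^{-1})^{kl}\,G_l^{(n)}$ (C.31)

with

$G_i^{(n)} = \sum_{m=0}^{n}\{\tilde T_i^{(n-m)},\tilde F^{(m)}\}_{(A,\pi,\varphi,\pi_\varphi)} + \sum_{m=0}^{n-2}\{\tilde T_i^{(n-m)},\tilde F^{(m+2)}\}_{(\Phi)} + \{\tilde T_i^{(n+1)},\tilde F^{(1)}\}_{(\Phi)}$ (C.32)

again vanish. In principle we may follow a similar procedure for any functional of the phase space
variables; it may get, however, involved.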

We make a side remark on the Dirac formulation for dealing with systems with second class
constraints by using the Dirac bracket (DB), rather than extending the phase space. In fact, the
Poisson brackets of the modified (gauge invariant) variables $\tilde F$ in the BFT formalism are related
[60] to the DB, which implement the constraints $T_i \approx 0$ in the problem under discussion, by
the relation $\{f,g\}_D = \{\tilde f,\tilde g\}|_{\Phi^i=0}$. In view of only the linear first order correction in CSM the
computation of the right hand side is quite simple. We list some of the Dirac brackets


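$\{\pi^-,\pi^-\}_D = \{\tilde\pi^-,\tilde\pi^-\}|_{\Phi^i=0} = \{\tilde\pi^{-(1)},\tilde\pi^{-(1)}\} = -\frac{a^2e^2}{4(a-1)}\,\epsilon(x - y)$,
$\{\varphi,\varphi\}_D = \{\tilde\varphi,\tilde\varphi\}|_{\Phi^i=0} = -\frac{a}{4(a-1)}\,\epsilon(x - y)$,
$\{\varphi,\pi^-\}_D = \frac{ae}{4(a-1)}\,\epsilon(x - y)$. (C.33)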


The other ones follow on using the now strong relations $\Omega_1 = \Omega_2 = 0$ with respect to $\{\,,\}_D$, and
from (C.12) it follows that the LF Hamiltonian reduces effectively to

$H_D^{lf} = \frac{1}{2}\int dx\,(\pi^- + 2e\varphi)^2$ (C.34)

The first class LF Hamiltonian $\tilde H$, which satisfies the boundary condition $\tilde H|_{\Phi^i=0} = H_D^{lf}$ and is
in strong involution with the constraints $\tilde T_i$, i.e., $\{\tilde T_i,\tilde H\} = 0$, may be constructed following the
BFT procedure or simply guessed for the CSM. It is given by

$\tilde H = \frac{1}{2}\int dx\,(\tilde\pi^- + 2e\tilde\varphi)^2$ (C.35)

which is just the expression in (C.34) of $H_D^{lf}$ with the field variables $F$ replaced by the $\tilde F$ variables, which
already commute with the constraints $\tilde T_i$. We do also check that $\{\tilde H,\tilde H\} = 0$ and we may identify $\tilde H$
with the BRS Hamiltonian. This completes the operatorial conversion of the original second class
system with the Hamiltonian $H_c^{lf}$ and constraints $\Omega_i$ into the first class one with the Hamiltonian
$\tilde H$ and (abelian) constraints $\tilde T_i$.
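The coefficients of $\epsilon(x - y)$ in the Dirac brackets of $\varphi$ and $\pi^-$ can be cross-checked by computing them in two ways: directly from the inverse of the coefficient matrix $D$ of the second class constraints, and from the first order BFT corrections. The sketch below is an illustration under stated assumptions, none of which should be read as the author's own computation: it assumes $\{\Omega_i,\Omega_j\} = D_{ij}(-2\partial_x\delta(x-y))$ with $D_{11} = 1$, $D_{22} = ae^2$, $D_{12} = D_{21} = -e$; the corrections $\tilde\varphi^{(1)} = -\frac{1}{2}(c_1\Phi^1 + c_2\Phi^2)$ and $\tilde\pi^{-(1)} = \frac{M}{2}(c_1\Phi^1 - c_2\Phi^2)$; and $\{\Phi^i(x),\Phi^j(y)\} = -\delta^{ij}\epsilon(x-y)$ with $\epsilon'(x) = 2\delta(x)$. The function names are hypothetical.

```python
# Sketch: coefficients of epsilon(x - y) in the Dirac brackets, computed two ways.
import math

def coeffs_from_constraint_matrix(a, e):
    """Dirac-bracket route: {f,g}_D = -(1/4) * grad_f . D^{-1} . grad_g * epsilon(x - y).

    grad_f lists the coefficients of delta(x - z) in {f(x), Omega_i(z)}:
    {phi, Omega_1} = delta and {pi^-, Omega_2} = -a e^2 delta (assumed form).
    """
    m2 = a * e * e
    det = m2 - e * e                      # det D = e^2 (a - 1)
    d_inv = [[m2 / det, e / det], [e / det, 1.0 / det]]
    grad = {"phi": (1.0, 0.0), "pi": (0.0, -m2)}

    def coeff(f, g):
        return -0.25 * sum(grad[f][i] * d_inv[i][j] * grad[g][j]
                           for i in range(2) for j in range(2))

    return {"phi,phi": coeff("phi", "phi"),
            "phi,pi": coeff("phi", "pi"),
            "pi,pi": coeff("pi", "pi")}

def coeffs_from_bft(a, e):
    """BFT route: brackets of the assumed first-order corrections, using
    {Phi^i, Phi^j} = -delta^{ij} epsilon(x - y)."""
    big_m = math.sqrt(a * e * e)
    c1_sq = 1.0 / (2.0 * (1.0 - e / big_m))
    c2_sq = 1.0 / (2.0 * (1.0 + e / big_m))
    return {"phi,phi": -(c1_sq + c2_sq) / 4.0,
            "phi,pi": big_m * (c1_sq - c2_sq) / 4.0,
            "pi,pi": -big_m * big_m * (c1_sq + c2_sq) / 4.0}

if __name__ == "__main__":
    print(coeffs_from_constraint_matrix(4.0, 3.0))
    print(coeffs_from_bft(4.0, 3.0))
```

Both routes should reproduce the closed forms $-a/[4(a-1)]$, $ae/[4(a-1)]$ and $-a^2e^2/[4(a-1)]$ for the three brackets, which gives a quick consistency check on the signs chosen for the auxiliary-field algebra.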


(c) First Class Lagrangian

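We consider now the partition function of the model in order to construct the Lagrangian
corresponding to $\tilde H$ in the canonical Hamiltonian formulation discussed above.

We start by representing each of the auxiliary fields $\Phi^i$ by a pair of fields $\pi^i$, $\theta^i$, $i = 1,2$, defined
by

$\Phi^i = \frac{1}{2}\pi^i - \int du\,\epsilon(x - u)\,\theta^i(u)$ (C.36)

such that $\pi^i$, $\theta^i$ satisfy

$\{\pi^i(x),\theta^j(y)\} = -\delta^{ij}\,\delta(x - y)$ etc., (C.37)

i.e., the (standard Heisenberg type) canonical Poisson brackets.

Then, the phase space partition function is given by the Faddeev formula

$Z = \int \mathcal{D}A_-\,\mathcal{D}\pi^-\,\mathcal{D}\varphi\,\mathcal{D}\pi_\varphi\,\mathcal{D}\theta^1\,\mathcal{D}\pi^1\,\mathcal{D}\theta^2\,\mathcal{D}\pi^2\;\prod_{i,j=1}^{2}\delta(\tilde T_i)\,\delta(\Gamma_j)\,\det|\{\tilde T_i,\Gamma_j\}|\;e^{iS}$, (C.38)

where

$S = \int d^2x\,\left(\pi^-\dot A_- + \pi_\varphi\dot\varphi + \pi^1\dot\theta^1 + \pi^2\dot\theta^2 - \tilde{\mathcal{H}}\right) \equiv \int d^2x\,\mathcal{L}$ (C.39)

with the Hamiltonian density $\tilde{\mathcal{H}}$ corresponding to the Hamiltonian $\tilde H$, which is now expressed in
terms of $(\theta^i,\pi^i)$ rather than in terms of $\Phi^i$. The gauge-fixing conditions $\Gamma_i$ are chosen such that the
determinants occurring in the functional measure are nonvanishing. Moreover, $\Gamma_i$ may be taken
to be independent of the momenta so that they correspond to the Faddeev-Popov type gauge
conditions.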

We will now verify that the unitary gauge, defined by the original second class constraints $\Gamma_i \equiv \Omega_i = 0$, $i = 1,2$, being employed in the partition function, does in fact lead to the original Lagrangian.
We check that the determinants in the functional measure are non-vanishing and field independent
while the product of delta functionals reduces to

$\delta(\pi_\varphi - \varphi')\;\delta(\pi^{-\prime} + 2e\varphi' + M^2A_-)\;\delta(\partial\Phi^1)\;\delta(\partial\Phi^2)$. (C.40)

Since $\pi_\varphi$ is absent from $\tilde{\mathcal{H}}$ we can perform the functional integration over it using the first delta
functional. The second delta functional is exponentiated as usual and we name the integration
variable as $A_+$ for convenience. The functional integrals over $\theta^1$ and $\theta^2$ are easily performed due
to the presence of the delta functionals, and this also reduces $\tilde{\mathcal{H}}$ to $(\pi^- + 2e\varphi)^2/2$. The functional
integrations over the then decoupled variables $\pi^1$ and $\pi^2$ give rise to constant factors which are
absorbed in the normalization. The partition function in the unitary gauge thus becomes

$Z = \int \mathcal{D}A_-\,\mathcal{D}\pi^-\,\mathcal{D}\varphi\,\mathcal{D}A_+\;e^{iS}$, (C.41)

with

$S = \int d^2x\,\left[\pi^-\dot A_- + \dot\varphi\,\varphi' + (\pi^{-\prime} + 2e\varphi' + M^2A_-)A_+ - \frac{1}{2}(\pi^- + 2e\varphi)^2\right]$ (C.42)


Performing the shift $\pi^- \to \pi^- - 2e\varphi$ and doing subsequently a Gaussian integral over $\pi^-$ we
obtain the original bosonized Lagrangian with $\omega$ eliminated by the field redefinition of $A_+$. It
is interesting to recall that while constructing the LF Hamiltonian framework we eliminated the
variable $A_+$ making use of the gauge freedom on the LF phase space and it gave rise to appreciable
simplification. However, on going over to the first class Lagrangian formalism using the partition
functional this variable reappears as it should, since the initial bosonized action is not gauge invariant
due to the presence of the mass term for the gauge field. Making other acceptable choices
for gauge-functions we can arrive at different effective Lagrangians for the system under consideration.
It is interesting to recall that in the fermionic Lagrangian the right-handed component
of the fermionic field describes a free field and only the left-handed one is gauged. It is also clear
from our discussion that $\tilde H$ proposed above is not unique and we could modify it so that it still
leads to the original Lagrangian in the unitary gauge. The corresponding first class Lagrangian
would produce still other gauge-fixed effective Lagrangians.





References 

[1] P.A.M. Dirac, Rev. Mod. Phys. 21 (1949) 392. 

[2] We recall the discovery of Kruskal-Szekeres coordinates which threw a new light on the problem
of the Schwarzschild singularity.

The ± components of a tensor, for example, are defined by $A^\pm = A_\mp = (A^0 \pm A^3)/\sqrt{2}$
and the metric may be read from $A^\mu B_\mu = A^+B^- + A^-B^+ - A^1B^1 - A^2B^2$.

[3] S. Fubini and G. Furlan, Physics 1 (1964) 229; R. Dashen and M. Gell-Mann, Phys. Rev.
Lett. 17 (1966) 340; V. de Alfaro, S. Fubini, G. Furlan, and C. Rossetti, Currents in Hadron
Physics, North Holland, 1973 and the references cited therein.

[4] S. Weinberg, Phys. Rev. 150 (1966) 1313. 

[5] J.D. Bjorken, Phys. Rev. 179 (1969) 1547. 

[6] R.P. Feynman, Phys. Rev. Lett. 23 (1969) 1415.

[7] L. Susskind, Phys. Rev. 165 (1968) 1535; K. Bardakci, M.B. Halpern, ibid 176 (1968) 1686; 
S.J. Chang and S.K. Ma, ibid 180 (1969) 1506; H. Leutwyler, Springer Tracts in Mod. Phys. 
50 (1969) 29; J. Jersak and J. Stern, Nuovo Cimento 59 (1969) 316; S.D. Drell, J.D. Levy, 
and T.M. Yan, Phys. Rev. D1 (1970) 1035; F. Rohrlich, Acta Phys. Austr. 32 (1970) 87; J.B. 
Kogut and D.E. Soper, Phys. Rev. D1 (1970) 2901; J.D. Bjorken, J.B. Kogut, and D.E. Soper, 
ibid D3 (1971) 1382; S.J. Brodsky, R. Roskies, and R. Suaya, ibid D8 (1973) 4574; S.J. Chang, 
R.G. Root and T.M. Yan, Phys. Rev. D7 (1973) 1173; G. ’t Hooft, Nucl. Phys. B72 (1974) 
461; R. Jackiw, Springer Tracts in Mod. Phys. 62 (1972) 1 and the refs, cited therein. 

[8] S.J. Brodsky, Light-cone quantized QCD and novel hadron phenomenology, SLAC-PUB-7645, 
1997; S.J. Brodsky and H.C. Pauli, Light-cone Quantization and QCD, Lecture Notes in 
Physics, vol. 396, eds., H. Mitter et al., Springer-Verlag, Berlin, 1991; S.J. Brodsky and
G.P. Lepage, in Perturbative Quantum Chromodynamics, ed., A.H. Mueller, World Scientific, 
Singapore, 1989; 

C. B. Thorn, Phys. Rev. D20 (1979) 1435; ibid Phys. Rev. D20 (1979) 1934. 

[9] K.G. Wilson, T.S. Walhout, A. Harindranath, W.M. Zhang, R.J. Perry, and St. D. Glazek, 
Phys. Rev. D49 (1994) 6720; K.G. Wilson, Nucl. Phys. B (proc. Suppl.) 17 (1990); R.J. Perry, 
A. Harindranath, and K.G. Wilson, Phys. Rev. Lett. 65 (1990) 2959. 

[10] P.P. Srivastava, Light-front Quantization of Field Theory: Some New Results, Lectures at the 
IX Brazilian School of Cosmology and Gravitation, July 1998, Rio de Janeiro, Proceedings, 
Ed. M. Novello, preprint CBPF-NF-003/99, hep-th/9901024; Lectures on light-front quantized 
field theory: Spontaneous symmetry breaking. Phase transition in $\phi^4$ theory, Proc. XIV
ENPC-Encontro Nacional de Particulas e Campos, Caxambu, MG, pp. 154-192, Sociedade
Brasileira de Física, Sao Paulo, SP, Brasil, 1993; hep-th/9312064; Nuovo Cimento A107
(1994) 549. See [8, 9, 10, 32] for the extensive list of references. 

[11] R.J. Perry, Light-front quantum chromodynamics, nucl-th/9901080; Hadrons ’94, Eds. V.
Herscovitz et al., World Scientific, Singapore, 1995.

[12] See [32, 10]. 

[13] D. Bigatti and L. Susskind, Review of matrix theory, hep-th/9712072; Phys. Lett. B425 (1998) 
351, hep-th/9711063. 

[14] E. Witten, Commun. Math. Phys. 92 (1984) 455. 

[15] P.P. Srivastava, Mod. Phys. Letts. A13 (1998) 1223; See also, P.P. Srivastava, in Geometry, 
Topology and Physics, Apanasov et. al. (Eds.), Walter de Gruyter & Co., Berlin, New York, 
1997, pp. 260; hep-th/9610149 and 9610044. 





[16] P.P. Srivastava, Phys. Letts. B448 (1999) 68; hep-th/9811225.

[17] P.P. Srivastava and S.J. Brodsky, Light-front quantized QCD in covariant gauge, SLAC-PUB-
8168; hep-ph/9906423.

[18] P.P. Srivastava, Nuovo Cimento A108 (1995) 35; see also [10].

[19] A.N. Mitra and S. Bhatnagar, Int. J. Mod. Phys. A7 (1992) 121. 

[20] A.N. Mitra, Phys. Lett. B (1999); hep-ph/9812404, and hep-ph/9901421.

[21] J. Carbonell et al., Phys. Rep. 400 (1998) 215.

[22] See, S.S. Schweber, Relativistic Quantum Field Theory, Row, Peterson and Co., New York, 
1961; J.D. Bjorken and S.D. Drell, Relativistic Quantum Fields, McGraw Hill, 1965; L.H. 
Ryder, Quantum Field Theory, Cambridge University Press, 2nd Edition, 1996.

[23] On the physical grounds we must require the cluster decomposition principle, which requires 
that distant experiments give uncorrelated results. See, S. Weinberg, in Conceptual foundations 
of quantum field theory, Ed. T.Y. Cao, Cambridge University Press, 1999; N.N. Bogolubov
et al., Introduction to Axiomatic quantum field theory, Benjamin, 1975.

The locality does not seem to be strictly required; the front form theory may show nonlocality
(Sec. 1.1) along the longitudinal direction even when the corresponding instant form theory
is formulated as a local theory. 

[24] See for example, I.N. Sneddon, Elements of Partial Differential Equations, McGraw-Hill, NY, 
1957, pp. 111-115 

[25] P.A.M. Dirac, Lectures in Quantum Mechanics, Belfer Graduate School of Science, Yeshiva 
University Press, New York, 1964; Can. J. Math. 2 (1950) 129; E.C.G. Sudarshan and N.
Mukunda, Classical Dynamics: a modern perspective, Wiley, NY, 1974. See also L. Faddeev
and R. Jackiw, Phys. Rev. Lett. 60 (1988) 1692. 

[26] S. Coleman, Commun. Math. Phys. 31 (1973) 259. 

[27] The LF components of the four-momentum are $k^\mu = (k^-,k^+,k^\perp)$ where $k^\pm = (k^0 \pm k^3)/\sqrt{2}$.
Here $k^-$ is the LF energy while $k^\perp$ and $k^+$ indicate the transverse and the longitudinal
components of the momentum respectively. For a free massive particle on the mass shell we
have the dispersion relation: $2k^+k^- = (k^{\perp 2} + m^2) > 0$ so that $k^\pm$ are both positive when
$k^0 > 0$ or both negative when $k^0 < 0$. It has no square root as found in $k^0 = \pm\sqrt{\vec k^2 + m^2}$. The
conservation of the total longitudinal momentum does not permit the excitation of massive
quanta by the LF vacuum which has vanishing longitudinal momentum. It should, however, be
noted that when dealing with the momentum space loop integrals, a significant contribution
may arise from such configuration in the integrand; the reason being that we have to deal with
the products of several distributions. The components $(k^1,k^2,k^3)$ in the instant form theory
on the other hand may take positive or negative values and the conventional theory vacuum
state may contain an arbitrary number of particles (and antiparticles) which may mix with
the vacuum state, with no particles, to form the ground state.

[28] P.P. Srivastava and E.C.G. Sudarshan, Phys. Rev. 110 (1958) 765. 

[29] E. Fermi, Prog. Theo. Phys. (Japan) 5 (1950) 570; Phys. Rev. 92 (1953) 452. 

[30] For massless particles the correlation ceases to exist at the point 0 since 2p+p“" = 

p^p-^ 0. 

[31] J. Barcelos Neto and C. Wotzasek, Europhys. Lett. 21 (1993) 511; R. Amorim and J. Barcelos 
Neto, Zeit. Phys. C 68 (1995) 513; F.P. Devecchi and H.O. Girotti, Phys. Rev. D 49 (1994)
4302 and refs, contained therein. 





[32] Such constraints on the potential, as illustrated by Dirac [1] in his paper, are required when we 
unify in relativistic theory the principles of special relativity and the principles of quantization. 
It is interesting to note that soon after in 1950-52 he formulated also the systematic method 
(Dirac procedure) for constructing Hamiltonian formulation for constrained dynamical system. 

That the constraint (3.1) resulted in nonlocal LF Hamiltonian and consequently gave a de- 
scription of SSB were pointed out in : P.P. Srivastava, On spontaneous symmetry breaking 
mechanism on the light-front quantized field theory, Ohio-State University preprint 91-0481, 
Slac database no. PPF-9148, November 1991 and the references cited therein. The broken 
continuous symmetry in 3 + 1 dimensions and the tree level Higgs mechanism is discussed in
Ohio-State University preprint 92-0012, SLAC database no. PPF-9202, December, 1991. Spontaneous
symmetry breaking mechanism on the light-front quantized field theory: Discretized
formulation, Ohio-State University preprint 92-0173, SLAC database no. PPF-9222, April 1992,
available as scanned copies on the Spires hep-th data base. See also the papers contributed to
XXVI Int'l Conference on High Energy Physics, Dallas, Texas, August 1992, AIP Conf. Proc.,
272 (1993) 2125, Ed. J.R. Sanford, database: cnsum c92/08/06, conf(uspires-slac), papers 135, 
136; University of Padova, Report No. DFPF/92/TH/58. 

[33] H.C. Pauli and S.J. Brodsky, Phys. Rev. D32 (1985) 2001. 

[34] P.P. Srivastava, Light-front quantization and Spontaneous Symmetry Breaking: Discretized formulation,
Hadron Physics 94, pp. 253, Eds. V. Herscovitz et al., World Scientific, Singapore,
1995; hep-th/9412204, 205.

[35] G. Parisi, Statistical Field Theory, Addison-Wesley, 1988.

[36] B. Simon and R.B. Griffiths, Commun. Math. Phys. 33 (1973) 145. 

[37] See, J. Leite Lopes, Gauge Field Theories, Pergamon Press, 1981. 

[38] N. Marcus and J. Schwarz, Phys. Lett. 115B (1982) 111; D.J. Gross, J.A. Harvey, E. Martinec,
and R. Rohm, Phys. Rev. Lett. 54 (1985) 502. 

[39] X.G. Wen, Phys. Rev. Lett. 64 (1990) 2206; M. Stone, Phys. Rev. B41 (1990) 212. 

[40] W. Siegel, Nucl. Phys. B238 (1984) 307. 

[41] R. Floreanini and R. Jackiw, Phys. Rev. Lett. 59 (1987) 1873. 

[42] P.P. Srivastava, Phys. Rev. Lett. 63 (1989) 2791. 

[43] P.P. Srivastava, in preparation.

[44] M.E. Costa and H.O. Girotti, Phys. Rev. Lett. 60 (1988) 1771. 

[45] A. Tseytlin and P. West, Phys. Rev. Lett. 65 (1990) 541. 

[46] See Barcellos et al. in [31]. 

[47] W.T. Kim, J.K. Kim, and Y.J. Park, Phys. Rev. D44 (1991) 563. 

[48] C. Imbimbo and A. Schwimmer, Phys. Lett. B193 (1987) 455; J.M.F. Labastida and M. 
Pernici, Phys. Rev. Lett. 59 (1987) 2511. 

[49] H.O. Girotti, M. Gomes, and V.O. Rivelles, Phys. Rev. D45 (1992) R3329; D.S. Kulshreshtha 
and H.J.W. Muller-Kirsten, Phys. Rev. D45 (1992) R393. 

[50] In the conventional metric, $\eta^{\mu\nu}$ = diag (1, -1, -1, -1), $\mu, \nu$ = 0, 1, 2, 3, the $\gamma$ matrices are

defined as usual, {7^,7*"} = 2r}^^, 7^7^7^ = 7^^ C'y^C~^ = -7^^, C = ^7^7®, S3 = S3 = 
^^72, S2 = Si = *7^7®, 75 = n°7VT* = 75^ 75^ = I, [75, s] = 0 etc. No explicit 

representation of $\gamma^\mu$ matrices is used in our discussions.





[51] Here it is understood that an unsymmetrical expression like is to be replaced by its 

symmetrized form 5^ ^7^^] /2. It is convenient to work on the LF in terms of the 

projected four spinors ‘ip±^ 

[52] J. Tiomno, Il Nuovo Cimento, 1 (1956) 226.

[53] See R. Jackiw in ref. [7] 

[54] P.P. Srivastava, Europhys. Lett. 33 (1996) 423 and LF dynamics of Chern-Simons systems, 
ICTP, Trieste preprint IC/94/305; hep-th/9412239. 

[55] See, A. Dhar, G. Mandal and S.R. Wadia, Phys. Lett. B329 (1994) 15; D.J. Gross and I.
Klebanov, Nucl. Phys. B352 (1990) 671.

[56] See, S.J. Brodsky et al. [8] and references cited therein. 

[57] See for example, E. Abdalla, M.C. Abdalla and K. Rothe, Non-Perturbative Methods in Two 
Dimensional Quantum Field Theory, World Scientific, Singapore, 1991 and the references 
cited therein; D. Boyanovsky, Nucl. Phys. B294 (1987) 223; A. Bassetto, L. Griguolo, and P. 
Zanca, Phys. Rev. D50 (1994) 1077. 

[58] R. Jackiw, R. Rajaraman, Phys. Rev. Lett. 54 (1985) 1219, 2060(E). 

[59] A. Bassetto, G. Nardelli, and R. Soldati, Yang-Mills theories in algebraic noncovariant gauges,
World Scientific, 1991; G. Leibbrandt, Noncovariant gauges, Quantization of Yang-Mills and
Chern-Simons theory in axial-type gauges, World Scientific, 1994.

[60] I.A. Batalin and I.V. Tyutin, Int. J. Mod. Phys. A6 (1991) 3255.

[61] P.P. Srivastava, BRS-BFT quantization of the CSM on the light-front, paper LP-002, Session
P17, Intl. Symp. on Lepton-Photon Interactions- LP'97, July 1997, Hamburg. Available as .ps
file on the DESY database or as hep-th/9901024.



18. Gauge Symmetry in Chiral Electrodynamics


D.S.Kulshreshtha * 

Department of Physics & Astrophysics,

University of Delhi, Delhi-110007, India 


Abstract 

The constrained dynamics and local vector gauge invariance of the various field theory
models describing chiral electrodynamics in one-space one-time dimension are revisited.


1. Introduction 

The one-space one-time ((1+1)-) dimensional field theories [1-26] are the simplest toy models which
are exactly soluble and renormalizable, and their study has given rise to a very important concept,
namely, that any fermion (plus boson if desired) field theory always has its boson equivalent
field theory, called the bosonized field theory [3]. As a consequence of this equivalence, called
bosonization (or fermionization), many interesting features of two-dimensional (2D) field theories
have been revealed [1-26]. This concept of bosonization (originally discovered in the context of
2D field theories) has also been very useful in the understanding of four-dimensional phenomena
that may be described by an effective 2D theory [27]. The rather convincing demonstration of
quark confinement in exactly soluble 2D models is perhaps one of the best known examples of
the successes of this field [3]. Another outstanding idea discovered in this field is that the vector
gauge boson of chiral electrodynamics (CED) has an anomaly-generated mass [7,8], in contrast
to the picture of spontaneous symmetry breaking with the Higgs mechanism for the vector boson
mass generation in the standard model. The 2D theories could be physically relevant in many
circumstances. The 2D conformally invariant quantum field theories, e.g., describe the long range
behaviour of correlations in planar statistical systems undergoing second order phase transitions
[28]. There are also some physical systems whose motion is dynamically constrained to lie in a
subspace of the full space-time, and a lower dimensional model adequately describes the reduced
dynamics. Linear polymers like polyacetylene, e.g., have been described by the use of an anomalous
2D field theory [29]. The so-called anomalous field theories are, in fact, the theories that do not
possess the gauge symmetry [30], so that they lack the gauge invariance. We take up these
ideas in detail in the next section. The last (but not least) reason is that these 2D field theory
models are the basic building blocks of some larger theories in the field of string theories [29].
The plan of the article is as follows. In Secs. 2 and 3, we discuss some basics about the concepts
related to the gauge symmetry and CED. In Secs. 4, 5 and 6, we consider some specific theories,
namely, the chiral Schwinger models (CSM’s) describing 2D-CED. In Sec. 7, we briefly discuss the
self-dual fields called the chiral bosons (CB’s) which represent the chiral Fermi theories, and we
finally summarise the article in the last section.


2. The Gauge Symmetry 

The existence of symmetries in physics leads to rather important conservation laws. The so-called
symmetries are, in fact, the transformations that leave the action of the theory unchanged or
invariant [30]. One way of classifying the symmetries that exist in nature is in terms of the so-called


* Email : dsk@physics-du.ac.in 





external or space-time symmetries and the internal symmetries. External symmetries include
space and time translational invariance (leading to the conservation of linear momentum and
energy) as well as the invariance under Lorentz transformations including rotations and the boosts
(leading to the conservation of angular momentum and Lorentz-boosts). In contrast to the above
examples, the example of gauge invariance, which leads to the existence of conserved currents and
conserved charges, is an example of the so-called internal symmetries [30]. In the present article,
we consider the implications of the continuous (and in particular, the internal) symmetries
in the framework of Lagrangian field theories. For this purpose, let us consider a field theory defined
by the action integral functional :

$S = \int d^{D}x \, \mathcal{L}(\phi_k, \partial_\mu\phi_k)$ ; $\mu = 0, 1, \ldots, (D-1)$ (1)

where $\mathcal{L} = \mathcal{L}(\phi_k, \partial_\mu\phi_k)$ is the Lagrangian density (LD) of the theory and D is the dimension
of the space-time in the Minkowski space. We now consider the transformation of the coordinates

$x_\mu \to x'_\mu = x_\mu + \delta x_\mu$ (2)

The space-time point $x'$ is in general a function of $x$, and $\delta x_\mu$ (= $x'_\mu - x_\mu$) includes infinitesimal
translations as well as rotations. This transformation induces a transformation in the fields $\phi_k(x)$

$\phi_k(x) \to \phi'_k(x) = \phi_k(x) + \delta\phi_k(x)$ (3)

where $\delta\phi_k(x)$ is a small symmetry transformation. For the case $x' = x$, the transformation is
called an internal transformation. In this case the space-time point remains unchanged so that one
has $\delta x_\mu = 0$ for an internal transformation. In this case ($x'_\mu = x_\mu$) one has :

$\delta\phi_k(x) = \phi'_k(x) - \phi_k(x) = [i \beta^a T_a \phi_k(x)]$ ; $a = 1, 2, \ldots, N$ (4)

where $T_a$ are the generators of the symmetry transformation ($\delta\phi(x)$) of $\phi(x)$ and $\beta^a$ are the N
gauge parameters corresponding to N independent transformations. Further, if $\beta^a = \beta^a(x_\mu)$ is an
arbitrary function of $x_\mu$ then the symmetry transformation is called a local transformation, whereas
$\beta^a$ = constant leads to the so-called global symmetry transformations. Also with any symmetry
transformation there exists, in general, an associated Noether current $J^a_\mu(x)$, which is conserved if
the symmetry is exact so that $\delta S = 0$, giving rise to the continuity equation :

$\partial^\mu J^a_\mu(x) = 0$ (5)

for the conserved Noether current $J^a_\mu(x)$ (which is called the vector gauge current (VGC) when
the symmetry is a vector gauge symmetry (VGS)) leading in turn to the existence of the conserved
(global) charge $Q^a(t)$ :

$Q^a = Q^a(t) = \int d^{D-1}x \, J^a_0(x)$ (6)

One of the important experimental tests of a theory is whether the conserved quantities it
predicts are indeed conserved. In fact, the identification of the transformations of the fields that
leave the action of the theory invariant leads to important predictions of the theory without
solving the equations of motion. The proper intuition about (5) can be obtained from the usual
electrodynamics (ED), where the electromagnetic current satisfies a continuity equation which says
that the charge is neither created nor destroyed locally [30]. Equation (5) only generalizes this result
of ED to other kinds of charges. A crucial feature of field theories with local gauge invariance (LGI)
is that for each independent internal local vector gauge symmetry (LVGS) there exists a vector gauge
field and its corresponding vector boson particle that mediates (or carries) the force between the
charged matter fields. For ED the gauge field is the electromagnetic vector potential $A_\mu(x)$ and its
quantum particle is the massless spin-1 photon. In fact, all interactions in general are mediated by
vector bosons originating from the local symmetries which dictate the exact form of interaction. The
interaction Lagrangian must be of the form [e $J^\mu(x) A_\mu(x)$], where the coupling constant e is
defined as the strength with which the vector boson $A_\mu(x)$ interacts with the VGC $J^\mu(x)$ [30].
Maxwell’s unification of electricity and magnetism can, in fact, be viewed as the discovery that ED is
described by the simplest possible LVGS implying the corresponding invariance. Maxwell’s addition
of the displacement current to the field equations (Maxwell’s equations), which was made to ensure
the conservation of electromagnetic current, turns out to be equivalent to imposing LGI on the
Lagrangian of ED [30].
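
The step from the continuity equation (5) to the conserved charge (6) can be made explicit. The following is a short sketch; it assumes (this is not stated in the text) that the spatial components of the current fall off fast enough at spatial infinity for the surface term to vanish, and it writes the current with a lower space-time index and an upper internal index:

```latex
\begin{align}
\frac{dQ^a}{dt}
  &= \int d^{D-1}x \; \partial^0 J^a_0(x) \\
  &= -\int d^{D-1}x \; \partial^i J^a_i(x)
     \qquad \text{(using } \partial^\mu J^a_\mu = 0 \text{)} \\
  &= 0 \qquad \text{(surface term at spatial infinity).}
\end{align}
```

If the right hand side of (5) does not vanish, the same steps show that the charge changes in time at a rate given by the spatial integral of that right hand side.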





Although the above discussions are classical, the results are usually correct in the quantum
theory derived from a classical Lagrangian. In some cases, however, the quantum corrections
contribute a non-zero term to the right hand side of the continuity equation (5), and these terms
are called anomalies. For global symmetries these anomalies can often improve the predictions
from Lagrangians that have too much symmetry when compared with data, because anomalies
wreck the symmetry that was never present in the quantum theory even though the classical
Lagrangian had the symmetry. However, for local symmetries (LVGS) the presence of anomalies is
rather disastrous [30,31]. A quantum theory is locally symmetric only if its gauge currents satisfy
the continuity equation (5); otherwise the local anomalies simply change the theory. In view of
this, one has to deal rather carefully with this kind of gauge anomalous field theories [30-31]. The
main object of the present article is to investigate some of the gauge anomalous theories in
(1+1)-dimension describing the CED [1-26] and to show how one could try to restore the LVGS or
the LGI to the otherwise gauge anomalous theory. An understanding of the constrained dynamics
of a system is usually found to be very helpful in this context. The LVGS of a dynamical system,
as we shall see in the following, is very intimately connected with the constrained dynamics of
the dynamical system [31], and the Hamiltonian formulation à la Dirac [32] is particularly suited
to discover the dynamical generators of internal symmetries of a constrained dynamical system.
The gauge-invariant (GI) systems in general play an important role in the theoretical description
of the fundamental laws of nature. In fact, most of the physical systems of interest, e.g., the
ED, quantum ED (QED), QCD, electro-weak theory and the gravity theory, are all constrained
systems [30-31]. A constrained dynamical system is, in fact, one which is defined in terms of an
over-determined set of coordinates, and the Hamiltonian formulation makes it easier to keep track
of all the coordinates, canonical and redundant, where the complete set of constraints emerges
naturally. It may be important to mention here that the nature of the matrix of the Poisson
brackets (PB’s) of the constraints of the theory, as we shall see in the later sections, determines
the nature of the set of constraints of the theory and also whether the theory is GI or not.
Thus if the above matrix is singular, then the set of constraints of the theory is first-class and
the theory is GI (and if this matrix is a null matrix (and therefore also singular) then the
theory is a true or bona fide GI theory). On the other hand, if this matrix is non-singular then
the set of constraints of the theory is second-class and the theory is gauge-non-invariant (GNI).
This, in fact, could even be taken as a criterion for differentiating the GI systems from the GNI
ones [4,5,9,10,13,14,18,19,21,22,25,26]. These GI systems could then be quantized under some
appropriate gauge choices or the gauge-fixing conditions (GFC’s) [30-31]. Further, in the usual
Hamiltonian formulation of a GI theory under some GFC’s, one necessarily destroys the gauge
invariance of the theory by fixing the gauge (which converts a set of first-class constraints into a
set of second-class constraints, implying a breaking of LGI under the gauge-fixing). To achieve
the quantization of a GI theory such that the gauge invariance of the theory is maintained even
under gauge-fixing, one goes to a more generalized procedure called the Becchi-Rouet-Stora and
Tyutin (BRST) formulation [33]. In the BRST formulation of a GI theory, the theory is rewritten
as a quantum system that possesses a generalized gauge invariance called the BRST symmetry
[33]. For this, one enlarges the Hilbert space of the GI theory and replaces the notion of the gauge
transformation, which shifts operators by c-number functions, by a BRST transformation, which
mixes operators having different statistics. In view of this, one introduces new anti-commuting
variables called the Faddeev-Popov ghost and anti-ghost fields, which are Grassmann numbers
on the classical level and operators in the quantized theory, and a commuting variable called
the Nakanishi-Lautrup field. In the BRST formulation, one thus embeds a GI theory into a
BRST invariant system. The quantum Hamiltonian of the system (which includes the gauge-fixing
contribution) commutes with the BRST charge operator as well as with the anti-BRST charge
operator, and the new symmetry of the quantum system (the BRST symmetry) that replaces the
gauge invariance is maintained (even under the gauge-fixing); hence projecting any state onto
the sector of BRST and anti-BRST invariant states yields a theory which is isomorphic to the
original GI theory [31,33]. The unitarity and consistency of the BRST-invariant theory described
by the gauge-fixed quantum Lagrangian is guaranteed by the conservation and nilpotency of the
BRST charge.
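
The criterion stated above in terms of the matrix of the PB’s of the constraints can be encoded as a small decision rule. The following is a minimal sketch and is not taken from the references: the function names are hypothetical, and the toy numerical matrices only stand in for the coefficient structure of a PB matrix (in a field theory the entries are distributions such as a delta function and its derivatives). It uses exact rational arithmetic to test whether the matrix is null, singular or non-singular, and returns the classification as worded in the text.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant of a square matrix by Gaussian elimination (exact rationals)."""
    n = len(matrix)
    rows = [[Fraction(x) for x in row] for row in matrix]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


def classify_constraints(pb_matrix):
    """Apply the criterion of the text to a (toy) matrix of Poisson brackets."""
    if all(entry == 0 for row in pb_matrix for entry in row):
        return "first-class (null matrix): bona fide GI"
    if determinant(pb_matrix) == 0:
        return "first-class (singular matrix): GI"
    return "second-class (non-singular matrix): GNI"


# Toy inputs (illustrative only, not the PB matrices of any model in the text).
NULL_MATRIX = [[0, 0], [0, 0]]
SINGULAR_MATRIX = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
NONSINGULAR_MATRIX = [[0, 1], [-1, 0]]

for name, m in [("null", NULL_MATRIX),
                ("singular", SINGULAR_MATRIX),
                ("non-singular", NONSINGULAR_MATRIX)]:
    print(name, "->", classify_constraints(m))
```

The sketch deliberately reproduces the wording of the criterion as given in the text; it makes no attempt to separate a singular but non-null matrix into first-class and second-class subsets.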





Also, the relativistic quantum dynamics of a physical system could be studied either in the
conventional formulation on the hyperplane $x^0$ = constant, called the instant-form (IF) [34], or on
the hyperplanes of the light-front : $x^+$ = constant, called the front-form (FF) à la Dirac [34].

In the present work, we study the theories describing CED in both the forms of dynamics.


3. Chiral Electrodynamics 

The Schwinger model describing ED in one-space one-time dimension with massless fermions and
its chiral versions, called the chiral Schwinger models (CSM’s), have been of very wide interest
in recent years [1-26]. The CSM describes a massless Dirac field $\psi(x,t)$ in two dimensions
with only one of its chiral components coupled to a U(1) gauge field $A_\mu(x,t)$ [7,10]. The first
CSM was introduced by Hagen [6] as a new example of an exactly solvable field theory in the
(1+1)-dimension. In this model the gauge field with a bare mass was considered [6]. Jackiw and
Rajaraman (JR) [7,8], later on, considered the gauge anomalous theory without the bare mass term. By
studying the field equations and propagator obtained from the effective gauge field action, they
concluded [7,8] that the theory was not gauge invariant, but was unitary and amenable to particle
interpretation [7,8]. They also found that the vector gauge boson necessarily acquires a mass
when consistency and unitarity are demanded [7,8]. One of the remarkable achievements of the
studies of such theories, as mentioned at the beginning, has been the development of the
fermion-boson correspondence in 2D quantum field theories. The other important achievement has been
in the field of understanding the phenomena of gauge anomalies and the gauge anomalous field
theories [7-22]. The JR-CSM [7,8] is found to admit exact solutions in a positive metric Hilbert
space, respecting unitarity, provided the JR regularization parameter a (introduced in Ref. [7]) is
restricted to the range a > 1 [7,8], for which the theory is sensible. In fact, the model is seen to
yield a sensible theory for a class of regularizations [7,10]. The spectrum of the theory depends
on the regularization in a crucial way and it is seen to contain, for a > 1, a massive photon in
addition to a massless fermion, and for a = 1, only a massless fermion [7,8]. The JR-CSM is seen
to lack the LVGS and is therefore gauge anomalous. It is rather well known that corresponding
to a GNI theory, a GI theory could be constructed by the inclusion of the so-called Stueckelberg
term (ST)/Wess-Zumino term (WZT) [35] in the action of the GNI theory. The JR-CSM has
been studied rather widely in recent years [7-10]. In particular, the Hamiltonian and BRST
formulations of the GI versions of this theory have been studied in the IF in Refs. [8,9], and in the
FF in Ref. [10], where the GI versions of this theory have also been constructed by the inclusion of
an appropriate ST/WZT. The physical contents of the original GNI theory are also recovered under
a special choice of gauge and the equivalence of the quantized GI and GNI theories is established.

Very recently, Mitra [11] has considered a new regularization [11,12] which does not belong
to the above class. With this regularization [11,12] the photon is once again massive and the
massless fermion present in the theory has (unlike in the JR-regularization) a chirality opposite to
that entering the interaction with the electromagnetic field [11,12]. Further, this regularization,
being in accordance with Faddeev’s picture [35] of anomalous gauge theories, has been called
the Faddeevian regularization by Mitra [11] (see also [11-14]).

If the matrix of the PB’s of the constraints of the theory becomes non-singular because of the
non-vanishing PB of the Gauss law constraint of the theory with itself (called Faddeev’s anomaly
[35]), so that the constraints become second-class and the theory becomes GNI (or it loses LGI)
because of this Faddeev’s anomaly, then the theory fits into Faddeev’s scenario [35]. In the
CSM with the Faddeevian regularization considered by Mitra [11,12], the Faddeevian mechanism
works because the constraints of the theory become second-class through Faddeev’s anomaly
for the Gauss law constraint of the theory. The above CSM with the Faddeevian regularization
has been studied in the IF in Refs. [11,12,14], where the IF theory is seen to be GNI, possessing
a set of three second-class constraints. The corresponding FF theory is also found to be GNI [13],
possessing a set of three second-class constraints. The Hamiltonian and BRST formulations of the
GI versions of this theory have been studied in the IF in Ref. [14], and in the FF in Ref. [13]. The
Mitra-CSM with the Faddeevian regularization has a mass-like term for the vector gauge boson





$A_\mu$ different from those of the class of models called JR-CSM’s, and this may be taken as a
signature of the new regularization [11-14]. This theory, in contrast with the JR-CSM, is seen to
possess a self-dual boson which could also be thought of as a chiral fermion.

Yet another important CSM is due to Harada [15], who has, in particular, constructed a gauged
Floreanini-Jackiw [16] action which describes a CSM [15-19] in terms of chiral bosonization (as
explained below) [15]. This model is seen to possess a set of three second-class constraints and
consequently it describes a GNI or a gauge anomalous theory. An appropriate ST/WZT for
this GNI theory in the IF has been calculated in Ref. [18], where the Hamiltonian and BRST
formulations of the resulting GI (and consequently gauge non-anomalous) theory have also been
studied. Now while solving the minimal CSM (the CSM where the right-handed (or equivalently
the left-handed) fermion is absent), one is faced with a technical difficulty. In order to construct
an operator solution, one usually makes use of the knowledge of a solution of the bosonized model.
Because the minimal CSM contains only the left-handed fermion, one is not able to obtain its
bosonized form by the usual bosonization, and one is led to consider its chiral bosonization [15].
Harada [15] has considered the chiral bosonization of the minimal CSM and has obtained a gauged
action [15] corresponding to the Floreanini-Jackiw action [16], from the conventional bosonic one
of the CSM, by imposing the chiral constraint ($\pi - \phi' \approx 0$ (cf. Ref. [15])) in phase space. After
obtaining this equivalent bosonic action, the equations of motion have been solved in the GNI
formulation. The bosonic solution is found to be completely satisfactory. Harada has further
constructed [15] a fermionic operator solution of the minimal CSM in covariant gauges in the
GI formulation [15]. He found a free chiral fermion (self-dual chiral boson) and a free massive
scalar (boson) with the desired mass, as physical asymptotic fields in a positive-definite Hilbert
space. The existence of a physical free chiral fermion distinguishes the CSM from the (vector)
Schwinger model [1-5] (which has no physical asymptotic fermion) and implies that the fermion is
not confined. This minimal CSM is also found to be completely consistent like the usual JR-CSM
[6-10]. Harada’s CSM in the FF [19] is seen to possess a set of three first-class constraints
and consequently describes a GI theory, in contrast to the IF theory [15,18] which is GNI owing
to the second-class nature of the set of constraints of the theory. The Hamiltonian and BRST
formulations of this FF theory have been studied in Ref. [19]. In section 6, we consider this
theory in the IF as well as in the FF in detail.


4. The Generalized Schwinger Model 

The generalized Schwinger model (GSM) which describes the QED in (1+1)-dimension with
massless fermions is described by the LD [1-5,7-10] :

+ |ejr ^>(1 - ~ (7a) 

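$g_1 = \frac{1}{2}(e_L - e_R)$ ; $g_2 = \frac{1}{2}(e_L + e_R)$ ; (7b)

$\gamma_5 = \gamma^0\gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ , $\gamma^\mu\gamma^5 = -\epsilon^{\mu\nu}\gamma_\nu$ (7c)

$g^{\mu\nu} := \mathrm{diag}(+1, -1)$ ; $\epsilon^{01} = +1$ ; $\mu, \nu = 0, 1$ (7d)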

which is equivalent to its bosonized form [1-5,7-10] : 

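$\mathcal{L}_B = [\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + (g_1 g^{\mu\nu} - g_2 \epsilon^{\mu\nu})\partial_\mu\phi A_\nu - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{M^2}{2}A_\mu A^\mu]$ (8)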
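
The two-dimensional conventions of (7c) and (7d) can be checked numerically. The sketch below is only an illustration: the explicit 2x2 representation (gamma^0 equal to the Pauli matrix sigma_1 and gamma^1 equal to -i sigma_2) is an assumption chosen so that gamma^0 gamma^1 is diag(1, -1) as in (7c); the text itself does not specify a representation, and all function names are hypothetical.

```python
# Assumed representation (not given in the text):
# gamma^0 = sigma_1, gamma^1 = -i*sigma_2, so that gamma^0 gamma^1 = diag(1, -1).
G0 = [[0, 1], [1, 0]]
G1 = [[0, -1], [1, 0]]
IDENTITY = [[1, 0], [0, 1]]
METRIC = [[1, 0], [0, -1]]   # g^{mu nu} = diag(+1, -1), eq. (7d)
EPS = [[0, 1], [-1, 0]]      # epsilon^{01} = +1, antisymmetric


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


GAMMA_UP = [G0, G1]
# Lower the index with the diagonal metric: gamma_nu = g_{nu nu} gamma^nu.
GAMMA_DOWN = [scale(METRIC[n][n], GAMMA_UP[n]) for n in range(2)]
GAMMA5 = matmul(G0, G1)


def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} for all mu, nu."""
    for mu in range(2):
        for nu in range(2):
            anticommutator = add(matmul(GAMMA_UP[mu], GAMMA_UP[nu]),
                                 matmul(GAMMA_UP[nu], GAMMA_UP[mu]))
            if anticommutator != scale(2 * METRIC[mu][nu], IDENTITY):
                return False
    return True


def chiral_identity_ok():
    """Check gamma^mu gamma_5 = -epsilon^{mu nu} gamma_nu, eq. (7c)."""
    for mu in range(2):
        lhs = matmul(GAMMA_UP[mu], GAMMA5)
        rhs = [[0, 0], [0, 0]]
        for nu in range(2):
            rhs = add(rhs, scale(-EPS[mu][nu], GAMMA_DOWN[nu]))
        if lhs != rhs:
            return False
    return True


print("gamma_5 =", GAMMA5)
print("Clifford algebra holds:", clifford_ok())
print("gamma^mu gamma_5 = -eps^{mu nu} gamma_nu holds:", chiral_identity_ok())
```

Any other representation related to this one by a similarity transformation that keeps gamma_5 diagonal would serve equally well; the check only concerns the mutual consistency of the metric, the sign of the antisymmetric tensor and the definition of gamma_5.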

In the rest of the article we work with the above conventions. The mass term for $A_\mu$ here
arises from the regularization ambiguities associated with the definition of the current. The case
of CSM is obtained from the GSM by setting $g_1 = g_2 = g$ (i.e. $e_R = 0$) ; and $M^2 = g^2 a$, where
a is the JR-regularization parameter. The case of the vector Schwinger model (VSM) is obtained by
setting $g_1 = 0$, $g_2 = g$ (i.e. $e_L = e_R$) ; and M = 0. Here $e_L = e_R$ implies a vector-like theory in
this case [1-5]. Also in the case of VSM, demanding the regularization to be gauge-invariant fixes
a = 0 i.e. M = 0. On the other hand, in CSM [7-10], no choice of a can make the theory GI and
therefore a is left as a free parameter. The Hamiltonian and BRST formulations of the vector [1-5]
and chiral [7-10] theories in the IF have been studied in Refs. [4,9] and in the FF in Refs. [5,10].
The VSM in the bosonized form is described by the LD [1-5] :

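$\mathcal{L} := [\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - g\,\epsilon^{\mu\nu}\partial_\mu\phi A_\nu - \frac{1}{4}F_{\mu\nu}F^{\mu\nu}]$ ; $\mu, \nu = 0, 1$ (9)
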



Equation (9) describes the theory in the IF [1,4] and is seen to possess a set of two constraints
: $\Omega_1 = \Pi_0 \approx 0$ and $\Omega_2 = (E' + g\phi') \approx 0$, where $\Omega_1$ is a primary constraint and $\Omega_2$ is a secondary
constraint. Here, $\Pi_0$, $E (= \Pi^1)$ and $\Pi$ are the momenta canonically conjugate respectively to $A_0$,
$A_1$ and $\phi$. The divergence of the VGC for the IF theory is seen to vanish, implying that the theory
possesses (at the classical level) a VGS. The matrix of the PB’s of these constraints is seen to be
singular, implying that the set of the constraints is first-class and that the theory is GI. The action
of the theory is, in fact, seen to be invariant under the local vector gauge transformation (LVGT) :
$\delta\phi = 0$, $\delta A_0 = \partial_0\beta$, $\delta A_1 = \partial_1\beta$, $\delta\Pi = -g\,\partial_1\beta$, $\delta\Pi_0 = \delta E = 0$. Here $\beta = \beta(x,t)$ is an
arbitrary function of the coordinates. The theory could thus
be quantized under some gauge choice. The Hamiltonian and BRST formulations of this theory
in the IF have been studied in Ref. [4]. In the FF [34] one defines the light-cone coordinates :
$x^\pm := (x^0 \pm x^1)/\sqrt{2}$, and then writes all the quantities involved in the LD in terms of $x^\pm$ instead
of $x^0$ and $x^1$. After doing this the LD in the FF reads [5] :

$\mathcal{L} = [(\partial_+\phi)(\partial_-\phi) + g(\partial_+\phi)A^+ - g(\partial_-\phi)A^- + \frac{1}{2}(\partial_+A^+ - \partial_-A^-)^2]$ (10a)

$A^\pm = A_\mp = (A^0 \pm A^1)/\sqrt{2}$ and $\partial_\pm\phi = (\partial_0\phi \pm \partial_1\phi)/\sqrt{2}$ (10b)

The VGC is seen to be conserved, i.e., $\partial_\mu J^\mu := [\partial_+ J^+ + \partial_- J^-] = 0$, for the above theory,
implying that the theory possesses (at the classical level) a VGS. The theory is seen to possess a
set of three constraints: $\chi_1 = \Pi^+ \approx 0$ ; $\chi_2 = (\Pi - \partial_-\phi - gA^+) \approx 0$ ; $\chi_3 = (\partial_-\Pi^- - g\,\partial_-\phi) \approx 0$,
where $\chi_1$ and $\chi_2$ are the primary constraints and $\chi_3$ is a secondary constraint. Here $\Pi^+$, $\Pi^-$ and
$\Pi$ are the momenta canonically conjugate respectively to $A^-$, $A^+$ and $\phi$. The matrix of the PB’s
of the constraints $\chi_i$ is seen to be singular, implying that the set of constraints is first-class and
that the theory described by the LD (10a) is GI. The action of the
theory is, in fact, seen to be invariant under the LVGT [5] : $\delta A^+ = \partial_-\beta$, $\delta A^- = \partial_+\beta$, $\delta\phi = 0$,
$\delta u = \partial_+\partial_+\beta$, $\delta v = 0$ ; $\delta\Pi^+ = 0$, $\delta\Pi^- = 0$, $\delta\Pi = g\,\partial_-\beta$ ; $\delta\Pi_u = 0$, $\delta\Pi_v = 0$, where $\beta = \beta(x^-, x^+)$
is an arbitrary function of the coordinates. The theory could therefore be quantized under some
appropriate gauge choice.
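
As a consistency check on the front-form expressions, the instant-form LD of the VSM can be rewritten term by term in light-cone variables. The sketch below assumes the conventions $x^\pm = (x^0 \pm x^1)/\sqrt{2}$, $A^\pm = (A^0 \pm A^1)/\sqrt{2}$, $g^{\mu\nu} = \mathrm{diag}(+1,-1)$ and $\epsilon^{01} = +1$; the placement of the indices on $A^0$, $A^1$ is an assumption inferred from consistency, and $x^+$ is taken as the time variable when reading off the canonical momenta:

```latex
\begin{align}
\tfrac12\,\partial_\mu\phi\,\partial^\mu\phi
  &= \tfrac12\left[(\partial_0\phi)^2 - (\partial_1\phi)^2\right]
   = (\partial_+\phi)(\partial_-\phi), \\
-g\,\epsilon^{\mu\nu}\partial_\mu\phi\,A_\nu
  &= -g\left[(\partial_0\phi)A_1 - (\partial_1\phi)A_0\right]
   = g(\partial_+\phi)A^+ - g(\partial_-\phi)A^-, \\
-\tfrac14 F_{\mu\nu}F^{\mu\nu}
  &= \tfrac12\left(\partial_0 A_1 - \partial_1 A_0\right)^2
   = \tfrac12\left(\partial_+A^+ - \partial_-A^-\right)^2, \\
\Pi   &= \frac{\partial\mathcal{L}}{\partial(\partial_+\phi)} = \partial_-\phi + gA^+, \qquad
\Pi^+ = \frac{\partial\mathcal{L}}{\partial(\partial_+A^-)} = 0, \qquad
\Pi^- = \frac{\partial\mathcal{L}}{\partial(\partial_+A^+)} = \partial_+A^+ - \partial_-A^- .
\end{align}
```

The expressions for $\Pi^+$ and $\Pi$ contain no derivative with respect to $x^+$, so each of them yields a primary constraint, while the expression for $\Pi^-$ can be solved for $\partial_+A^+$ and gives no constraint.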

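The JR-CSM in the IF is described in the bosonized form by the LD (with $\mu, \nu$ = 0, 1) [7-9] :

$\mathcal{L} = [\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + e(g^{\mu\nu} - \epsilon^{\mu\nu})\partial_\mu\phi A_\nu - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}a e^2 A_\mu A^\mu]$ (11)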

In (11), the first term corresponds to a massless boson, which is equivalent to a massless fermion.
The second term represents the chiral coupling of this fermion to the electromagnetic field. The
third term is the kinetic energy term, and the last term is the mass term for the vector gauge boson
and contains the signature of regularization, where a is the JR-regularization parameter. In the
following we set the coupling constant e = 1. For the case a = 1, the above theory is seen
to possess a set of four second-class constraints [7-9] : $\Omega_1 = \Pi_0 \approx 0$ ; $\Omega_2 = (E' + \phi' + \Pi + A_1) \approx 0$ ;
$\Omega_3 = E \approx 0$ ; $\Omega_4 = (-\Pi - \phi' - 2A_1 + A_0) \approx 0$, where $\Omega_1$ is a primary constraint and $\Omega_2$, $\Omega_3$, $\Omega_4$
are the secondary constraints. For the case $a \neq 1$, however, the theory possesses a set of only two
second-class constraints : $C_1 = \Pi_0 \approx 0$ ; $C_2 = [E' + \phi' + \Pi + A_1 + (a-1)A_0] \approx 0$. As a consequence of
the second-class nature of the constraints $\Omega_i$ (for the case a = 1) and of the constraints $C_i$ (for the
case $a \neq 1$), the model describes in the IF of dynamics, for both the cases a = 1 and $a \neq 1$, a GNI
theory [7-9]. For constructing a GI theory corresponding to the GNI model $\mathcal{L}^N$, we calculate the
appropriate ST/WZT [35] $\mathcal{L}^S$ for the theory, the addition of which to $\mathcal{L}^N$ restores the GI to the
theory. The ST/WZT for the JR-CSM, e.g., for the case a = 1, is obtained by enlarging the Hilbert
space of the GNI theory, by introducing a new field $\theta$ called the Stueckelberg/Wess-Zumino field
through the following redefinition of fields $\phi$ and $A^\mu$ in $\mathcal{L}^N$ : $\phi \to (\phi - \theta)$ and $A^\mu \to (A^\mu + \partial^\mu\theta)$.
The ST/WZT thus obtained is [7-9] :

$\mathcal{L}^S = (\dot{\theta}A_1 - \theta' A_0)$. (12)
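
The Stueckelberg/Wess-Zumino term can be worked out explicitly from the field redefinition quoted above. The following is a sketch for a = 1 and e = 1, assuming the conventions $g^{\mu\nu} = \mathrm{diag}(+1,-1)$ and $\epsilon^{01} = +1$; the labels $\mathcal{L}^N$ (for the GNI LD of the JR-CSM) and $\mathcal{L}^S$ (for the added term) are used here only for illustration, and the symbol $\simeq$ denotes equality up to a total derivative. The three brackets collect the changes of the scalar kinetic term, the chiral coupling and the mass term respectively; the field-strength term is unchanged by the redefinition.

```latex
\begin{align}
\mathcal{L}^S
  &\equiv \mathcal{L}^N(\phi-\theta,\; A^\mu+\partial^\mu\theta) - \mathcal{L}^N(\phi,\; A^\mu) \\
  &= \left[-\partial_\mu\phi\,\partial^\mu\theta + \tfrac12\,\partial_\mu\theta\,\partial^\mu\theta\right]
   + \left[\partial_\mu\phi\,\partial^\mu\theta
           - \epsilon^{\mu\nu}\partial_\mu\phi\,\partial_\nu\theta
           - \partial_\mu\theta\,A^\mu
           + \epsilon^{\mu\nu}\partial_\mu\theta\,A_\nu
           - \partial_\mu\theta\,\partial^\mu\theta\right]
   + \left[A_\mu\,\partial^\mu\theta + \tfrac12\,\partial_\mu\theta\,\partial^\mu\theta\right] \\
  &= \epsilon^{\mu\nu}\partial_\mu\theta\,A_\nu
     - \epsilon^{\mu\nu}\partial_\mu\phi\,\partial_\nu\theta \\
  &\simeq \dot{\theta}A_1 - \theta' A_0 .
\end{align}
```

With this form the momentum conjugate to $\theta$ is $\Pi_\theta = A_1$, which gives a primary constraint $(\Pi_\theta - A_1) \approx 0$, and the term $-\theta' A_0$ contributes $-\theta'$ to the Gauss law constraint.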

The new GI theory so obtained is defined by the LD : $\mathcal{L}^I = (\mathcal{L}^N + \mathcal{L}^S)$ and is seen to possess a
set of four first-class constraints : $\psi_1 = \Pi_0 \approx 0$, $\psi_2 = (\Pi_\theta - A_1) \approx 0$, $\psi_3 = (E' + \phi' + \Pi + A_1 - \theta') \approx 0$
and $\psi_4 = E \approx 0$. The matrix of PB’s of $\psi_i$ is seen to be singular, implying that the theory is GI.
The theory is indeed invariant under the LVGT [9] : $\delta\phi = -\beta$, $\delta A_0 = \dot{\beta}$, $\delta A_1 = \beta'$, $\delta\theta = -\beta$,
$\delta\Pi = -\beta'$, $\delta\Pi_\theta = \beta'$, $\delta E = \delta\Pi_0 = 0$. The theory could thus be quantized under some appropriate
gauge choice. However, in order to recover the physical contents of the original GNI theory, we
go to a special gauge $\partial_\mu\theta = 0$ (or equivalently, $\dot{\theta} = 0$ and $-\theta' = 0$), and accordingly we choose
the GFC’s : $\zeta_1 = -\theta' \approx 0$ and $\zeta_2 = (-\Pi - \phi' - 2A_1 + A_0) \approx 0$. It is easy to see that $\mathcal{L}^I$, under
the above gauge, reproduces precisely the quantum system described by $\mathcal{L}^N$, so that this gauge
translates the GI version of the theory into the GNI one. In fact, the physical Hilbert spaces of the


Gauge Symmetry in CED 


483 


different than those of the class of models called JR-CSM’s and may be taken as a signature of 
new regularization [11-14]. This theory in contrast with the JR-CSM is seen to possess a self-dual 
boson which could also be thought of as a chiral fermion. 

Yet another important CSM is due to Haxada [15], who has, in particular, constructed a gauged 
Floreanini-Jackiw [16] action which describes a CSM [15-19] in terms of chiral bosonization (as 
explained below) [15]. This model is seen to possess a set of three second-class constraints and 
consequently it describes a GNI or a gauge anomalous theory. An appropriate ST/WZT for 
this GNI-theory in the IF has been calculated in Ref. [18] where the Hamiltonian and BRST 
formulations of the resulting GI (and consequently a gauge non-anomalous theory) have also been 
studied. Now while solving the minimal CSM (the CSM where the right-handed (or equivalently 
the left-handed) fermion is absent), one is faced with a technical difficulty. In order to construct 
an operator solution, one usually makes use of the knowledge of a solution of the bosonized model. 
Because the minimal CSM contains only the left-handed fermion, one is not able to obtain its 
bosonized form by the usual bosonization, and one is led to consider its chiral bosonization [15]. 
Harada [15] has considered the chiral bosonization of the minimal CSM and ha.s obtained a gauged 
action [15] corresponding to the Floreanini-Jackiw action [16], from the conventional bosonic one 
of the CSM, by imposing the chiral constraint (tt — « 0 (cf. Ref. [15])) in phase space. After 

obtaining this equivalent bosonic action, the equations of motion have been solved in the GNI 
formulation. The bosonic solution is found to be completely satisfactory. Harada has further 
constructed [15] a fermionic operator solution of the minimal CSM in covariant gauges in the 
GI formulation [15]. He found a free chiral fermion (self dual chiral boson) and a free massive 
scalar (boson) with the desired mass, as physical asymptotic fields in a positive-definite Hilbert 
space. The existence of a physical free chiral fermion distinguishes the CSM from the (vector) 
Schwinger model [1-5] (which has no physical asymptotic fermion) and implies that the fermion is 
not confined. This minimal CSM is also found to be completely consistent like the usual JR-CSM 
[6-10]. Harada's CSM in the FF [19] is seen to possess a set of three first-class constraints 
and consequently describes a GI theory, in contrast to the IF theory [15,18], which is GNI owing 
to the second-class nature of its set of constraints. The Hamiltonian and BRST 
formulations of this FF theory have been studied in Ref. [19]. In section 6, we consider this 
theory in detail, in the IF as well as in the FF. 


4. The Generalized Schwinger Model 

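The generalized Schwinger model (GSM), which describes QED in (1+1) dimensions with 
massless fermions, is described by the LD [1-5, 7-10] : 

$g_1 = \tfrac{1}{2}(e_L - e_R)$ ; $g_2 = \tfrac{1}{2}(e_L + e_R)$ ; (7b) 

$\gamma_5 = \gamma^0\gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ ; $\gamma^\mu\gamma_5 = -\epsilon^{\mu\nu}\gamma_\nu$ (7c) 

$g^{\mu\nu} := \mathrm{diag}(+1, -1)$ ; $\epsilon^{01} = +1$ ; $\mu, \nu = 0, 1$ (7d) 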

which is equivalent to its bosonized form [1-5,7-10] : 

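$\mathcal{L}_B = \left[\tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + (g_1 g^{\mu\nu} - g_2\epsilon^{\mu\nu})\,\partial_\mu\phi\,A_\nu - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}M^2 A_\mu A^\mu\right]$ (8) 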

In the rest of the article we work with the above conventions. The mass term for $A_\mu$ here 
arises from the regularization ambiguities associated with the definition of the current. The case 
of the CSM is obtained from the GSM by setting $g_1 = g_2 = g$ (i.e. $e_R = 0$) and $M^2 = g^2 a$, where 
$a$ is the JR-regularization parameter. The case of the vector Schwinger model (VSM) is obtained by 
setting $g_1 = 0$, $g_2 = g$ (i.e. $e_L = e_R$) and $M = 0$. Here $e_L = e_R$ implies a vector-like theory in 
this case [1-5]. Also, in the case of the VSM, demanding the regularization to be gauge-invariant fixes 
$a = 0$, i.e. $M = 0$. On the other hand, in the CSM [7-10], no choice of $a$ can make the theory GI and 
therefore $a$ is left as a free parameter. The Hamiltonian and BRST formulations of the vector [1-5] 
and chiral [7-10] theories in the IF have been studied in Refs. [4,9] and in the FF in Refs. [5,10]. 
The VSM in the bosonized form is described by the LD [1-5] : 

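$\mathcal{L} := \left[\tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - g\,\epsilon^{\mu\nu}\partial_\mu\phi\,A_\nu - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}\right]$ ; $\mu, \nu = 0, 1$ (9) 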





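$\delta u = \partial_0\partial_0\beta$, $\delta v = -\partial_0\beta$ ; $\delta\Pi = \delta\Pi_0 = \delta E = \delta\Pi_\theta = \delta\Pi_u = \delta\Pi_v = 0$, where $\beta \equiv \beta(x, t)$ is 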
an arbitrary function of the coordinates. This GI theory under the gauge : Ci = —did w 0 ; 
C2 = p, + Ai) - (n + n«) + e(n + di<i>)] 0, reproduces precisely the quantum system described 

by (14). The Hamiltonian and BRST quantizations of this GI theory have been studied in the 
IF in Ref. [14]. Further, the original theory (14) in the FF is described by the LD [13] : 

$\mathcal{L} := \left[(\partial_+\phi)(\partial_-\phi) + 2eA^+(\partial_+\phi) + \tfrac{1}{2}(\partial_+A^+ - \partial_-A^-)^2 - e^2(A^-)^2 + 2e^2A^+A^-\right]$ ; $\mu, \nu = 0, 1$. (16) 

The above theory is seen to possess a set of three second-class constraints $\rho_i$ : $\rho_1 = \Pi^+ \approx 0$ ; 
$\rho_2 = (\Pi - \partial_-\phi - 2eA^+) \approx 0$ ; $\rho_3 = [\partial_-\Pi^- + 2e^2(A^- - A^+)] \approx 0$, and is therefore GNI. 
Corresponding to this GNI theory a GI theory could be constructed by the inclusion of an 
appropriate ST/WZT [13] : 

£^ = [(1 ^ 2e -h 2e^){d^e){d^e) - (1 - 2e)id+(t>){d-e) - {d^e)[d^<}>) 

+2e(e - l)A+(a+^) - ~ 2e^A-{d^e - d-B)] (17) 

The resulting GI theory is seen to possess, at the classical level, a LVGS, and it is also seen to 
possess a set of three first-class constraints : $\psi_1 := \Pi^+ \approx 0$ ; 
$\psi_2 := [\Pi - \partial_-\phi - 2eA^+ + (1 - 2e)\partial_-\theta] \approx 0$ ; and 
$\psi_3 := [\partial_-\Pi^- + \Pi_\theta - (1 - 2e)(\partial_-\theta) + \partial_-\phi + 2eA^+] \approx 0$, where $\psi_1$ and $\psi_2$ are the primary 
constraints and $\psi_3$ is the secondary constraint, implying that the theory is GI. It is indeed seen to 
be invariant under the LVGT [13] : $\delta A^+ = \partial_-\beta$, $\delta A^- = \partial_+\beta$, $\delta\phi = -\beta$, $\delta\theta = -\beta$, $\delta u = \partial_+\partial_+\beta$, 
$\delta v = -\partial_+\beta$ ; $\delta\Pi^+ = \delta\Pi^- = \delta\Pi = \delta\Pi_\theta = \delta\Pi_u = \delta\Pi_v = 0$, where $\beta \equiv \beta(x^+, x^-)$ is an 
arbitrary function of its arguments. This GI theory under the gauge [13] : $\zeta_1 = -(\partial_-\theta) \approx 0$ ; 
$\zeta_2 = [\Pi_\theta + \partial_-\phi - 2e(e - 1)A^+ + 2e^2A^-] \approx 0$, reproduces precisely the quantum GNI system 
described by (16). The Hamiltonian and BRST formulations of this theory have been investigated 
in Ref. [13] under some specific gauge choices, where this GI theory has been constructed through 
the ST/WZT given by (17). 


6. The CSM Due to Harada 

In this section we consider a CSM in terms of chiral bosonization constructed by Harada [15] by 
gauging the Floreanini-Jackiw action [16] and described in the IF by the LD [15,18] : 

$\mathcal{L} = \left[(\partial_0\phi - \partial_1\phi)(\partial_1\phi) + 2e(A_0 - A_1)(\partial_1\phi) - \tfrac{1}{2}e^2(A_0 - A_1)^2 - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}ae^2A_\mu A^\mu\right]$ ; $\mu, \nu = 0, 1$. (18) 

This theory is seen to possess a set of three second-class constraints : $\Omega_1 = \Pi_0 \approx 0$ ; 
$\Omega_2 = (\Pi - \phi') \approx 0$ ; $\Omega_3 = [E' + 2e\phi' + e^2\{(a - 1)A_0 + A_1\}] \approx 0$, where $\Omega_1$ and $\Omega_2$ are the primary 
constraints and $\Omega_3$ is a secondary constraint. The above set of constraints is second-class and 
consequently the theory is GNI. A GI theory corresponding to this GNI theory is obtained by the 
addition of an appropriate ST/WZT [18] : 

£S = [{eff- e<l>- ^ + ( 2 ^- 6^) + 2 e (0 - 6) 

-2e{Ao “ Ai)ff-^ |e2(0 ~ 0)2 ^ e^A^ - Ai)(0 - ct>) 

+|a€2(02 + 2 Ao 0 “ 2Ai0'- 0^)] (19) 

The resulting Gl-theory is seen to possess a set of three first-class constraints [18]: 0 

; ^2 = [n ~ ^' 4 - 0] « 0 ; = [FT-h Hi? - 0) « 0. The matrix of PB’s of 'tpi is seen to be singular 

implying that the theory is GI and it is indeed seen to be invariant under the LVGT [18] :S<j> = -/?, 
SAo = /?, 5Ai = 56 = —13, 5n = ^Ho = 5E = 511^ = 0. Further, this GI theory under the gauge 

: Cl = —ff = 0 ; C 2 = [Ho ” ( 2 e - l)<l> + e^{Ao -Ai) — ae^Ao] 0 ; reproduces precisely the physical 
contents of the original quantum GNI theory described by (18) [18]. The Hamiltonian and BRST 
formulations of the GI version of Harada's CSM obtained by the inclusion of the appropriate 
ST/WZT have been studied in the IF in Ref. [18]. 

Harada's CSM in the FF reads [19] : 

$\mathcal{L} = \left[(\partial_-\phi)(\partial_+\phi - \partial_-\phi) + 2eA^+(\partial_+\phi - \partial_-\phi) - e^2(A^+)^2 + \tfrac{1}{2}(\partial_+A^+ - \partial_-A^-)^2 + ae^2A^+A^-\right]$ (20) 

The VGC for (20) is seen to be conserved, and the theory is seen to possess (at the classical 
level) a LVGS [19]. Also, it is seen to possess a set of three first-class constraints : $\chi_1 := \Pi^+ \approx 0$ ; 
$\chi_2 := (\Pi - \partial_-\phi - 2eA^+) \approx 0$ ; $\chi_3 := (\partial_-\Pi^- + ae^2A^+) \approx 0$, where $\chi_1$ and $\chi_2$ are the 
primary constraints and $\chi_3$ is a secondary constraint. The matrix of the PB's of the constraints 
$\chi_i$ is singular, implying that the set of constraints $\chi_i$ is first-class and that the theory is GI. The 
theory is indeed seen to be invariant under the LVGT [19] : $\delta\phi = -e\beta$, $\delta A^+ = \partial_-\beta$, $\delta A^- = \partial_+\beta$, 
$\delta u = \partial_+\partial_+\beta$, $\delta v = -e\,\partial_+\beta$ ; $\delta\Pi = e\,\partial_-\beta$, $\delta\Pi^+ = \delta\Pi^- = \delta\Pi_u = \delta\Pi_v = 0$, where $\beta \equiv \beta(x^+, x^-)$ 
is an arbitrary function of its arguments. The Hamiltonian and BRST formulations of this theory 
have been studied under some specific gauge choices in Ref. [19]. 


7. The Chiral Bosons 

Finally, let us briefly discuss two examples of self-dual fields, called chiral bosons, which 
represent the chiral Fermi theories. These fields are basic building blocks of the larger theories 
in CED and in string theory. The first example that we consider is that of the Srivastava 
chiral bosons [20], which are single self-dual fields and are described by the LD [20-22] : 

-h g^^)d^(l)] (21) 

The above theory is seen to possess a set of two second-class constraints : $\Omega_1 = p_\lambda \approx 0$ ; and 
$\Omega_2 = (\Pi - \phi' - \lambda) \approx 0$ ; where $\lambda = (\lambda_0 + \lambda_1)$ ; and consequently the theory is GNI. Corresponding 
to this GNI theory a GI theory could be obtained by the addition of an appropriate ST/WZT [21]: 

= [-102 _ . ^22) 
to $\mathcal{L}$. The resulting GI theory is seen to possess a set of two first-class constraints : $\psi_1 = 
p_\lambda \approx 0$ ; and $\psi_2 = (\Pi + \Pi_\theta) \approx 0$ ; and it is indeed seen to be invariant under the LVGT [21] 

: 50 = ±p, 56 = ±P, 5A = ^P, 511 = 5110 = Spx « 0. Now this GI theory under the gauge : 

= —ff 0 ; C 2 = (n^ — O' + + X) Pi 0; reproduces precisely the original GNI quantum system. 

The other example is that of the so-called Siegel chiral bosons which are doubly self-dual and are 
described by the LD [23-26] : 

$\mathcal{L} = \left[\tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}\phi'^2 + \lambda(\dot\phi - \phi')^2\right]$ (23) 

This theory is seen to possess one primary constraint $\Omega_1 = p_\lambda \approx 0$, and one secondary 
constraint $\tilde\Omega_2 = (\Pi - \phi')^2 \approx 0$, which is classically equivalent to $\Omega_2 = (\Pi - \phi') \approx 0$. The 
matrix of the PB's of $\Omega_1$ and $\Omega_2$ is singular, implying that the set of constraints $\Omega_1$ and $\Omega_2$ 
is first-class and that the theory is GI. The theory is indeed seen to be invariant under the so-called 
Siegel gauge transformations (which are LVGT) [23-26] : $\delta\phi = \beta(\dot\phi - \phi')$ ; $\delta p_\lambda = 0$ ; 
$\delta\lambda = [-\tfrac{1}{2}(\dot\beta + \beta') + \beta(\dot\lambda - \lambda') - \lambda(\dot\beta - \beta')]$ ; 
$\delta\Pi = \beta[(1 + 2\lambda)(\ddot\phi - \dot\phi') - 2\lambda(\dot\phi' - \phi'') + 2(\dot\phi - \phi')(\dot\lambda - \lambda')] - \beta'(\dot\phi - \phi')$, 
where the gauge parameter $\beta \equiv \beta(x, t)$ is an arbitrary function of its arguments, and $\lambda$ can be 
made equal to any given function of $x$ and $t$ by an appropriate choice of $\beta$. The Hamiltonian 
formulation of the theory has been studied in Ref. [25] and its BRST formulation in Ref. [26], 
under some specific gauge choices. 
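
As a check on the statement that the Siegel theory is GI, the invariance of the action can be verified directly. The sketch below assumes the Siegel Lagrangian in the standard form $\tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}\phi'^2 + \lambda(\dot\phi - \phi')^2$ and introduces the shorthand $D = \partial_0 - \partial_1$, $\bar D = \partial_0 + \partial_1$ (notation chosen here for illustration, not used elsewhere in the article); $\simeq$ denotes equality up to total derivatives.

```latex
\begin{align*}
\mathcal{L} &= \tfrac{1}{2}\,\bar D\phi\, D\phi + \lambda\,(D\phi)^2 ,
\qquad \delta\phi = \beta\, D\phi ,\\[2pt]
\delta\!\left(\tfrac{1}{2}\,\bar D\phi\, D\phi\right)
  &= \tfrac{1}{2}\,\bar D(\beta D\phi)\, D\phi + \tfrac{1}{2}\,\bar D\phi\, D(\beta D\phi)
  \;\simeq\; \tfrac{1}{2}\,(\bar D\beta)\,(D\phi)^2 ,\\[2pt]
2\lambda\, D\phi\, D(\beta D\phi)
  &= 2\lambda\,(D\beta)(D\phi)^2 + \lambda\beta\, D\big[(D\phi)^2\big]
  \;\simeq\; \big(\lambda\, D\beta - \beta\, D\lambda\big)(D\phi)^2 ,\\[2pt]
\delta\mathcal{L} &\simeq \Big[\tfrac{1}{2}\bar D\beta + \lambda D\beta - \beta D\lambda + \delta\lambda\Big](D\phi)^2 = 0
\;\;\Longleftrightarrow\;\;
\delta\lambda = -\tfrac{1}{2}(\dot\beta + \beta') + \beta(\dot\lambda - \lambda') - \lambda(\dot\beta - \beta') .
\end{align*}
```

The variation of $\lambda$ is thus fixed by requiring that the coefficient of $(\dot\phi - \phi')^2$ vanish, which is the form of the Siegel transformation quoted in Refs. [23-26].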


8. Summary and Discussions 

In this article, we have studied the concept of the LVGS as interlinked with the constrained 
dynamics of the various field theory models describing the 2D-CED. The constrained dynamics of 
some specific GSM’s describing the CED have been considered in the IF as well as in the FF of 
dynamics in details. The chiral bosons representing the chiral Fermi theories have also been briefly 
considered towards the end. 

The author thanks Dr. Usha Kulshreshtha for her long-time collaboration on this work and for 
several useful discussions. 


References 

[1] J.Schwinger, Phys. Rev., 128, 2425 (1962). 

[2] See e.g., K.R.Ito, Prog. Theor. Phys., 53, 817 (1975); N.Nakanishi, Prog. Theor. Phys., 57, 
580 (1977); K.D.Rothe and J.A.Swieca, Phys. Rev., D15, 541 (1977); J.H.Lowenstein and 
J.A.Swieca, Ann. Phys., 68, 172 (1971); S.Coleman, R.Jackiw and L.Susskind, Ann. Phys., 
93, 267 (1975); S.Coleman, Ann. Phys., 101, 239 (1976); C.Wotzasek, Acta Phys. Pol., B21, 
457 (1990); J.McCabe, Phys. Lett., 257B, 145 (1991). 

[3] A.Casher, J.Kogut and L.Susskind, Phys. Rev. Lett., 31, 792 (1973); Phys. Rev., D10, 732 
(1974); J.Kogut and L.Susskind, Phys. Rev., D11, 3594 (1975); ibid, D13, 337 (1976). 

[4] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.W.Mueller-Kirsten, Helv. Phys. Acta, 66, 737 
(1993); Usha Kulshreshtha, D.S.Kulshreshtha and H.J.W.Mueller-Kirsten, “Hamiltonian 
and BRST formulation of some chiral Field Theories in one-space one-time Dimension”, 
Proceedings of II workshop on “Constraints Theory and Quantization Methods”, Montepulciano 
(Siena) Italy, 1993, World Scientific, Singapore, 305-327 (1994). 

[5] Usha Kulshreshtha and D.S.Kulshreshtha, Intl. J. Theor. Phys., 37, 2539 (1998). 

[6] C.Hagen, Ann. Phys. (N.Y.), 81, 65 (1973). 

[7] R.Jackiw and R.Rajaraman, Phys. Rev. Lett., 54, 1219 (1985); 54, 2060 (E) (1985); 
R.Rajaraman, Phys. Lett., 154B, 305 (1985); ibid 184B, 369 (1987). 

[8] H.O.Girotti, H.J.Rothe and K.D.Rothe, Phys. Rev., D33, 514 (1986); N.K.Falck, 
G.Kramer, Ann. Phys., 176, 330 (1987); P.Mitra and R.Rajaraman, Phys. Lett., 225B, 
267 (1989); Ann. Phys., 203, 157 (1990); D.Boyanovsky, Nucl. Phys., B294, 223 (1987); 
R.P.Malik, Phys. Lett., 212B, 445 (1988); P.P.Srivastava, Phys. Lett., 235B, 287 (1990); 
D.Boyanovsky, I.Schmidt and M.F.L.Golterman, Ann. Phys. (N.Y.), 185, 111 (1988). 

[9] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.W.Mueller-Kirsten, Can. J. Phys., 72, 639 (1994); 
Il Nuovo Cim., A107, 569 (1994). 

[10] Usha Kulshreshtha and D.S.Kulshreshtha, Can. J. Phys., 77, (1999)-in press. 

[11] P.Mitra, Phys. Lett., 284B, 23 (1992). 

[12] S.Mukhopadhyay and P.Mitra, Zeit. f. Phys., C67, 525 (1995); S.Mukhopadhyay and P.Mitra, 
Ann. Phys. (N.Y.), 241, 68 (1995). 

[13] Usha Kulshreshtha, Helv. Phys. Acta 71, 353 (1998). 

[14] Usha Kulshreshtha, “A Gauge Invariant Chiral Schwinger Model with the Faddeevian Regu- 
larization: The Stueckelberg Term, Hamiltonian and BRST Formulations”, Delhi University, 
Preprint-1999. 

[15] K.Harada, Phys. Rev. Lett., 64, 139 (1990); Phys. Rev., D42, 4170 (1990). 

[16] R.Floreanini, R.Jackiw, Phys. Rev. Lett., 59, 1873 (1987). 

[17] J.K.Kim,W.T.Kim,W.H.Kye, Phys. Rev., D42, 4170 (1990); Phys. Rev. D45, 717 (E) (1992); 
W.H.Kye, W.T.Kim, J.K.Kim, Phys. Lett., B268, 59 (1991). 

[18] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.Mueller-Kirsten, Zeit. f. Phys., C64, 169 
(1994). 

[19] Usha Kulshreshtha, Canad. J. Phys. 78, (2000)-in press. 

[20] P.P.Srivastava, Phys. Rev. Lett., 63, 2791 (1989); Phys. Lett., 234B, 93 (1990). 

[21] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.Mueller-Kirsten, Zeit. f. Phys., C60, 427 
(1993). 

[22] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.W.Mueller-Kirsten, Phys. Rev., D45, R393 
(1992). 

[23] W.Siegel, Nucl. Phys., B238, 307 (1984). 

[24] J.M.F.Labastida and M.Pernici, Phys. Rev. Lett., 59, 2511 (1987); Nucl. Phys., B297, 557 
(1988); S.Bellucci, M.F.L.Golterman and D.N.Petcher, Nucl. Phys., B326, 307 (1989). 

[25] Usha Kulshreshtha, D.S.Kulshreshtha and H.J.W.Mueller-Kirsten, Phys. Rev., D47, 4634 (1993). 

[26] Usha Kulshreshtha and D.S.Kulshreshtha, Intl. J. Theor. Phys., 38, 1399 (1999). 

[27] V.Rubakov, Nucl. Phys., B203, 311(1982); C.G.Callan, Phys. Rev., D25, 2141 (1982). 

[28] J.L.Cardy, “Phase Transitions and Critical Phenomena”, Vol. 11, Academic Press, London 
(1985). 

[29] M.B.Green, J.H.Schwarz and E.Witten, “Superstring Theory”, Vol. 1 & 2, Cambridge 
University Press, Cambridge (1987). 

[30] C.Itzykson and J.Zuber, “Quantum Field Theory”, Intl. Series in Pure and Applied Physics, 
McGraw-Hill (1980). 

[31] M.Henneaux and C.Teitelboim, “Quantization of Gauge Systems”, Princeton University Press, 
Princeton, New Jersey (1992). 

[32] P.A.M.Dirac, Can. J. Math., 2, 129 (1950); “Lectures on Quantum Mechanics”, Belfer 
Graduate School of Science, Yeshiva University, New York, 1964. 

[33] C.Becchi, A.Rouet and R.Stora, Phys. Lett., 52B, 344 (1974); V.Tyutin, Lebedev Report No. 
FIAN-39 (1975); D.Nemeschansky, C.Preitschopf and M.Weinstein, Ann. Phys. (N.Y.), 
183, 226 (1988); D.M.Gitman, “Quantization of Fields with Constraints”, Springer-Verlag 
(1990). 

[34] P.A.M.Dirac, Rev. of Mod. Phys., 21, 392 (1949). 

[35] E.C.G.Stueckelberg, Helv. Phys. Acta, 14, 52 (1941); Helv. Phys. Acta, 30, 209 (1957); 
J.Wess and B.Zumino, Phys. Lett., 37B, 95 (1971); E.Witten, Nucl. Phys., B223, 422 
(1983); L.D.Faddeev, Phys. Lett., 145B, 81 (1984); L.D.Faddeev and S.L.Shatashvili, 
Phys. Lett., 167B, 225 (1986); P.Mitra and R.Rajaraman, Phys. Lett., 225B, 267 (1989); 
Ann. Phys. (N.Y.), 203, 157 (1990). 



19. Towards a Unified Description of the Four 
Interactions in Terms of Dirac-Bergmann 

Observables. 

Luca Lusanna * 

Sezione INFN di Firenze, 

Largo E.Fermi 2, 50125 Firenze, Italy 


Abstract 

A review is given of the status and developments of the research program aiming to 
reformulate the physics of the four interactions at the classical level in a unified way in terms of 
Dirac-Bergmann observables with special emphasis on the open mathematical, physical and 
interpretational problems. 


1 Introduction 

At the classical level the accepted mathematical description of the four interactions at the basis 
of our understanding of nature (gravitational, electromagnetic, weak and strong; without or with 
the not yet experimentally verified supersymmetry between half-integer and integer spin fields, 
i.e. between fermions and bosons) , is based on action principles which, due to manifest Lorentz 
invariance, to local gauge invariance (minimal coupling) and/or diffeomorphism invariances, make 
use of singular Lagrangians implying the Dirac-Bergmann theory of constraints [1, 2] for their 
Hamiltonian formulation. While behind the gauge freedom of gauge theories proper there are Lie 
groups acting on some internal space so that the measurable quantities must be gauge invariant, 
the gauge freedom of theories invariant under diffeomorphism groups of the underlying spacetime 
(general relativity, string theory and reparametrization invariant systems of relativistic particles) 
concerns the arbitrariness for the observer in the choice of the definition of “what is space and/or 
time” (and relative times in the case of particles), i.e. of the definitory properties either of spacetime 
itself or of the measuring apparatuses. This is the classical mathematical background on which 
our understanding of the quantum field theory of electromagnetic, weak and strong interactions 
in the modern BRS formulation is based. The same is true for our attempts to build quantum 
gravity notwithstanding our actual incapacity to reconcile the influence of gravitational physics on 
the existence and formulation of spacetime concepts with the basic ideas of quantum theory, which 
requires a given absolute background spacetime. 

Current research on electromagnetic, weak and strong interactions in special relativity, namely 
in Minkowski spacetime, has partially bypassed the problem by the covariant approach based on 
the BRS symmetry which, at least at the level of the algebra of infinitesimal gauge transformations, 
allows a regularization and renormalization of the relevant theories inside the framework of local 
quantum field theory (see for instance Ref.[3]). However, problems like the understanding of finite 
gauge transformations and of the associated moduli spaces, the Gribov ambiguity dependence on 
the choice of the function space for the fields and the gauge transformations, the confinement of 
quarks, the definition of relativistic bound states and how to put them among the asymptotic states, 
the nonlocality of charged states in quantum electrodynamics, not to speak of the foundational 
and practical problems posed by gravity, suggest that we should revisit the foundations of our 


* Email: lusanna@fi.infn.it 





theories. It is not yet known whether we can identify the physical degrees of freedom 
hidden behind manifest gauge and/or general covariance, and whether we can first meaningfully 
reformulate classical physics in terms of them and then quantize the resulting theories. 
This will require abandoning local field theory at the nonperturbative level and understanding 
how to regularize and renormalize the Coulomb gauge of electrodynamics, to start with. Moreover, 
the special relativistic theories will have to be reformulated in such a way as to allow a natural 
transition to the coupling to gravity. Even if gravitational contributions are usually ignored because 
they are too weak with respect to the other interactions, the existing solution to the ultraviolet 
divergences of quantum field theory is distributional, so that, at least at the mathematical level, 
it is not justified to ignore gravity with all its nonlinearities. In turn, general relativity must be 
formulated in a way that allows its deparametrization, so as to recover physics in Minkowski 
spacetime when the Newton constant is set equal to zero. One also needs a formulation in which some 
notion of elementary particle exists, so as to recover Wigner's definition based on the irreducible 
representations of the Poincaré group in Minkowski spacetime, with the further enrichment of the 
known good quantum numbers for their classification. Moreover, one needs some way out of the 
“problem of time” [4, 5, 6], since neither a consistent way to quantize time (is it a necessity?), or 
more generally any timelike variable, nor a control on the associated problem of the relative times of 
a system of relativistic particles is known. Finally, one has to find a solution to the more basic 
problem of how to identify physically the spacetime points in Einstein's formulation of general 
relativity, where general covariance deprives the mathematical points of the underlying 4-manifold 
of any physical reality [7, 8], while, on the experimental side (space physics, gravitational wave 
detectors), we are employing a theory of measurements of proper times and spacelike lengths which 
presupposes the individuation of points. This problem will also appear in today's most popular 
program for the unification of all the interactions in a supersymmetric way, i.e. in superstring 
theory and in its sought-after M-theory extension (see for instance Ref. [9]; string theory will not be 
touched upon in this review), once someone is able to reformulate it in a background-independent way. 

These motivations induced me to revisit the classical Hamiltonian formulation of theories 
described by singular Lagrangians, trying to choose the mathematical frameworks which at each step 
looked most natural for clarifying the physical interpretational problems by means of suitable 
adapted coordinates. In particular, after many years of dominance of the point of view privileging 
manifest Lorentz, gauge and general covariance at the price of losing control of the physical 
degrees of freedom and of their deterministic evolution (felt to be an unnecessary luxury and only a 
source of difficulties and complications), I went back to the old concept of Dirac observables, namely 
those gauge invariant deterministic variables which describe a canonical basis of measurable quantities 
for the electromagnetic, weak and strong interactions in Minkowski spacetime. In general 
relativity, instead, due to the problem of the individuation of the points of spacetime, measurable 
quantities have a more complex identification, which coincides with Dirac's observables (in any case 
indispensable for the treatment of the Cauchy problem) only in a completely fixed gauge (total 
breaking of general covariance). 

In the next Sections I will review the various achievements of the program at the present stage 
of development (see Refs.[10] for previous reviews). Since there is too vast a bibliography to be 
covered in this review, I made the choice to concentrate it on my point of view omitting to quote 
many aspects of the theory and the work of many researchers. 


1 Singular Lagrangians, Presymplectic Geometry, the Shanmugadhasan 
Canonical Transformations and Generalized 
Coulomb Gauges in Minkowski Spacetime. 

A) If a finite-dimensional system with configuration space $Q$ [$q^i$, $i = 1, .., N$, are local coordinates in 
a global (assumed to exist for the sake of simplicity) chart of the atlas of $Q$; $(t, q^i)$ is a point 
in $R \times Q$, where $R$ is the time axis; $\dot q^i(t) = dq^i(t)/dt$] is described by a singular Lagrangian $L$ [so 
that the Hessian matrix is degenerate: $\det(\partial^2 L/\partial\dot q^i\partial\dot q^j) = 0$], its Euler-Lagrange equations are in 
general a mixture of equations i) depending only on the $q^i$ (holonomic constraints); ii) depending 
only on $q^i$ and $\dot q^i$ (Lagrangian, in general nonholonomic, constraints and/or intrinsic first order 
equations of motion violating the so-called second order differential equation (SODE) conditions); 
iii) depending on $q^i$, $\dot q^i$, $\ddot q^i$ (genuine second order equations of motion, which however cannot be 
put in normal form, i.e. solved in the $\ddot q^i$). More equations of the types i) and ii) can be deduced 
from the Euler-Lagrange equations and their time derivatives. The study of this type of degenerate 
equations can be traced back to Levi-Civita [11]. The solutions of the Euler-Lagrange equations 
depend on arbitrary functions of time, namely they are not deterministic. 
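
As a minimal illustration of these statements, consider a toy model with two degrees of freedom $x$, $y$ (chosen here only for concreteness; it is not taken from the references). Its Hessian is degenerate, one Euler-Lagrange equation contains no second derivatives, and the general solution contains an arbitrary function of time:

```latex
\begin{align*}
L &= \tfrac{1}{2}(\dot x - y)^2 , \qquad
\left(\frac{\partial^2 L}{\partial \dot q^i\,\partial \dot q^j}\right)
  = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \det = 0 ,\\[2pt]
\text{E-L equation for } x: &\quad \frac{d}{dt}\,(\dot x - y) = 0 ,\\
\text{E-L equation for } y: &\quad \dot x - y = 0 \qquad \text{(type ii: only } q^i,\ \dot q^i \text{ appear)} ,\\[2pt]
\text{solutions}: &\quad y(t)\ \text{arbitrary}, \qquad x(t) = x(0) + \int_0^t y(s)\,ds .
\end{align*}
```

The first equation is a consequence of the second, so nothing determines $y(t)$: it plays the role of a gauge variable.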

The canonical momenta $p_i = \partial L/\partial\dot q^i$ are not independent: there are relations among them, 
$\phi_\alpha(q, p) \approx 0$, called primary Hamiltonian constraints, which define a submanifold $\gamma$ of the cotangent 
space $T^*Q$ [the model is defined only on this submanifold; one uses the Poisson brackets of $T^*Q$ in a 
neighbourhood of $\gamma$, and Dirac's weak equality $\approx$ means that the equality sign cannot be used inside 
Poisson brackets]. The canonical Hamiltonian $H_c(q, p)$ has to be replaced by the Dirac Hamiltonian 
$H_D = H_c + \sum_\alpha \lambda_\alpha(t)\phi_\alpha$, which knows the restriction to the submanifold $\gamma$ due to the arbitrary 
Dirac multipliers $\lambda_\alpha(t)$. The time constancy of the primary constraints, $\partial_t\phi_\alpha = \{\phi_\alpha, H_D\} \approx 0$, 
either produces secondary Hamiltonian constraints or determines some of the Dirac multipliers. 
This procedure is repeated for the secondary constraints (this is the Dirac-Bergmann algorithm) 
and so on. At the end there is a final set of constraints $\chi_a \approx 0$ defining the final submanifold $\gamma$ of 
$T^*Q$ on which the dynamics is consistently restricted, and a final Dirac Hamiltonian with a reduced 
set of arbitrary Dirac multipliers describing the remaining indetermination of the time evolution. 
The constraints are divided into two subgroups: i) the first class ones $\chi^{(1)}_m \approx 0$, having weakly zero 
Poisson bracket with all constraints and being the generators of the gauge transformations of the 
theory (the associated vector fields $\{., \chi^{(1)}_m\}$ are tangent to $\gamma$); ii) the second class ones $\chi^{(2)}_n \approx 0$ 
(their number is even) with $\det(\{\chi^{(2)}_{n_1}, \chi^{(2)}_{n_2}\}) \neq 0$, corresponding to pairs of inessential eliminable 
variables (the associated vector fields are normal to $\gamma$). The solutions of the Hamilton-Dirac 
equations with the final Dirac Hamiltonian depend on as many arbitrary functions of time as there are 
remaining Dirac multipliers. The restriction of the symplectic 2-form of $T^*Q$ to $\gamma$ is a closed degenerate 
2-form, which in the case of only first class constraints generates a so-called presymplectic geometry: 
$\gamma$ is said to be a presymplectic manifold coisotropically embedded in $T^*Q$ [see Refs. [12, 13] for 
what is known on presymplectic structures (they are dual to Poisson structures, but much less 
studied, not being connected with integrable systems) and on the more general ones when second 
class constraints are present]. When many mathematical conditions are satisfied, the vector fields 
associated with the first class constraints (they are in the kernel of the degenerate 2-form on $\gamma$) 
generate a foliation of the submanifold $\gamma$: each leaf (Hamiltonian gauge orbit) contains all the 
configurations which are gauge equivalent and which have to be considered as the same physical 
configuration [1] (equivalence class of gauge equivalent configurations); the canonical Hamiltonian 
$H_c$ (if it is not $H_c \approx 0$) generates an evolution which maps one leaf into the others. Therefore, the 
physical reduced phase space is obtained: i) by eliminating as many pairs of conjugate variables 
as second class constraints by means of the so-called associated Dirac brackets; ii) by going to the 
quotient with respect to the foliation (a representative of the reduced phase space can be built 
by adding as many gauge-fixing constraints as first class ones, so as to obtain a set of second class 
constraints). In general this procedure breaks the original Lorentz invariance. 
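
The counting of first and second class constraints described above can be sketched numerically. The snippet below is only an illustration under a simplifying assumption: the matrix of Poisson brackets among the final constraints, evaluated on the constraint surface, is supplied as a constant matrix of numbers. The function names `matrix_rank` and `count_constraint_classes` are hypothetical helpers introduced here. The number of second class constraints is the rank of this antisymmetric matrix (hence even), and the remaining combinations are first class.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix, by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a pivot in this column among the not-yet-used rows.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def count_constraint_classes(bracket_matrix):
    """Return (number of first class, number of second class) constraints.

    bracket_matrix[a][b] is assumed to hold the Poisson bracket
    {chi_a, chi_b} evaluated on the constraint surface.
    """
    n_total = len(bracket_matrix)
    n_second = matrix_rank(bracket_matrix)
    return n_total - n_second, n_second


# Toy example (an assumption for illustration): chi_1 = p_1, chi_2 = q_1, chi_3 = p_2,
# with {q_1, p_1} = 1 and all other brackets among the constraints vanishing.
toy_brackets = [
    [0, -1, 0],  # {p_1, p_1}, {p_1, q_1}, {p_1, p_2}
    [1, 0, 0],   # {q_1, p_1}, {q_1, q_1}, {q_1, p_2}
    [0, 0, 0],   # {p_2, p_1}, {p_2, q_1}, {p_2, p_2}
]
first_class, second_class = count_constraint_classes(toy_brackets)
```

In the toy example the pair $(q_1, p_1)$ is second class and can be eliminated with Dirac brackets, while $p_2$ is first class and generates a gauge transformation of $q_2$. In a realistic field theory the brackets are field dependent distributions, so this constant-matrix count is only a local, finite-dimensional caricature.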

Let us remark that only the primary first class constraints are associated with arbitrary Dirac 
multipliers. The secondary, tertiary, ... first class constraints are, in general, present in the canonical 
Hamiltonian $H_c$ multiplied by well defined functions of $q^i$, $p_i$, which turn out to be arbitrary 
because they are not determined by the Hamilton-Dirac equations (they are gauge variables). This 
contradicts the Dirac conjecture [1] that the secondary first class constraints can be added to the 
Dirac Hamiltonian with extra multipliers (the resulting extended Dirac Hamiltonian would not 
allow the reconstruction of the original singular Lagrangian by inverse Legendre transformation; 
since the difference in the dynamics is only off-shell, this explains why the extended Hamiltonian 
is used in the BFV approach [14]). The natural way to add gauge-fixing constraints when there are 
secondary first class constraints [15] is to start by giving the gauge fixings to the secondary constraints. 
The requirement of time constancy of these gauge fixings will generate the gauge fixings for the 
primary first class constraints, and the time constancy of these new gauge fixings will determine 
the Dirac multipliers, eliminating every residual gauge freedom. 

The Dirac observables are the gauge invariant functions on the reduced phase space, on which 
there is a deterministic evolution generated by the projection of the canonical Hamiltonian. 
Therefore, the main problem is to find a (possibly global) Darboux coordinate chart of the reduced phase 
space, namely a canonical basis of Dirac observables (or at least a Poisson algebra of them, 
according to Ref. [16]). 

One would expect that when this is not possible, the relativistic system is intrinsically ill defined 
already at the classical level: at the quantum level this should manifest itself in the presence of 
incurable anomalies (which can also be present for a classically well defined system). Since the 
mathematical theory of the anomalies relies on cohomological properties of the manifolds (like $Q$ 
and $\gamma$) relevant to the description of the system, which have to be defined already at the classical 
level, one expects that a classical background of these properties, in the form of obstructions to 
the determination of the observables, should be present in the theory of classical gauge canonical 
transformations. 

When there is reparametrization invariance of the original action $S = \int dt\, L$, the canonical 
Hamiltonian vanishes and the reduced phase space is said to be frozen (as happens in 
Hamilton-Jacobi theory). When the canonical Hamiltonian vanishes, both kinematics and dynamics are 
contained in the first class constraints describing the system: these can be interpreted as generalized 
Hamilton-Jacobi equations [17], so that the Dirac observables turn out to be the Jacobi data. When 
there is a kinematical symmetry group, like the Galileo or Poincaré groups, an evolution may be 
reintroduced by using the energy generator as Hamiltonian. 
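
The standard textbook example of this situation is the free relativistic particle, sketched below under the assumption of metric signature $(+,-,-,-)$ and an arbitrary worldline parameter $\tau$ (a dot denotes $d/d\tau$):

```latex
\begin{align*}
S &= \int d\tau\, L , \qquad L = -m\sqrt{\dot x^\mu \dot x_\mu} ,\\[2pt]
p_\mu &= \frac{\partial L}{\partial \dot x^\mu} = -\,\frac{m\,\dot x_\mu}{\sqrt{\dot x^2}}
\quad\Longrightarrow\quad \chi = p^2 - m^2 \approx 0 \quad \text{(primary, first class)} ,\\[2pt]
H_c &= p_\mu \dot x^\mu - L = -m\sqrt{\dot x^2} + m\sqrt{\dot x^2} = 0 ,
\qquad H_D = \lambda(\tau)\,(p^2 - m^2) .
\end{align*}
```

The canonical Hamiltonian vanishes identically, and the whole dynamics is contained in the mass-shell constraint; the arbitrary multiplier $\lambda(\tau)$ reflects the freedom in the choice of the parameter $\tau$.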

In a series of papers [18, 19, 20, 21, 22, 23] I reformulated the general theory of 
singular Lagrangians and Hamiltonian constraints on the basis of an extension of the second Noether 
theorem [24] that also includes second class constraints. By means of the resulting Noether identities 
the Dirac-Bergmann algorithm was reproduced at the Lagrangian level. All the obscure and/or 
ambiguous points of the theory were clarified. The understanding [19] of the pathological examples 
known in the literature led to the discovery of third- and fourth-class constraints [with their 
associated singularities of the Jacobi equations (linearization of the Euler-Lagrange equations) 
and their connection with the rejection of the Dirac conjecture about adding the secondary first 
class constraints to the Dirac Hamiltonian with extra Dirac multipliers] and of the phenomena of 
proliferation of constraints, ramification and joining of chains of constraints. The classification 
of all possible patterns of second class constraints was also given [23]. All these phenomena have their 
counterpart in the study of the Euler-Lagrange equations for a singular Lagrangian in the 
second-order formalism. Ref. [22] also gives the state of the art for the much more difficult and still 
incomplete first-order formulation of the theory on the tangent space $TQ$ or on the first jet bundle 
$J^1(Q) \sim TQ \times R$, while Ref. [21] gives the connection with BRS theory. 

B) Now I will delineate the main steps for the determination of the Dirac observables for the 
case in which only primary first class constraints $\phi_\alpha \approx 0$ are present at the Hamiltonian level. 

The Euler-Lagrange equations associated with a singular Lagrangian do not determine the
gauge part of the extremals. However, this part cannot be totally arbitrary: it must be compatible
with the algebraic properties of the Noether gauge transformations induced by the first class
constraints, under which the action is either invariant or quasi-invariant, as implied by the second
Noether theorem. In the Hamiltonian formulation these properties are contained in the structure
constants, or functions, of the Poisson brackets of the first-class constraints among themselves
[$\{\phi_\alpha, \phi_\beta\} = C_{\alpha\beta\gamma}\,\phi_\gamma$, $\{\phi_\alpha, H_c\} = C_{\alpha\beta}\,\phi_\beta$], and the gauge arbitrariness of the trajectories is described
by the Dirac multipliers appearing in the Dirac Hamiltonian. In both formulations one has to add
extra equations, either the Lagrangian or the Hamiltonian multitemporal equations[20], to have a
consistent determination of the gauge part of the trajectory (see the generalized Lie equations of
Ref.[25]). These equations are obtained by rewriting the variables $q^i(t)$, $p_i(t)$ in the form $q^i(t, \tau_\alpha)$,
$p_i(t, \tau_\alpha)$, and by assuming that the original t-evolution generated by the Dirac Hamiltonian
$H_D = H_c + \sum_\alpha \lambda_\alpha(t)\phi_\alpha$ is replaced by: i) a deterministic t-evolution generated by $H_c$; ii) a $\tau_\alpha$-evolution
(reabsorbing the arbitrary Dirac multipliers $\lambda_\alpha(t)$), for each $\alpha$, generated in a suitable way by
the first class constraints $\phi_\alpha$. The $\tau_\alpha$-dependence of $q^i$, $p_i$ determined by these multitemporal
(or better multiparametric) equations, which are integrable due to the first-class property of the
constraints, describes their dependence on the gauge orbit containing the given Cauchy data for the
Hamilton-Dirac equations. From the point of view of the study of the multitemporal equations,
the secondary first class constraints are treated like the primary ones, namely as if there were
associated extra Dirac multipliers, and one should use as canonical Hamiltonian $H_c$ restricted to
zero value of the secondary constraints.
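
As a minimal illustration of the multitemporal equations, consider a toy model that is not taken from the references and is chosen here only for clarity: a phase space with coordinates $q^1, q^2, p_1, p_2$, canonical Hamiltonian $H_c = p_2^2/2$ and one primary first class constraint $\phi = p_1 \approx 0$. Assuming the convention $\{q^i, p_j\} = \delta^i_j$, one has $\{\phi, H_c\} = 0$, so the $t$- and $\tau$-flows commute and the equations are integrable:

```latex
\begin{aligned}
\frac{\partial A}{\partial t} &= \{A, H_c\}, \qquad
\frac{\partial A}{\partial \tau} = \{A, \phi\}, \qquad A = q^1, q^2, p_1, p_2, \\
q^1(t,\tau) &= q^1_0 + \tau, \qquad q^2(t,\tau) = q^2_0 + p_{2,0}\, t, \\
p_1(t,\tau) &= p_{1,0} \approx 0, \qquad p_2(t,\tau) = p_{2,0}.
\end{aligned}
```

In this sketch the gauge orbit through the Cauchy data is the line swept by $\tau$ at fixed $t$. A trajectory of the Dirac Hamiltonian $H_D = H_c + \lambda(t)\phi$ is recovered along the curve $\tau(t) = \int_0^t \lambda(t')\,dt'$, while $q^2$ and $p_2$ do not depend on $\tau$.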

When the Poisson brackets of the Hamiltonian first class constraints imply a canonical realiza- 
tion of a Lie algebra, the extra Hamiltonian multitemporal equations have the first class constraints 
as Hamiltonians (so that the Dirac Hamiltonian is reduced to the canonical Hamiltonian) and the 
time parameters (replacing the Dirac multipliers) are the coordinates of a group manifold for a Lie 
group whose algebra is the given Lie algebra: they enter in the multitemporal equations via a set of
left invariant vector fields $Y_\alpha$ on the group manifold [$Y_\alpha A(q,p) = \{A(q,p), \phi_\alpha\}$]. In the ideal case
in which the gauge foliation of $\bar\gamma$ is nice, all the leaves (or gauge orbits) are diffeomorphic and in
the simplest case all of them are diffeomorphic to the group manifold of a Lie group. In this ideal 
case to rebuild a gauge orbit from one of its points (and therefore to determine the gauge part of 
the trajectories passing through that point) one needs the Lie equations associated with the given 
Lie group: the Hamiltonian multitemporal equations are generalized Lie equations describing all 
the gauge orbits simultaneously. In a generic case this description holds only locally for a set of 
diffeomorphic orbits, also in the case of systems invariant under diffeomorphisms. 

Once one has solved the multitemporal equations, the next step is the determination of a Shanmugadhasan
canonical transformation[26]. In the finite dimensional case general theorems[27]
connected with the Lie theory of function groups[28] ensure the existence of local canonical transformations
from the original canonical variables $q^i$, $p_i$, in terms of which the first class constraints
(assumed globally defined) have the form $\phi_\alpha(q,p) \approx 0$, to canonical bases $\bar P_\alpha$, $\bar Q_\alpha$, $\bar P_A$, $\bar Q_A$ such
that: the equations $\bar P_\alpha \approx 0$ locally define the same original constraint manifold (the $\bar P_\alpha$ are an
Abelianization of the first class constraints); the $\bar Q_\alpha$ are the adapted Abelian gauge variables describing
the gauge orbits (they are a realization of the times $\tau_\alpha$ of the multitemporal equations
in terms of variables $q^i$, $p_i$); the $\bar Q_A$, $\bar P_A$ are an adapted canonical basis of Dirac observables.
These canonical transformations are the basis of the Hamiltonian definition of the Faddeev-Popov
measure of the path integral[29] and give a trivialization of the BRS construction of observables
(the BRS method works when the first class constraints may be Abelianized[30]). Therefore the
problem of the search for the Dirac observables becomes the problem of finding Shanmugadhasan
canonical transformations. The strategy is to find Abelianizations $\bar P_\alpha$ of the original constraints, to
solve the multitemporal equations for $q^i$, $p_i$ associated with the $\bar P_\alpha$, to determine the multitimes
$\bar Q_\alpha = \tau_\alpha$ and to identify the Dirac observables $\bar P_A$, $\bar Q_A$ from the remaining original variables,
i.e. from those combinations of them that are independent of $\bar P_\alpha$ and $\bar Q_\alpha$. Second class constraints, when
present, are also taken into account by the Shanmugadhasan canonical transformation[26].
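
A minimal worked example may make the structure of such a transformation concrete. It is a toy model introduced here for illustration (it does not come from Refs. [26-28]): two degrees of freedom $q^1, q^2, p_1, p_2$ with the single first class constraint $\phi = p_1 + p_2 \approx 0$, assuming $\{q^i, p_j\} = \delta^i_j$. A linear Shanmugadhasan-type basis is then:

```latex
\begin{aligned}
\bar P &= p_1 + p_2 \approx 0, & \bar Q &= \tfrac{1}{2}\,(q^1 + q^2), \\
\bar P_A &= \tfrac{1}{2}\,(p_1 - p_2), & \bar Q_A &= q^1 - q^2, \\
\{\bar Q, \bar P\} &= \{\bar Q_A, \bar P_A\} = 1, &
\{\bar Q, \bar P_A\} &= \{\bar Q_A, \bar P\} = \{\bar Q, \bar Q_A\} = \{\bar P, \bar P_A\} = 0 .
\end{aligned}
```

Here $\bar Q$ is the gauge variable (shifted by the flow of $\phi$), the gauge fixing $\bar Q \approx 0$ selects one point on each orbit, and $\bar Q_A$, $\bar P_A$ have vanishing Poisson brackets with $\phi$, so they play the role of the Dirac observables of this toy model.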

Setting the Abelianized gauge variables equal to zero, one defines a local gauge of the model. If a
system with constraints admits one (or more) global Shanmugadhasan canonical transformations, 
one obtains one (or more) privileged global gauges in which the physical Dirac observables are 
globally defined and globally separated from the gauge degrees of freedom [for systems with a 
compact configuration space Q this is impossible]. These privileged gauges (when they exist) can 
be called generalized Coulomb gauges. When the system under investigation has some global 
symmetry group, the associated theory of the momentum map[31] is a source of globality. 

C) Now all the physical systems defined in flat Minkowski spacetime have the global
Poincaré symmetry. This suggests studying the structure of the constraint manifold $\bar\gamma$ from the
point of view of the orbits of the Poincaré group. If $p^\mu$ is the total momentum of the system,
the constraint manifold has to be divided into four strata (some of them may be absent for certain
systems) according to whether $p^2 > 0$, $p^2 = 0$, $p^2 < 0$ or $p^\mu = 0$. Due to the different little groups
of the various Poincaré orbits, the gauge orbits of different sectors will not be diffeomorphic.
Therefore the manifold $\bar\gamma$ is a stratified manifold and the gauge foliations of relativistic systems
are almost never nice: one rather has to deal with singular foliations.
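
The four-way split above can be sketched in a few lines of Python. This is only an illustrative helper, not part of the formalism: the function names are hypothetical, the metric signature (+,-,-,-) is assumed, and exact (integer or rational) components are assumed so that the comparison with zero is meaningful.

```python
def minkowski_square(p):
    """Return p^2 = (p^0)^2 - |p_vec|^2 for p = (p0, px, py, pz), signature (+,-,-,-)."""
    p0, px, py, pz = p
    return p0 * p0 - px * px - py * py - pz * pz


def poincare_stratum(p):
    """Name the stratum selected by the total 4-momentum p (exact arithmetic assumed)."""
    if all(component == 0 for component in p):
        return "p^mu = 0"
    p2 = minkowski_square(p)
    if p2 > 0:
        return "p^2 > 0"   # timelike orbits: the main stratum
    if p2 == 0:
        return "p^2 = 0"   # lightlike orbits (p^mu itself is nonzero here)
    return "p^2 < 0"       # spacelike orbits: tachyonic, excluded for acceptable systems


if __name__ == "__main__":
    for p in [(5, 3, 0, 0), (1, 1, 0, 0), (0, 1, 0, 0), (0, 0, 0, 0)]:
        print(p, "->", poincare_stratum(p))
```

The case $p^\mu = 0$ is tested first because it also satisfies $p^2 = 0$ but belongs to a different Poincaré orbit.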

For an acceptable relativistic system the stratum $p^2 < 0$ has to be absent to avoid tachyons. To
study the strata $p^2 = 0$ and $p^\mu = 0$ one has to add these relations as extra constraints. For all the
strata the next step (see however the next Section) is to do a canonical transformation from the
original variables to a new set consisting of center-of-mass variables $x^\mu$, $p^\mu$ and of variables relative
to the center of mass. Let us now consider the stratum $p^2 > 0$. By using the standard Wigner
boost $L^\mu{}_\nu(p, \overset{\circ}{p})$ [$p^\mu = L^\mu{}_\nu(p, \overset{\circ}{p})\,\overset{\circ}{p}{}^\nu$, $\overset{\circ}{p}{}^\mu = \eta\sqrt{p^2}\,(1; \vec 0)$, $\eta = \mathrm{sign}\, p^0$], one boosts the relative variables
at rest. The new variables are still canonical and the basis is completed by $p^\mu$ and by a new
center-of-mass coordinate $\tilde x^\mu$, differing from $x^\mu$ by spin terms. The variable $\tilde x^\mu$ has complicated
covariance properties; instead the new relative variables are either Poincaré scalars or Wigner
spin-1 vectors, transforming under the group O(3)(p) of the Wigner rotations induced by the
Lorentz transformations. A final canonical transformation[32], leaving fixed the relative variables,
sends the center-of-mass coordinates $\tilde x^\mu$, $p^\mu$ into the new set: $p\cdot\tilde x/\eta\sqrt{p^2} = p\cdot x/\eta\sqrt{p^2}$ (the time
in the rest frame), $\eta\sqrt{p^2}$ (the total mass), $\vec k = \vec p/\eta\sqrt{p^2}$ (the spatial components of the 4-velocity
$k^\mu = p^\mu/\eta\sqrt{p^2}$, $k^2 = 1$), $\vec z = \eta\sqrt{p^2}\,(\vec{\tilde x} - \tilde x^0\, \vec p/p^0)$. $\vec z$ is a noncovariant center-of-mass canonical
3-coordinate multiplied by the total mass: it is the classical analog of the Newton-Wigner position
operator (like it, $\vec z$ is covariant only under the little group O(3)(p) of the timelike Poincaré orbits).
Analogous considerations could be made for the other sectors. In Ref. [33] there is the definition
of other canonical bases, the spin bases, adapted to the spin Casimir of the Poincaré group.
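
The standard Wigner boost used above can be written down explicitly. The following Python sketch is an illustration under stated assumptions, not the construction of Ref. [32]: it assumes signature (+,-,-,-), restricts to the branch $\eta = +1$ (future-pointing timelike $p^\mu$), and uses hypothetical function names. It builds the textbook pure boost that sends the rest-frame momentum $(m; \vec 0)$ to $p^\mu$ and checks that it is a Lorentz matrix.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric, signature (+,-,-,-)


def wigner_boost(p):
    """Standard boost sending (m, 0, 0, 0) to a timelike, future-pointing p = (p0, px, py, pz).

    Returns the 4x4 matrix L (acting on contravariant components) and m = sqrt(p^2).
    """
    p0, pv = p[0], list(p[1:])
    m2 = p0 * p0 - sum(c * c for c in pv)
    if m2 <= 0 or p0 <= 0:
        raise ValueError("p must be timelike and future-pointing")
    m = math.sqrt(m2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = p0 / m
    for i in range(3):
        L[0][i + 1] = pv[i] / m
        L[i + 1][0] = pv[i] / m
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + pv[i] * pv[j] / (m * (p0 + m))
    return L, m


def apply_boost(L, v):
    """Matrix-vector product L v on 4-component vectors."""
    return [sum(L[mu][nu] * v[nu] for nu in range(4)) for mu in range(4)]


def is_lorentz(L, tol=1e-9):
    """Check L^T eta L = eta, i.e. that L preserves the Minkowski scalar product."""
    for a in range(4):
        for b in range(4):
            s = sum(L[mu][a] * ETA[mu] * L[mu][b] for mu in range(4))
            target = ETA[a] if a == b else 0.0
            if abs(s - target) > tol:
                return False
    return True


if __name__ == "__main__":
    L, m = wigner_boost((5.0, 0.0, 3.0, 0.0))
    print("mass:", m)
    print("boosted rest momentum:", apply_boost(L, (m, 0.0, 0.0, 0.0)))
    print("Lorentz matrix:", is_lorentz(L))
```

The first column of the matrix is the 4-velocity $k^\mu = p^\mu/\sqrt{p^2}$, and the three remaining columns are spacelike vectors orthogonal to it, which is why quantities boosted at rest with this matrix carry Wigner spin-1 indices.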

The nature of the relative variables depends on the system. The first class constraints, once
rewritten in terms of the new variables, can be manipulated to find suitable global and Lorentz
scalar Abelianizations. Usually there is a combination of the constraints which determines $\eta\sqrt{p^2}$,
i.e. the mass spectrum, so that the time in the rest frame $p\cdot\tilde x/\eta\sqrt{p^2}$ is the conjugate Lorentz
scalar gauge variable. The other constraints eliminate some of the relative variables (in particular
the relative energies for systems of interacting relativistic particles and the string): their conjugate
coordinates (the relative times) are the other gauge variables, which are identified with a possible
set of time parameters by the multitemporal equations. The Dirac observables (apart from the
center-of-mass ones $\vec k$ and $\vec z$) have to be extracted from the remaining relative variables and the
construction shows that they will be either Poincaré scalars or Wigner covariant objects. In this
way in each stratum preferred global Shanmugadhasan canonical transformations are identified,
when no other kind of obstruction to globality is present inside the various strata.

D) In gauge field theories the situation is more complicated, because the theorems ensuring the
existence of the Shanmugadhasan canonical transformation have not been extended to the infinite-dimensional
case. One of the reasons is that some of the constraints can now be interpreted as
elliptic equations and they can have zero modes. Let us consider the stratum $p^2 > 0$ of free Yang-Mills
theory as a prototype and its first class constraints, given by the Gauss laws and by the
vanishing of the time components of the canonical momenta. The problem of the zero modes will
appear as a singularity structure of the gauge foliation of the allowed strata, in particular of the
stratum $p^2 > 0$. This phenomenon was discovered in Ref. [34] by studying the space of solutions
of Yang-Mills and Einstein equations, which can be mapped onto the constraint manifold of these
theories in their Hamiltonian description. It turns out that the space of solutions has a “cone over
cone” structure of singularities: if we have a line of solutions with a certain number of symmetries,
in each point of this line there is a cone of solutions with one less symmetry. In the Yang-Mills 
case the “gauge symmetries” of a gauge potential are connected with the generators of its stability 
group, i.e. with the subgroup of those special gauge transformations which leave invariant that 
gauge potential (this is the Gribov ambiguity for gauge potentials; there is also a more general 
Gribov ambiguity for field strengths, the “gauge copies” problem). Since the Gauss laws are the 
generators of the gauge transformations (and depend on the chosen gauge potential through the 
covariant derivative), this means that for a gauge potential with non trivial stability group those 
combinations of the Gauss laws corresponding to the generators of the stability group cannot be any 
more first class constraints, since they do not generate effective gauge transformations but special 
symmetry transformations. This issue has still to be clarified, but it seems that in this
case these components of the Gauss laws become third class constraints, which are not generators
of true gauge transformations. This new kind of constraints was introduced in Refs. [19, 22] in the
finite dimensional case as a result of the study of some examples, in which the Jacobi equations
(the linearization of the Euler-Lagrange equations) are singular, i.e. some of their solutions are
not infinitesimal deviations between two neighbouring extremals of the Euler-Lagrange equations.
This interpretation seems to be confirmed by the fact that the singularity structure discovered
in Ref. [34] follows from the existence of singularities of the linearized Yang-Mills and Einstein
equations. These problems are part of the Gribov ambiguity, which, as a consequence, induces an
extremely complicated stratification and also singularities in each Poincaré stratum of $\bar\gamma$.

Other possible sources of singularities of the gauge foliation of Yang-Mills theory in the stratum
$p^2 > 0$ may be: i) different classes of gauge potentials identified by different values of the field
invariants; ii) the orbit structure of the rest frame (or Thomas) spin $\vec S$, identified by the Pauli-Lubanski
Casimir of the Poincaré group.

The final outcome of this structure of singularities is that the reduced phase-space, i.e. the 
space of the gauge orbits, is in general a stratified manifold with singularities[16]. In the stratum
$p^2 > 0$ of the Yang-Mills theory these singularities survive the Wick rotation to the Euclidean
formulation and it is not clear how the ordinary path integral approach and the associated BRS
method can take them into account. The search for a global canonical basis of Dirac observables
for each stratum of the space of the gauge orbits can give a definition of the measure of the phase 
space path integral, but at the price of a non polynomial Hamiltonian. Therefore, if it is not 
possible to eliminate the Gribov ambiguity (assuming that it is only a mathematical obstruction 
without any hidden physics), the existence of global Dirac observables for Yang-Mills theory is 
very problematic. 

E) Firstly, inspired by Ref. [35] where a canonical basis of Dirac observables was found for 
the electromagnetic field interacting with a fermion field (whose Dirac observable is a fermion 
field dressed with a Coulomb cloud), the canonical reduction to noncovariant generalized Coulomb 
gauges, with the determination of the physical Hamiltonian as a function of a canonical basis of 
Dirac’s observables, has been achieved for the following isolated systems (for them one asks that
the 10 conserved generators of the Poincaré algebra are finite so as to be able to use group theory;
theories with external fields can only be recovered as limits in some parameter of a subsystem of 
the isolated system): 

1) Relativistic particle mechanics. Its importance stems from the fact that quantum field
theory has no particle interpretation: this is forced on it by means of the asymptotic states of
the reduction formalism which correspond to the quantization of independent one-body systems
described by relativistic mechanics [or relativistic pseudoclassical mechanics [36], when one adds
Grassmann variables to describe the intrinsic spin]. Besides the scalar particle ($p^2 - m^2 \approx 0$ or
$p^2 \approx 0$), one has control on: i) the pseudoclassical electron[37] ($p_\mu\xi^\mu - m\xi_5 \approx 0$ or $p_\mu\xi^\mu \approx 0$,
where $\xi^\mu$, $\xi_5$ are Grassmann variables; $p^2 - m^2 \approx 0$ or $p^2 \approx 0$ are implied; after quantization the
Dirac equation is reproduced); ii) the pseudoclassical neutrino[38] (whose constraints, including
$p^2 \approx 0$, give the Weyl particle wave equation $p_\mu\gamma^\mu(1 - \gamma_5)\psi(x) = 0$ after quantization); iii)
the pseudoclassical photon[39] ($p^2 \approx 0$, $p_\mu\theta^\mu \approx 0$, $p_\mu\theta^{*\mu} \approx 0$, $\theta^*_\mu\theta^\mu \approx 0$, where $\theta^\mu$, $\theta^{*\mu}$ are a pair
of complex Grassmann four-vectors to describe helicity $\pm 1$; after quantization one obtains the
photon wave equations $\Box A^\mu(x) = 0$, $\partial_\mu A^\mu(x) = 0$; the Berezin-Marinov Grassmann distribution
function allows one to recover the classical polarization matrix of classical light and, in quantization, the
quantum polarization matrix with the Stokes parameters); iv) the vector particle or pseudoclassical
massive photon[40] [$p^2 - \mu^2 + (1-\lambda)\,p_\mu\theta^{*\mu}\,p_\nu\theta^\nu \approx 0$, $\theta^*_\mu\theta^\mu \approx 0$, which, after quantization, reproduce
the Proca-like wave equation $(\Box + \mu^2)A^\mu(x) - (1-\lambda)\,\partial^\mu\partial_\nu A^\nu(x) = 0$].

The most important two-body system is the Droz-Vincent-Todorov-Komar model [41] with an
arbitrary action-at-a-distance interaction instantaneous in the rest frame as shown by its energy-momentum
tensor[42] [$p_i^2 - m_i^2 + V(r_\perp^2) \approx 0$, i=1,2, $r_\perp^\mu = (\eta^{\mu\nu} - p^\mu p^\nu/p^2)\,r_\nu$, $r^\mu = x_1^\mu - x_2^\mu$,
$p_\mu = p_{1\mu} + p_{2\mu}$]. This model has been completely understood both at the classical and quantum
level [32]. Its study led to the identification of a class of canonical transformations (utilizing the
standard Wigner boost for timelike Poincaré orbits) which made it possible to understand how to define
suitable center-of-mass and relative variables (in particular a suitable relative energy is determined
by a combination of the two first class constraints, so that the relative time variable is a gauge
variable), how to find a quasi-Shanmugadhasan canonical transformation adapted to the constraint
determining the relative energy, and how to separate the four, topologically disjoint, branches of the
mass spectrum (it is determined by the other independent combination of the constraints; therefore,
there is a distinct Shanmugadhasan canonical transformation for each branch). At the quantum
level it was possible to find four physical scalar products, compatible with both the resulting
coupled wave equations (i.e. independent of the relative and the absolute rest-frame times):
they have been found as generalizations of the two existing scalar products of the Klein-Gordon
equation: all of them are non-local even in the limiting free case and differ among themselves in
the sign of the norm of states on different mass-branches. This example shows that the physical
scalar product knows the functional form of the constraints.

The connection with the Bethe-Salpeter equation of the quantized model has been studied in 
Ref. [43], where it is shown that the constraint wave function can be obtained from the Bethe-Salpeter
one by multiplication by a delta function containing the relative energy to exclude its
spurious solutions (non physical excitations in the relative energy). The extension of the model to
two pseudoclassical electrons and to an electron and a scalar has been done in Ref. [44], and the 
first was used to get good fits to meson spectra. 

The previous canonical transformations were then extended to N free particles described by N
mass-shell first class constraints $p_i^2 - m_i^2 \approx 0$ [45]: N-1 suitable relative energies are determined by
N-1 combinations of the constraints (so that the conjugate N-1 relative times are gauge variables),
while the remaining combination determines the $2^N$ branches of the mass spectrum. The N gauge
freedoms associated with these N combinations of the first class constraints are the freedom of the
observer: i) in the choice of the time parameter to be used for the overall evolution of the isolated
system; ii) in the choice of the description of the relative motions with any given delay among the
pairs of particles.
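
The counting of $2^N$ branches can be illustrated numerically. The sketch below is a hedged illustration, not the construction of Ref. [45]: it assumes that the branches are labelled by the sign of each particle's energy, that the spatial momenta are given in the rest frame (so they sum to zero), and it uses hypothetical function names.

```python
import math
from itertools import product


def free_mass_branches(masses, rest_frame_momenta, tol=1e-9):
    """Total rest-frame mass of N free particles for every choice of energy signs.

    masses: list of m_i; rest_frame_momenta: list of 3-vectors k_i with sum zero.
    Returns a dict mapping each sign tuple (+1/-1 per particle) to sum_i sign_i * sqrt(m_i^2 + k_i^2).
    """
    total = [sum(k[a] for k in rest_frame_momenta) for a in range(3)]
    if any(abs(c) > tol for c in total):
        raise ValueError("momenta must sum to zero in the rest frame")
    energies = [
        math.sqrt(m * m + sum(c * c for c in k))
        for m, k in zip(masses, rest_frame_momenta)
    ]
    return {
        signs: sum(s * e for s, e in zip(signs, energies))
        for signs in product((1, -1), repeat=len(masses))
    }


if __name__ == "__main__":
    branches = free_mass_branches([3.0, 7.5], [(4.0, 0.0, 0.0), (-4.0, 0.0, 0.0)])
    for signs, mass in branches.items():
        print(signs, mass)
```

For two particles this gives the four disjoint branches mentioned for the two-body model; only the branches with positive total mass belong to the sector with $\eta = +1$.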

In Ref. [46] 2- and N-body Newton mechanics was reformulated in a multitemporal way in 
terms of N first class constraints obtained from the relativistic ones in the limit $c \to \infty$. After
a comparison with predictive mechanics, it was shown that the “no-interaction theorem” (namely
that the multitemporal configurational and canonical position coordinates of a particle coincide 
only in absence of interactions) exists also at the nonrelativistic level, being a property of the 
multitemporal description of particles and not of the kinematical symmetry group. 

2) Both the open and closed Nambu string, after an initial study with light-cone coordinates, 
have been treated[47] along the lines of the two-body model in the stratum $p^2 > 0$. Both Abelian
Lorentz scalar constraints and gauge variables have been found and globally decoupled, and a
redundant set of Dirac’s observables [$\vec z$, $\vec k$, $\vec a_n$] has been found. It remains an open problem whether
one can extract a global canonical basis of Dirac’s observables from the Wigner spin 1 vectors
$\vec a_n$, which satisfy sigma-model-like constraints; if this basis exists, it would define the Liouville
integrability of the Nambu string and would clarify whether there is any way to quantize it in four 
dimensions. 

3) Yang-Mills theory with Grassmann-valued fermion fields [48] in the case of a trivial principal
bundle over a fixed-$x^0$ slice of Minkowski spacetime with suitable Hamiltonian-oriented
boundary conditions; this excludes monopole solutions (to have them, even if they have not
yet been found experimentally, one needs a nontrivial bundle and a variational principle formulated on
the bundle[49], because the gauge potentials on Minkowski spacetime are not globally defined)
and, since $R^3$ is not compactified, one has only winding number and no instanton number. After
a discussion of the Hamiltonian formulation of Yang-Mills theory, of its group of gauge transformations
and of the Gribov ambiguity, the theory has been studied in suitable weighted Sobolev
spaces where the Gribov ambiguity is absent [50] and the global color charges are well defined. The
global Dirac observables are the transverse quantities $\vec A_{a\perp}(\vec x, x^0)$, $\vec E_{a\perp}(\vec x, x^0)$ and fermion fields
dressed with Yang-Mills (gluonic) clouds. The nonlocal and nonpolynomial (due to the presence of
classical Wilson lines along flat geodesics) physical Hamiltonian has been obtained: it is nonlocal
but without any kind of singularities, it has the correct Abelian limit if the structure constants are
turned off, and it contains the explicit realization of the abstract Mitter-Viallet metric.

4) The Abelian and non-Abelian SU(2) Higgs models with fermion fields[51], where the symplectic
decoupling is a refinement of the concept of unitary gauge. There is an ambiguity in the
solutions of the Gauss law constraints, which reflects the existence of disjoint sectors of solutions
of the Euler-Lagrange equations of Higgs models. The physical Hamiltonian and Lagrangian of the
Higgs phase have been found; the self-energy turns out to be local and contains a local four-fermion
interaction.

5) The standard SU(3)×SU(2)×U(1) model of elementary particles[52] with
Grassmann-valued fermion fields. The final reduced Hamiltonian contains nonlocal self-energies
for the electromagnetic and color interactions, but “local ones” for the weak interactions, implying
the nonperturbative emergence of 4-fermion interactions.

F) When a good description of the system in terms of Dirac observables exists, one faces
the problem of quantizing only the true physical degrees of freedom, which generically are
nonlinear and nonlocal functions or functionals of the original variables. When a quantization is
possible, there is a high probability of getting a quantum theory inequivalent to that obtained by first
quantizing the original variables and then making the reduction to the physical degrees of freedom
at the quantum level (see for instance the BRS method).

With regard to field theory, this method has the drawback that generically the physical Hamiltonian,
and therefore also the Lagrangian, is non polynomial in the physical degrees of freedom.
Power counting methods cannot be used when looking for regularizations and renormalizations of
the theory, and the advantages of a global control of the dynamics of physical quantities and of
the possibility to check whether a model is classically well defined are destroyed by our present
inability to solve these problems. The question, which puzzled both Dirac and Yukawa, reappears,
whether it is possible to define an intrinsic ultraviolet cutoff and a regularization scheme
independent from the power counting.


2 The Separation of the Center of Mass in Special Relativity,
the Rest-Frame Instant Form of Dynamics and
Wigner-Covariant Generalized Coulomb Gauges.

The next problem is how to covariantize these results valid in Minkowski spacetime with Cartesian
coordinates. Again the starting point was given by Dirac[1] with his reformulation of classical field
theory on spacelike hypersurfaces foliating Minkowski spacetime $M^4$ [the foliation is defined by an
embedding $R \times \Sigma \to M^4$, $(\tau, \vec\sigma) \mapsto z^{(\mu)}(\tau,\vec\sigma) \in \Sigma_\tau$, with $\Sigma$ an abstract 3-surface diffeomorphic to $R^3$,
with $\Sigma_\tau$ its copy embedded in $M^4$ labelled by the value $\tau$ (the Minkowski flat indices are $(\mu)$;
the scalar “time” parameter $\tau$ labels the leaves of the foliation, $\vec\sigma$ are curvilinear coordinates on $\Sigma_\tau$
and $(\tau,\vec\sigma)$ are $\Sigma_\tau$-adapted holonomic coordinates for $M^4$); this is the classical basis of Tomonaga-Schwinger
quantum field theory]. In this way one gets a parametrized field theory with a covariant
3+1 splitting of Minkowski spacetime and already in a form suited to the transition to general relativity
in its ADM canonical formulation (see also Ref.[53], where a theoretical study of this problem
is done in curved spacetimes). The price is that one has to add as new independent configuration
variables the embedding coordinates $z^{(\mu)}(\tau,\vec\sigma)$ of the points of the spacelike hypersurface $\Sigma_\tau$ [the
only ones carrying Lorentz indices] and then to define the fields on $\Sigma_\tau$ so that they know the hypersurface
$\Sigma_\tau$ of $\tau$-simultaneity [for a Klein-Gordon field $\phi(x)$, this new field is $\tilde\phi(\tau,\vec\sigma) = \phi(z(\tau,\vec\sigma))$:
it contains the nonlocal information about the embedding]. Then one rewrites the Lagrangian of
the given isolated system in the form required by the coupling to an external gravitational field,
makes the previous 3+1 splitting of Minkowski spacetime and interprets all the fields of the system
as the new fields on $\Sigma_\tau$ (they are Lorentz scalars, having only surface indices). Instead of
considering the 4-metric as describing a gravitational field (and therefore as an independent field
as it is done in metric gravity, where one adds the Hilbert action to the action for the matter
fields), here one replaces the 4-metric with the induced metric $g_{AB}[z] = z^{(\mu)}_A\, \eta_{(\mu)(\nu)}\, z^{(\nu)}_B$ on $\Sigma_\tau$
[a functional of $z^{(\mu)}$; here we use the notation $\sigma^A = (\tau, \sigma^r)$; $z^{(\mu)}_A = \partial z^{(\mu)}/\partial\sigma^A$ are flat tetrad fields
on Minkowski spacetime with the $z^{(\mu)}_r$’s tangent to $\Sigma_\tau$] and considers the embedding coordinates
$z^{(\mu)}(\tau,\vec\sigma)$ as independent fields [this is not possible in metric gravity, because in curved spacetimes
$z^\mu_A \neq \partial z^\mu/\partial\sigma^A$ are not tetrad fields so that holonomic coordinates $z^\mu(\tau,\vec\sigma)$ do not exist]. From
this Lagrangian, besides a Lorentz-scalar form of the constraints of the given system, we get four
extra primary first class constraints



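$$\mathcal{H}_{(\mu)}(\tau,\vec\sigma) = \rho_{(\mu)}(\tau,\vec\sigma) - l_{(\mu)}(\tau,\vec\sigma)\,T^{\tau\tau}_{sys}(\tau,\vec\sigma) - z_{r(\mu)}(\tau,\vec\sigma)\,T^{\tau r}_{sys}(\tau,\vec\sigma) \approx 0$$

[here $T^{\tau\tau}_{sys}(\tau,\vec\sigma)$, $T^{\tau r}_{sys}(\tau,\vec\sigma)$ are the components of the energy-momentum tensor in the holonomic
coordinate system, corresponding to the energy- and momentum-density of the isolated system;
one has $\{\mathcal{H}_{(\mu)}(\tau,\vec\sigma), \mathcal{H}_{(\nu)}(\tau,\vec\sigma')\} = 0$], implying the independence of the description from the choice
of the 3+1 splitting, i.e. from the choice of the foliation with spacelike hypersurfaces. The evolution
vector is given by $z^{(\mu)}_\tau = N_{[z](flat)}\, l^{(\mu)} + N^r_{[z](flat)}\, z^{(\mu)}_r$, where $l^{(\mu)}(\tau,\vec\sigma)$ is the normal to
$\Sigma_\tau$ in $z^{(\mu)}(\tau,\vec\sigma)$ and $N_{[z](flat)}(\tau,\vec\sigma)$, $N^r_{[z](flat)}(\tau,\vec\sigma)$ are the flat lapse and shift functions defined
through the metric as in general relativity: however, now they are not independent variables but
functionals of $z^{(\mu)}(\tau,\vec\sigma)$.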

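The Dirac Hamiltonian contains the piece $\int d^3\sigma\, \lambda^{(\mu)}(\tau,\vec\sigma)\,\mathcal{H}_{(\mu)}(\tau,\vec\sigma)$ with $\lambda^{(\mu)}(\tau,\vec\sigma)$ Dirac multipliers.
It is possible to rewrite the integrand in the form [$\gamma^{rs}$ is the inverse of ${}^3g_{rs}$]

$$\lambda_{(\mu)}\mathcal{H}^{(\mu)} = \big[(\lambda_{(\mu)} l^{(\mu)})(l_{(\nu)}\mathcal{H}^{(\nu)}) - (\lambda_{(\mu)} z^{(\mu)}_r)(\gamma^{rs} z_{s(\nu)}\mathcal{H}^{(\nu)})\big](\tau,\vec\sigma) = N_{(flat)}(\tau,\vec\sigma)\,(l_{(\mu)}\mathcal{H}^{(\mu)})(\tau,\vec\sigma) - N_{(flat)r}(\tau,\vec\sigma)\,(\gamma^{rs} z_{s(\nu)}\mathcal{H}^{(\nu)})(\tau,\vec\sigma)$$

with the (nonholonomic form of the) constraints $(l_{(\mu)}\mathcal{H}^{(\mu)})(\tau,\vec\sigma) \approx 0$,
$(\gamma^{rs} z_{s(\mu)}\mathcal{H}^{(\mu)})(\tau,\vec\sigma) \approx 0$, satisfying the universal Dirac algebra of the ADM constraints. In this way we have
defined new flat lapse and shift functions

$$N_{(flat)}(\tau,\vec\sigma) = \lambda_{(\mu)}(\tau,\vec\sigma)\, l^{(\mu)}(\tau,\vec\sigma), \qquad N_{(flat)r}(\tau,\vec\sigma) = \lambda_{(\mu)}(\tau,\vec\sigma)\, z^{(\mu)}_r(\tau,\vec\sigma),$$

which have the same content as the arbitrary Dirac multipliers $\lambda_{(\mu)}(\tau,\vec\sigma)$, namely they multiply
primary first class constraints satisfying the Dirac algebra. In
Minkowski spacetime they are quite distinct from the previous lapse and shift functions $N_{[z](flat)}$,
$N_{[z](flat)r}$, defined starting from the metric. Instead in general relativity the lapse and shift functions
defined starting from the 4-metric are the coefficients (in the canonical part $H_c$ of the Hamiltonian)
of secondary first class constraints satisfying the Dirac algebra.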

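In special relativity, it is convenient to restrict ourselves to arbitrary spacelike hyperplanes
$z^{(\mu)}(\tau,\vec\sigma) = x_s^{(\mu)}(\tau) + b_r^{(\mu)}(\tau)\sigma^r$. Since they are described by only 10 variables, after this restriction
we remain only with 10 first class constraints determining the 10 variables conjugate to the
hyperplane in terms of the variables of the system:

$$\tilde{\mathcal{H}}^{(\mu)}(\tau) = p_s^{(\mu)} - p_{sys}^{(\mu)} \approx 0, \qquad \tilde{\mathcal{H}}^{(\mu)(\nu)}(\tau) = S_s^{(\mu)(\nu)} - S_{sys}^{(\mu)(\nu)} \approx 0.$$

After the restriction to spacelike hyperplanes the previous piece of the Dirac Hamiltonian is reduced
to $\tilde\lambda^{(\mu)}(\tau)\tilde{\mathcal{H}}_{(\mu)}(\tau) - \frac{1}{2}\tilde\lambda^{(\mu)(\nu)}(\tau)\tilde{\mathcal{H}}_{(\mu)(\nu)}(\tau)$. Since at this stage we have $z_r^{(\mu)}(\tau,\vec\sigma) \approx b_r^{(\mu)}(\tau)$,
so that $z_\tau^{(\mu)}(\tau,\vec\sigma) \approx N_{[z](flat)}(\tau,\vec\sigma)\, l^{(\mu)}(\tau,\vec\sigma) + N^r_{[z](flat)}(\tau,\vec\sigma)\, b_r^{(\mu)}(\tau) \approx \dot x_s^{(\mu)}(\tau) + \dot b_r^{(\mu)}(\tau)\sigma^r =
-\tilde\lambda^{(\mu)}(\tau) - \tilde\lambda^{(\mu)(\nu)}(\tau)\, b_{r(\nu)}(\tau)\sigma^r$, it is only now that we get the coincidence of the two definitions
of flat lapse and shift functions (this point was missed in the older treatments of parametrized
Minkowski theories):

$$N_{[z](flat)}(\tau,\vec\sigma) \approx N_{(flat)}(\tau,\vec\sigma) = -\tilde\lambda_{(\mu)}(\tau)\, l^{(\mu)} - l^{(\mu)}\,\tilde\lambda_{(\mu)(\nu)}(\tau)\, b_s^{(\nu)}(\tau)\sigma^s,$$
$$N_{[z](flat)r}(\tau,\vec\sigma) \approx N_{(flat)r}(\tau,\vec\sigma) = -\tilde\lambda_{(\mu)}(\tau)\, b_r^{(\mu)}(\tau) - b_r^{(\mu)}(\tau)\,\tilde\lambda_{(\mu)(\nu)}(\tau)\, b_s^{(\nu)}(\tau)\sigma^s.$$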

The 20 variables for the phase space description of a hyperplane are:

i) $x_s^{(\mu)}(\tau)$, $p_s^{(\mu)}$, parametrizing the origin of the coordinates on the family of spacelike hyperplanes.
The four constraints $\tilde{\mathcal{H}}^{(\mu)}(\tau) \approx 0$ say that $p_s^{(\mu)}$ is determined by the 4-momentum of the isolated
system.

ii) $b_A^{(\mu)}(\tau)$ (with the $b_r^{(\mu)}(\tau)$’s being three orthogonal spacelike unit vectors generating the fixed
$\tau$-independent timelike unit normal $b_\tau^{(\mu)} = l^{(\mu)}$ to the hyperplanes) and $S_s^{(\mu)(\nu)} = -S_s^{(\nu)(\mu)}$, with
the orthonormality constraints $b_A^{(\mu)}\,\eta_{(\mu)(\nu)}\, b_B^{(\nu)} = \eta_{AB}$ [enforced by assuming Dirac brackets among
the $S_s^{(\mu)(\nu)}$ and the $b_A^{(\rho)}$ that involve the structure constants of the Lorentz algebra]. In these variables
there are hidden six independent pairs of degrees of freedom. The six constraints $\tilde{\mathcal{H}}^{(\mu)(\nu)}(\tau) \approx 0$ say that
$S_s^{(\mu)(\nu)}$ coincides with the spin tensor of the isolated system. Then one has that $p_s^{(\mu)}$,
$J_s^{(\mu)(\nu)} = x_s^{(\mu)} p_s^{(\nu)} - x_s^{(\nu)} p_s^{(\mu)} + S_s^{(\mu)(\nu)}$ satisfy the algebra of the Poincaré group.

Let us remark that, for each configuration of an isolated system there is a privileged family of
hyperplanes (the Wigner hyperplanes orthogonal to $p_s^{(\mu)}$, existing when $p_s^2 > 0$) corresponding to
the intrinsic rest-frame of the isolated system. If we choose these hyperplanes with suitable gauge
fixings, we remain with only the four constraints $\tilde{\mathcal{H}}^{(\mu)}(\tau) \approx 0$, which can be rewritten as

$\sqrt{p_s^2} \approx$ [invariant mass of the isolated system under investigation] $= M_{sys}$,

$\vec p_{sys}$ = [3-momentum of the isolated system inside the Wigner hyperplane] $\approx 0$.

There is no longer a restriction on $p_s^{(\mu)}$, because $u_s^{(\mu)}(p_s) = p_s^{(\mu)}/\sqrt{p_s^2}$ gives the orientation of the
Wigner hyperplanes containing the isolated system with respect to an arbitrary given external
observer.

In this special gauge we have $b_A^{(\mu)} \equiv L^{(\mu)}{}_A(p_s, \overset{\circ}{p}_s)$ (the standard Wigner boost for timelike
Poincaré orbits), and the only remaining canonical variables are the noncovariant
Newton-Wigner-like canonical “external” center-of-mass coordinate $\tilde x_s^{(\mu)}(\tau)$ (living on the Wigner
hyperplanes) and $p_s^{(\mu)}$. Now 3 degrees of freedom of the isolated system [an “internal” center-of-mass
3-variable $\vec\sigma_{sys}$ defined inside the Wigner hyperplane and conjugate to $\vec p_{sys}$] become gauge
variables [the natural gauge fixing is $\vec\sigma_{sys} \approx 0$, so that it coincides with the origin $x_s^{(\mu)}(\tau) =
z^{(\mu)}(\tau, \vec\sigma = 0)$ of the Wigner hyperplane], while $\tilde x_s^{(\mu)}$ is playing the role of a kinematical external
center of mass for the isolated system and may be interpreted as a decoupled observer with his
parametrized clock (point particle clock). All the fields living on the Wigner hyperplane are now
either Lorentz scalars or have their 3-indices transforming under Wigner rotations (induced by
Lorentz transformations in Minkowski spacetime) as any Wigner spin 1 index.

One obtains in this way a new kind of instant form of the dynamics (see Ref. [54]), the “Wigner-covariant
1-time rest-frame instant form” [55] with a universal breaking of Lorentz covariance. It
is the special relativistic generalization of the nonrelativistic separation of the center of mass from
the relative motion [$H = \vec P^2/2M + H_{rel}$]. The role of the center of mass is taken by the Wigner
hyperplane, identified by the point $\tilde x_s^{(\mu)}(\tau)$ and by its normal $p_s^{(\mu)}$. The invariant mass $M_{sys}$ of the
system replaces the nonrelativistic Hamiltonian $H_{rel}$ for the relative degrees of freedom, after the
addition of the gauge-fixing $T_s - \tau \approx 0$ [identifying the time parameter $\tau$, labelling the leaves of the
foliation, with the Lorentz scalar time of the center of mass in the rest frame, $T_s = p_s \cdot \tilde x_s/M_{sys}$;
$M_{sys}$ generates the evolution in this time].
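
For orientation, the nonrelativistic separation that is being generalized here can be checked numerically for two particles. The sketch below is only an illustration with hypothetical names: it treats the kinetic part alone (any interaction depending on the relative position would simply be added to the relative term), with total momentum $\vec P = \vec p_1 + \vec p_2$, relative momentum $\vec p_{rel} = (m_2\vec p_1 - m_1\vec p_2)/M$, total mass $M$ and reduced mass $m_1 m_2/M$.

```python
def split_two_body_kinetic(m1, p1, m2, p2):
    """Split p1^2/2m1 + p2^2/2m2 into a center-of-mass part and a relative part.

    Returns (T_cm, T_rel) with T_cm = P^2 / 2M and T_rel = p_rel^2 / 2mu.
    """
    total_mass = m1 + m2
    reduced_mass = m1 * m2 / total_mass
    total_momentum = [a + b for a, b in zip(p1, p2)]
    relative_momentum = [(m2 * a - m1 * b) / total_mass for a, b in zip(p1, p2)]
    t_cm = sum(c * c for c in total_momentum) / (2.0 * total_mass)
    t_rel = sum(c * c for c in relative_momentum) / (2.0 * reduced_mass)
    return t_cm, t_rel


if __name__ == "__main__":
    m1, p1 = 1.0, (1.0, 0.0, 0.0)
    m2, p2 = 3.0, (0.0, 2.0, 0.0)
    t_cm, t_rel = split_two_body_kinetic(m1, p1, m2, p2)
    direct = sum(c * c for c in p1) / (2 * m1) + sum(c * c for c in p2) / (2 * m2)
    print(t_cm, t_rel, t_cm + t_rel, direct)
```

In the rest-frame instant form the analogue of setting $\vec P = 0$ is the constraint $\vec p_{sys} \approx 0$, and the analogue of the relative Hamiltonian is the invariant mass $M_{sys}$.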

The determination of $\vec\sigma_{sys}$ may be done with the group theoretical methods of Ref. [56]: given
a realization on the phase space of a given system of the ten Poincaré generators one can build
three 3-position variables only in terms of them, which in our case of a system on the Wigner
hyperplane with $\vec p_{sys} \approx 0$ are: i) a canonical center of mass (the “internal” center of mass $\vec\sigma_{sys}$);
ii) a noncanonical Møller center of energy $\vec\sigma^{(E)}_{sys}$; iii) a noncanonical Fokker-Pryce center of inertia
$\vec\sigma^{(FP)}_{sys}$. Due to $\vec p_{sys} \approx 0$, we have $\vec\sigma_{sys} \approx \vec\sigma^{(E)}_{sys} \approx \vec\sigma^{(FP)}_{sys}$. By adding the gauge fixings $\vec\sigma_{sys} \approx 0$
one can show that the origin $x^\mu_s(\tau)$ becomes simultaneously the Dixon center of mass of an
extended object and both the Pirani and Tulczyjew centroids (see Ref. [57] for the application
of these methods to find the center of mass of a configuration of the Klein-Gordon field after the
preliminary work of Ref. [58]). With similar methods one can construct three “external” collective
positions (all located on the Wigner hyperplane): i) the “external” canonical noncovariant center
of mass $\tilde x^\mu_s$; ii) the “external” noncanonical and noncovariant Møller center of energy $R^\mu_s$; iii) the
“external” covariant noncanonical Fokker-Pryce center of inertia $Y^\mu_s$ (when there are the gauge
fixings $\vec\sigma_{sys} \approx 0$ it also coincides with the origin $x^\mu_s$). It turns out that the Wigner hyperplane
is the natural setting for the study of the Dixon multipoles of extended relativistic systems [59]
and for defining the canonical relative variables with respect to the center of mass. Now that the
relativistic definitions of the center of mass of an extended system are under control, the still missing
kinematics of relativistic rotations is under investigation. The Wigner hyperplane with its
natural Euclidean metric structure offers a natural solution to the problem of boosts for lattice
gauge theories and realizes explicitly the Machian aspect of dynamics that only relative motions
are relevant.

The isolated systems analyzed so far to get their rest-frame Wigner-covariant generalized
Coulomb gauges [i.e. the subset of global Shanmugadhasan canonical bases, which, for each
Poincaré stratum, are also adapted to the geometry of the corresponding Poincaré orbits with
their little groups; these special bases can be named Poincaré-Shanmugadhasan bases for the given
Poincaré stratum of the presymplectic constraint manifold (every stratum requires an independent
canonical reduction); till now only the main stratum with $P^2$ timelike and $W^2 \neq 0$ has been
investigated] are:

a) The system of N scalar particles with Grassmann electric charges plus the electromagnetic
field [55]. The starting configuration variables are a 3-vector $\vec\eta_i(\tau)$ for each particle [$x^\mu_i(\tau) =
z^\mu(\tau, \vec\eta_i(\tau))$] and the electromagnetic gauge potentials $A_A(\tau,\vec\sigma) = z^\mu_A(\tau,\vec\sigma)\,A_\mu(z(\tau,\vec\sigma))$, which
know the embedding of $\Sigma_\tau$ into $M^4$. One has to choose the sign of the energy of each particle,
because there are no mass-shell constraints (like $p_i^2 - m_i^2 \approx 0$) among the constraints of this
formulation, due to the fact that one has only three degrees of freedom per particle, determining
the intersection of a timelike trajectory and of the spacelike hypersurface $\Sigma_\tau$. For each choice of
the sign of the energy of the N particles, one describes only one of the $2^N$ branches of the mass
spectrum of the manifestly covariant approach based on the coordinates $x^\mu_i(\tau)$, $p^\mu_i(\tau)$, i=1,..,N,
and on the constraints $p_i^2 - m_i^2 \approx 0$ (in the free case). In this way, one gets a description of
relativistic particles with a given sign of the energy with consistent couplings to fields and valid
independently from the quantum effect of pair production [in the manifestly covariant approach,
containing all possible branches of the particle mass spectrum, the classical counterpart of pair
production is the intersection of different branches deformed by the presence of interactions]. The final
Dirac’s observables are: i) the transverse radiation field variables $\vec A_\perp$, $\vec E_\perp$; ii) the particle
canonical variables $\vec\eta_i(\tau)$, $\vec\kappa_i(\tau)$, dressed with a Coulomb cloud. The physical Hamiltonian contains the
mutual instantaneous Coulomb potentials extracted from field theory and there is a regularization
of the Coulomb self-energies due to the Grassmann character of the electric charges $Q_i$ [$Q_i^2 = 0$].
In Ref. [60] there is the study of the Liénard-Wiechert potentials and of Abraham-Lorentz-Dirac
equations in this rest-frame Coulomb gauge and also scalar electrodynamics is reformulated in it.
Also the rest-frame 1-time relativistic statistical mechanics has been developed [55].

b) The system of N scalar particles with Grassmann-valued color charges plus the color SU(3)
Yang-Mills field [61]: it gives the pseudoclassical description of the relativistic scalar-quark model,
deduced from the classical QCD Lagrangian and with the color field present. The physical invariant
mass of the system is given in terms of the Dirac observables. From the reduced Hamilton equations
the second order equations of motion both for the reduced transverse color field and the particles
are extracted. Then, one studies the N=2 (meson) case. A special form of the requirement of 
having only color singlets, suited for a field-independent quark model, produces a “pseudoclassical 
asymptotic freedom” and a regularization of the quark self-energy. With these results one can 
covariantize the bosonic part of the standard model given in Ref. [52]. 

c) The system of N spinning particles of definite energy [$(\frac12, 0)$ or $(0, \frac12)$ representation of
SL(2,C)] with Grassmann electric charges plus the electromagnetic field [62] and that of a Grassmann-valued
Dirac field plus the electromagnetic field (the pseudoclassical basis of QED) [63]. In both
cases there are geometrical complications connected with the spacetime description of the path of
electric currents and not only of their spin structure, suggesting a reinterpretation of the supersymmetric
scalar multiplet as a spin fibration with the Dirac field in the fiber and the Klein-Gordon
field in the base; a new canonical decomposition of the Klein-Gordon field into center-of-mass and
relative variables [58, 57] will be helpful to clarify these problems. After their solution and after
having obtained the description of Grassmann-valued chiral fields [this will require the transcription
of the front form of the dynamics in the instant one for the Poincaré strata with $P^2 = 0$] the
rest-frame form of the full standard $SU(3) \times SU(2) \times U(1)$ model can be achieved.

The rest-frame description of the relativistic perfect gas is now under investigation. 

All these new pieces of information will allow, after quantization of this new consistent rel- 
ativistic mechanics without the classical problems connected with pair production, to find the 
asymptotic states of the covariant Tomonaga-Schwinger formulation of quantum field theory on 
spacelike hypersurfaces (to be obtained by quantizing the fields on $\Sigma_\tau$): these states are needed for
the theory of quantum bound states [since Fock states do not constitute a Cauchy problem for the 
field equations, because an in (or out) particle can be in the absolute future of another one due to 
the tensor product nature of these asymptotic states, bound state equations like the Bethe-Salpeter 
one have spurious solutions which are excitations in relative energies, the variables conjugate to 
relative times]. Moreover, it will be possible to include bound states among the asymptotic states. 

As said in Ref. [60, 61], the quantization of these rest-frame models has to overcome two prob- 
lems. On the particle side, the complication is the quantization of the square roots associated with 
the relativistic kinetic energy terms: in the free case this has been done in Ref. [64] [see Refs. [65] for 
the complications induced by the Coulomb potential]. On the field side (all physical Hamiltonians
are nonlocal and, with the exception of the Abelian case, nonpolynomial, but quadratic in the
momenta), the obstacle is the absence (although there is no no-go theorem) of a complete
regularization and renormalization procedure of electrodynamics (to start with) in the Coulomb
gauge: see Ref. [66] (and its bibliography) for the existing results for QED.

However, as shown in Refs. [55, 48], the rest-frame instant form of dynamics automatically
gives a physical ultraviolet cutoff in the spirit of Dirac and Yukawa: it is the Møller radius [67]
$\rho = \sqrt{-W^2}/P^2 = |\vec S|/\sqrt{P^2}$ ($W^2 = -P^2 \vec S^2$ is the Pauli-Lubanski Casimir when $P^2 > 0$), namely the
classical intrinsic radius of the worldtube, around the covariant noncanonical Fokker-Pryce center
of inertia $Y^\mu$, inside which the noncovariance of the canonical center of mass $\tilde x^\mu$ is concentrated.
At the quantum level $\rho$ becomes the Compton wavelength of the isolated system multiplied by its spin
eigenvalue $\sqrt{s(s+1)}$, $\rho \mapsto \hat\rho = \sqrt{s(s+1)}\,\hbar/M = \sqrt{s(s+1)}\,\lambda_M$, with $M = \sqrt{P^2}$ the invariant mass
and $\lambda_M = \hbar/M$ its Compton wavelength. Therefore, the criticism of classical relativistic physics,
based on quantum pair production, concerns the testing of distances where, due to the Lorentz
signature of spacetime, one has intrinsic classical covariance problems: it is impossible to localize
the canonical center of mass $\tilde x^\mu$ adapted to the first class constraints of the system (also named
Pryce center of mass and having the same covariance of the Newton-Wigner position operator) in
a frame independent way.
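
To get a feeling for the size of this cutoff, the quantum expression $\sqrt{s(s+1)}\,\hbar/M$ can be evaluated numerically. The following is only an illustrative sketch: it restores the factor c that the text sets equal to 1, uses CODATA 2018 values for the constants, and takes the electron mass purely as an example; the function names are not from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m / s


def reduced_compton_wavelength(mass_kg):
    """lambda_M = hbar / (M c), i.e. the text's hbar / M with c restored."""
    return HBAR / (mass_kg * C)


def classical_moller_radius(spin, mass_kg):
    """rho = |S| / (M c) for a spin magnitude |S| given in J s."""
    return spin / (mass_kg * C)


def quantum_moller_radius(s, mass_kg):
    """rho = sqrt(s (s + 1)) * lambda_M for spin quantum number s."""
    return math.sqrt(s * (s + 1.0)) * reduced_compton_wavelength(mass_kg)


ELECTRON_MASS = 9.1093837015e-31  # kg, used only as an illustration
rho = quantum_moller_radius(0.5, ELECTRON_MASS)
print(f"lambda_M = {reduced_compton_wavelength(ELECTRON_MASS):.4e} m, rho = {rho:.4e} m")
```

For a spinless system the radius vanishes, consistently with the worldtube being a spin effect.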

Let us remember [55] that $\rho$ is also a remnant in flat Minkowski spacetime of the energy conditions
of general relativity: since the Møller noncanonical, noncovariant center of energy $R^\mu$ has
its noncovariance localized inside the same worldtube with radius $\rho$ (it was discovered in this way)
[67], it turns out that for an extended relativistic system with the material radius smaller than its
intrinsic radius $\rho$ one has: i) its peripheral rotation velocity can exceed the velocity of light; ii) its
classical energy density cannot be positive definite everywhere in every frame.

Now, the really relevant point is that this ultraviolet cutoff determined by $\rho$ exists also in Einstein’s
general relativity (which is not power counting renormalizable) in the case of asymptotically
flat spacetimes, taking into account the Poincaré Casimirs of its asymptotic ADM Poincaré charges
(when supertranslations are eliminated with suitable boundary conditions). The generalization of
the worldtube of radius $\rho$ to asymptotically flat general relativity with matter could also be connected
with the unproved cosmic censorship hypothesis.

Moreover, the extended Heisenberg relations of string theory [68], i.e. $\Delta x = \frac{\hbar}{\Delta p} + \frac{\Delta p}{T_{cs}} =
\frac{\hbar}{\Delta p} + \frac{L_{cs}^2\,\Delta p}{\hbar}$, implying the lower bound $\Delta x > L_{cs} = \sqrt{\hbar/T_{cs}}$ due to the $y + 1/y$ structure, have
a counterpart in the quantization of the Møller radius [55]: if we ask that, also at the quantum
level, one cannot test the inside of the worldtube, we must ask $\Delta x > \rho$, which is the lower bound
implied by the modified uncertainty relation $\Delta x = \frac{\hbar}{\Delta p} + \frac{\rho^2\,\Delta p}{\hbar}$. This could imply that the
center-of-mass canonical noncovariant 3-coordinate $\vec z = \sqrt{P^2}\,(\vec{\tilde x} - \frac{\vec P}{P^o}\,\tilde x^o)$ [55] cannot become a self-adjoint
operator. See Hegerfeldt’s theorems (quoted in Refs. [48, 55]) and his interpretation pointing at
the impossibility of a good localization of relativistic particles (experimentally one determines only
a worldtube in spacetime emerging from the interaction region). Since the eigenfunctions of the
canonical center-of-mass operator are playing the role of the wave function of the universe, one
could also say that the center-of-mass variable has not to be quantized, because it lies on the
classical macroscopic side of the Copenhagen interpretation and, moreover, because, in the spirit
of Mach’s principle that only relative motions can be observed, no one can observe it (it is only
used to define a decoupled “point particle clock”). On the other hand, if one rejects the canonical
noncovariant center of mass in favor of the covariant noncanonical Fokker-Pryce center of inertia
$Y^\mu$, $\{Y^\mu, Y^\nu\} \neq 0$, one could invoke the philosophy of quantum groups to quantize $Y^\mu$ to get some
kind of quantum plane for the center-of-mass description. Let us remark that the quantization of
the square root Hamiltonian done in Ref. [64] is consistent with this problematic.
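
The lower bound attributed above to the $y + 1/y$ structure can be illustrated numerically. The sketch below assumes that the modified uncertainty relation has the dimensionally consistent form $\Delta x = \hbar/\Delta p + \rho^2 \Delta p/\hbar$ and works in units with $\hbar = 1$; both the functional form and the sample radius are assumptions made for illustration.

```python
import math


def modified_uncertainty(delta_p, length, hbar=1.0):
    """Delta x = hbar / Delta p + length^2 * Delta p / hbar (assumed form)."""
    return hbar / delta_p + length ** 2 * delta_p / hbar


def minimal_delta_x(length):
    """Minimum of the relation above over Delta p > 0, attained at hbar / length."""
    return 2.0 * length


rho = 0.37  # arbitrary illustrative radius, in units with hbar = 1
# Scan Delta p over eight decades on a logarithmic grid.
scan = [modified_uncertainty(10.0 ** (k / 100.0), rho) for k in range(-400, 401)]
print(min(scan), minimal_delta_x(rho))
```

With $y = \rho\,\Delta p/\hbar$ the relation reads $\Delta x = \rho\,(y + 1/y)$, so the scanned minimum stays at or above $2\rho$ and in particular above $\rho$, which is the bound required in order not to test the inside of the worldtube.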

In conclusion, the best set of canonical coordinates adapted to the constraints and to the 
geometry of Poincaré orbits in Minkowski spacetime and naturally predisposed to the coupling to
canonical tetrad gravity is emerging for the electromagnetic, weak and strong interactions with 
matter described either by fermion fields or by relativistic particles with a definite sign of the 
energy. 

3 Tetrad Gravity, Physical Hamiltonian Degrees of Freedom 
of the Gravitational Field and the Deparametrization of 
General Relativity. 

Tetrad gravity is the formulation of general relativity natural for the coupling to the fermion
fields of the standard model. However, we need a formulation of it which allows one to solve its
constraints for doing the canonical reduction and to solve the deparametrization problem of general
relativity (how to recover the rest-frame instant form when the Newton constant is put equal to
zero, G=0). Since neither a complete reduction of gravity with an identification of the physical
canonical degrees of freedom of the gravitational field nor a detailed study of its Hamiltonian group
of gauge transformations (whose infinitesimal generators are the first class constraints) has ever
been carried through to the end in an explicit way, a new formulation of tetrad gravity [69, 70, 71, 72] was
developed.

To implement this program we shall restrict ourselves to the simplest class of spacetimes [time-oriented
pseudo-Riemannian or Lorentzian 4-manifold $(M^4, {}^4g)$ with signature $\epsilon\,(+---)$ ($\epsilon = \pm 1$
according to either particle physics or general relativity convention) and with a choice of time
orientation], assumed to be:

i) Globally hyperbolic 4-manifolds, i.e. topologically they are $M^4 = R \times \Sigma$, so to have a
well posed Cauchy problem [with $\Sigma$ the abstract model of Cauchy surface] at least till when no
singularity develops in $M^4$ [see the singularity theorems]. Therefore, these spacetimes admit regular
foliations with orientable, complete, non-intersecting spacelike 3-manifolds $\Sigma_\tau$ [$\tau : M^4 \to R$,
$z^\mu \mapsto \tau(z^\mu)$, is a global timelike future-oriented function labelling the leaves (surfaces of simultaneity)].
In this way, one obtains 3+1 splittings of $M^4$ and the possibility of a Hamiltonian formulation.

ii) Asymptotically flat at spatial infinity, so to have the possibility to define asymptotic Poincaré
charges [73, 74, 75]: they allow the definition of a Møller radius also in general relativity and
are a bridge towards a future soldering with the theory of elementary particles in Minkowski
spacetime defined as irreducible representations of its kinematical, globally implemented Poincaré
group according to Wigner. This excludes Einstein-Wheeler closed universes without boundaries
(no asymptotic Poincaré charges), which were introduced to eliminate boundary conditions at
spatial infinity to make the theory as Machian as possible.

iii) Admitting a spinor (or spin) structure [76] for the coupling to fermion fields. Since we
consider noncompact space- and time-orientable spacetimes, spinors can be defined if and only if
they are “parallelizable” [77], like in our case. This implies that the orthonormal frame principal
SO(3)-bundle over $\Sigma_\tau$ (whose connections are the spin connections determined by the cotriads) is
trivial.





iv) The noncompact parallelizable simultaneity 3-manifolds (the Cauchy surfaces) $\Sigma_\tau$ are assumed
to be topologically trivial, geodesically complete and, finally, diffeomorphic to $R^3$. These
3-manifolds have the same manifold structure as Euclidean spaces: a) the geodesic exponential
map $Exp_p : T_p\Sigma_\tau \to \Sigma_\tau$ is a diffeomorphism; b) the sectional curvature is less than or equal to zero
everywhere; c) they have no “conjugate locus” [i.e. there are no pairs of conjugate Jacobi points
(intersection points of distinct geodesics through them) on any geodesic] and no “cut locus” [i.e.
no closed geodesics through any point].

v) Like in the Yang-Mills case [48], the 3-spin-connection on the orthogonal frame SO(3)-bundle
(and therefore cotriads) will have to be restricted to suited weighted Sobolev spaces to avoid Gribov
ambiguities [48, 78]. In turn, this implies the absence of isometries of the noncompact Riemannian
3-manifold $(\Sigma_\tau, {}^3g)$ [see for instance the review paper in Ref. [79]].

Diffeomorphisms on $\Sigma_\tau$ ($Diff\,\Sigma_\tau$) are interpreted in the passive way, following Ref. [80], in
accord with the Hamiltonian point of view that infinitesimal diffeomorphisms are generated by 
taking the Poisson bracket with the 1st class supermomentum constraints [passive diffeomorphisms 
are also named ‘pseudodiffeomorphisms’]. 

The new formulation of tetrad gravity [see Refs. [81] for the existing versions of the theory]
utilizes the ADM action of metric gravity with the 4-metric expressed in terms of arbitrary cotetrads.
Let us remark that both in the ADM metric and tetrad formulation one has to introduce
the extra ingredient of the 3+1 splittings of $M^4$ with foliations whose leaves $\Sigma_\tau$ are spacelike
3-hypersurfaces. However, their points $z^\mu(\tau,\vec\sigma)$ [$(\tau,\vec\sigma)$ are $\Sigma_\tau$-adapted holonomic coordinates of
$M^4$] are not configurational variables of these theories in contrast to what happens in Minkowski
parametrized theories as already said [$z^\mu_A = \partial z^\mu/\partial\sigma^A$ are not tetrads when $M^4$ is not Minkowski spacetime
with Cartesian coordinates, because ${}^4g_{AB} = z^\mu_A\,{}^4g_{\mu\nu}\,z^\nu_B \neq {}^4\eta_{AB}$].

By using $\Sigma_\tau$-adapted holonomic coordinates for $M^4$, one has found a new
parametrization of arbitrary tetrads and cotetrads on $M^4$ in terms of cotriads on $\Sigma_\tau$ [${}^3e_{(a)r}(\tau,\vec\sigma)$],
of lapse [$N(\tau,\vec\sigma)$] and shift [$N_{(a)}(\tau,\vec\sigma) = \{{}^3e_{(a)r} N^r\}(\tau,\vec\sigma)$] functions and of 3 parameters [$\varphi_{(a)}(\tau,\vec\sigma)$]
parametrizing point-dependent Wigner boosts for timelike Poincaré orbits. Putting these variables
in the ADM action for metric gravity [73] (with the 3-metric on $\Sigma_\tau$ expressed in terms of cotriads:
${}^3g_{rs} = {}^3e_{(a)r}\,{}^3e_{(a)s}$ with positive signature), one gets a new action depending only on lapse, shifts
and cotriads, but not on the boost parameters (therefore, there is no need to use Schwinger’s
time gauge). There are 10 primary and 4 secondary first class constraints and a weakly vanishing
canonical Hamiltonian containing the secondary constraints like in ADM metric gravity [73].
Besides the 3 constraints associated with the vanishing Lorentz boost momenta (Abelianization of
boosts), there are 4 constraints saying that the momenta associated with lapse and shifts vanish,
3 constraints describing rotations, 3 constraints generating space-diffeomorphisms on the cotriads
induced by those ($Diff\,\Sigma_\tau$) on $\Sigma_\tau$ (a linear combination of supermomentum constraints and of the
rotation ones; a different combination of these constraints generates SO(3) Gauss law constraints
for the momenta conjugated to cotriads with the covariant derivative built with the spin connection)
and one superhamiltonian constraint. The six constraints connected with Lorentz boosts
and rotations replace the constraints satisfying the Lorentz algebra in the older formulations. The
boost parameters $\varphi_{(a)}(\tau,\vec\sigma)$ and the three angles $\alpha_{(a)}(\tau,\vec\sigma)$ hidden in the cotriads are the extra variables
of tetrad gravity with respect to metric gravity: they allow a Hamiltonian description of the
congruences of timelike accelerated observers used in the formulation of gravitomagnetism [82, 83].
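
The statement that three rotation angles are hidden in the cotriads can be made concrete: the 3-metric built as a sum over the internal index (a) does not change when that index is rotated. The following minimal sketch checks this at a single point with plain Python lists; the sample cotriad, the rotation angle and the function names are arbitrary illustrative choices.

```python
import math


def three_metric(cotriad):
    """g_rs = sum_a e_(a)r e_(a)s, with cotriad[a][r] = e_(a)r."""
    return [[sum(cotriad[a][r] * cotriad[a][s] for a in range(3))
             for s in range(3)] for r in range(3)]


def rotate_triad_index(rotation, cotriad):
    """Rotate the internal index: e'_(a)r = R_(a)(b) e_(b)r."""
    return [[sum(rotation[a][b] * cotriad[b][r] for b in range(3))
             for r in range(3)] for a in range(3)]


def rotation_about_z(angle):
    """SO(3) rotation by `angle` around the third internal axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Arbitrary sample cotriad at one point of the hypersurface (illustrative values).
cotriad = [[1.2, 0.1, 0.0], [0.3, 0.9, -0.2], [0.0, 0.4, 1.5]]
rotated = rotate_triad_index(rotation_about_z(0.7), cotriad)
print(three_metric(cotriad))
print(three_metric(rotated))
```

Both printed matrices agree up to rounding, so the 9 cotriad components carry the 6 components of the symmetric 3-metric plus 3 rotation angles that the metric cannot see.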

It turns out that with the technology developed for Yang-Mills theory, one can Abelianize the 3
rotation constraints and then also the space-diffeomorphism constraints so that we can arrive at a
total of 13 Abelianized first class constraints. In the Abelianization of the rotation constraints one
needs the Green function of the 3-dimensional covariant derivative containing the spin connection,
well defined only if there is no Gribov ambiguity in the SO(3)-frame bundle and no isometry of
the Riemannian 3-manifold $(\Sigma_\tau, {}^3g)$. The Green function is similar to the Yang-Mills one for a
principal SO(3)-bundle [48], but, instead of the Dirac distribution for the Green function of the flat
divergence, it contains the Synge-DeWitt bitensor [84] defining the tangent in one endpoint of the
geodesic arc connecting two points (which reduces to the Dirac distribution only locally in normal
coordinates). Moreover, the definition of the Green function now requires the geodesic exponential
map. 

In the resulting quasi-Shanmugadhasan canonical basis, the original cotriad can be expressed
in closed form in terms of 3 rotation angles, 3 diffeomorphism-parameters and a reduced cotriad
depending only on 3 independent variables (they are Dirac’s observables with respect to 13 of the
14 first class constraints) and with their conjugate momenta, still subject to the reduced form of
the superhamiltonian constraint: this is the phase space over the superspace of 3-geometries [85].
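
The bookkeeping behind these numbers can be written down explicitly. The sketch below only restates the counts given in the text; the assignment of each group of constraints to "primary" or "secondary" is an assumption made here so that the totals 10 and 4 are reproduced, and the dictionary and function names are hypothetical.

```python
# Configuration variables of the parametrization described in the text.
CONFIGURATION_VARIABLES = {
    "lapse": 1,
    "shifts": 3,
    "boost parameters": 3,
    "cotriad components": 9,
}

# First class constraints: name -> (count, kind). The primary/secondary split
# is an assumption chosen to match the totals quoted in the text.
FIRST_CLASS_CONSTRAINTS = {
    "vanishing boost momenta": (3, "primary"),
    "vanishing lapse and shift momenta": (4, "primary"),
    "rotations": (3, "primary"),
    "space diffeomorphisms": (3, "secondary"),
    "superhamiltonian": (1, "secondary"),
}


def count_constraints(kind=None):
    """Number of first class constraints, optionally restricted to one kind."""
    return sum(n for n, k in FIRST_CLASS_CONSTRAINTS.values() if kind in (None, k))


def physical_canonical_pairs():
    """Each first class constraint removes one pair of canonical variables."""
    return sum(CONFIGURATION_VARIABLES.values()) - count_constraints()


print(count_constraints("primary"), count_constraints("secondary"), physical_canonical_pairs())
```

The count leaves two canonical pairs for the gravitational field, and the 9 cotriad components split as 3 rotation angles, 3 diffeomorphism parameters and 3 variables of the reduced cotriad, one of which is still constrained by the superhamiltonian constraint.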

Till now no coordinate condition [86] has been imposed. It turns out that these conditions are
hidden in the choice of how to parametrize the reduced cotriads in terms of three independent
functions. The simplest parametrization (the only one studied till now) corresponds to choosing
a system of global 3-orthogonal coordinates on $\Sigma_\tau$, in which the 3-metric is diagonal. With a
further canonical transformation on the reduced cotriads and conjugate momenta, one arrives
at a canonical basis containing the conformal factor $\phi(\tau,\vec\sigma) = e^{q(\tau,\vec\sigma)/2}$ of the 3-geometry and its
conjugate momentum $\rho(\tau,\vec\sigma)$ plus two other pairs of conjugate canonical variables $r_{\bar a}(\tau,\vec\sigma)$, $\pi_{\bar a}(\tau,\vec\sigma)$,
$\bar a = 1,2$. The reduced superhamiltonian constraint, expressed in terms of these variables, turns
out to be an integro-differential equation for the conformal factor (reduced Lichnerowicz equation)
whose conjugate momentum is, therefore, the last gauge variable. If we replace the gauge fixing
of the Lichnerowicz [87] and York [88, 89, 83] approach [namely the vanishing of the trace of the
extrinsic curvature of $\Sigma_\tau$, ${}^3K(\tau,\vec\sigma) \approx 0$, also named the internal extrinsic York time [90]] with
the natural one $\rho(\tau,\vec\sigma) \approx 0$ and we go to Dirac brackets, we find that $r_{\bar a}(\tau,\vec\sigma)$, $\pi_{\bar a}(\tau,\vec\sigma)$ are the
canonical basis for the physical degrees of freedom or Dirac’s observables of the gravitational field
in the 3-orthogonal gauges. Let us remark that the functional form of the non-tensorial objects
$r_{\bar a}$, $\pi_{\bar a}$ depends on the chosen coordinate condition.

The next step is to find the physical Hamiltonian for them and to solve the deparametrization
problem. If we wish to arrive at the soldering of tetrad gravity with matter and parametrized
Minkowski formulation for the same matter, we must require that the lapse and shift functions
of tetrad gravity [which must grow linearly in $\vec\sigma$, in suitable asymptotic Minkowski coordinates,
according to the existing literature on asymptotic Poincaré charges at spatial infinity [74]]
agree asymptotically with the flat lapse and shift functions, which, however, are unambiguously
defined only on Minkowski spacelike hyperplanes as we have seen.

In metric ADM gravity the canonical Hamiltonian is $H_{(c)ADM} = \int d^3\sigma\,[N \tilde{\mathcal H} + N_r\,{}^3\tilde{\mathcal H}^r](\tau,\vec\sigma) \approx 0$,
where $\tilde{\mathcal H}(\tau,\vec\sigma) \approx 0$ and ${}^3\tilde{\mathcal H}^r(\tau,\vec\sigma) \approx 0$ are the superhamiltonian and supermomentum constraints.
It is differentiable and finite only for suitable $N(\tau,\vec\sigma) = n(\tau,\vec\sigma) \to 0$, $N_r(\tau,\vec\sigma) =
n_r(\tau,\vec\sigma) \to 0$ (for $|\vec\sigma| \to \infty$) defined by Beig and Ó Murchadha [74] in suitable asymptotic coordinate systems.
For more general lapse and shift functions one must add a surface term [85] to $H_{(c)ADM}$, which
contains the “strong” Poincaré charges [73] $P^A_{ADM}$, $J^{AB}_{ADM}$ [they are conserved and gauge invariant
surface integrals]. To have well defined asymptotic Poincaré charges at spatial infinity [73, 74] one
needs: i) the selection of a class of coordinate systems for $\Sigma_\tau$ asymptotic to flat coordinates; ii)
the choice of a class of Hamiltonian boundary conditions for the fields in these coordinate systems
[all the fields must belong to some functional space of the type of the weighted Sobolev spaces];
iii) a definition of the Hamiltonian group $\mathcal G$ of gauge transformations (and in particular of proper
gauge transformations) with a well defined limit at spatial infinity so to respect i) and ii). The
scheme is the same needed to define the non-Abelian charges in Yang-Mills theory [48]. The delicate
point is to be able to exclude supertranslations [76], because the presence of these extra asymptotic
charges leads to the replacement of the asymptotic Poincaré group with the infinite-dimensional spi
group [75] of asymptotic symmetries, which does not allow the definition of the Poincaré spin due
to the absence of the Pauli-Lubanski Casimir. This can be done with suitable boundary conditions
(in particular all the fields and gauge transformations must have direction independent limits at
spatial infinity) respecting the “parity conditions” of Beig and Ó Murchadha [74].

Let us then remark that in Ref. [91] and in the book in Ref. [1] (see also Ref. [74]), Dirac introduced
asymptotic Minkowski rectangular coordinates
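$z^{(\mu)}_{(\infty)}(\tau,\vec\sigma) = x^{(\mu)}_{(\infty)}(\tau) + b^{(\mu)}_{(\infty)r}(\tau)\,\sigma^r$
in $M^4$ at spatial infinity $S_\infty = \cup_\tau S^2_{\tau,\infty}$. For each value of $\tau$, the coordinates $x^{(\mu)}_{(\infty)}(\tau)$ label a
point, near spatial infinity, chosen as origin of $\Sigma_\tau$. On it there is a flat tetrad $b^{(\mu)}_{(\infty)A}(\tau) =
(l^{(\mu)}_{(\infty)} = b^{(\mu)}_{(\infty)\tau} = \epsilon^{(\mu)}{}_{(\alpha)(\beta)(\gamma)}\,b^{(\alpha)}_{(\infty)1}(\tau)\,b^{(\beta)}_{(\infty)2}(\tau)\,b^{(\gamma)}_{(\infty)3}(\tau);\ b^{(\mu)}_{(\infty)r}(\tau))$, with $l^{(\mu)}_{(\infty)}$
$\tau$-independent, satisfying $b^{(\mu)}_{(\infty)A}\,{}^4\eta_{(\mu)(\nu)}\,b^{(\nu)}_{(\infty)B} = {}^4\eta_{AB}$ for every $\tau$ [at this
level we do not assume that $l^{(\mu)}_{(\infty)}$ is tangent to $S_\infty$, as the normal $l^\mu$ to $\Sigma_\tau$]. There will be
transformation coefficients $b^\mu_A(\tau,\vec\sigma)$ from the holonomic adapted coordinates $\sigma^A = (\tau, \sigma^r)$ to
coordinates $x^\mu = z^\mu(\sigma^A)$ in an atlas of $M^4$, such that in a chart at spatial infinity one has
$z^\mu(\tau,\vec\sigma) = \delta^\mu_{(\mu)}\,z^{(\mu)}_{(\infty)}(\tau,\vec\sigma)$ and $b^\mu_A(\tau,\vec\sigma) = \delta^\mu_{(\mu)}\,b^{(\mu)}_{(\infty)A}(\tau)$ [for $r \to \infty$ one has
${}^4g_{\mu\nu} \to \delta^{(\mu)}_\mu \delta^{(\nu)}_\nu\,{}^4\eta_{(\mu)(\nu)}$ and ${}^4g_{AB} = b^\mu_A\,{}^4g_{\mu\nu}\,b^\nu_B \to b^{(\mu)}_{(\infty)A}\,{}^4\eta_{(\mu)(\nu)}\,b^{(\nu)}_{(\infty)B} = {}^4\eta_{AB}$].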

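Dirac [91] and, then, Regge and Teitelboim [74] proposed that the asymptotic Minkowski rectangular
coordinates $z^{(\mu)}_{(\infty)}(\tau,\vec\sigma) = x^{(\mu)}_{(\infty)}(\tau) + b^{(\mu)}_{(\infty)r}(\tau)\,\sigma^r$ should define 10 new independent degrees
of freedom at the spatial boundary $S_\infty$, as it happens for Minkowski parametrized theories [55]
when restricted to spacelike hyperplanes [defined by $z^{(\mu)}(\tau,\vec\sigma) \approx x^{(\mu)}_s(\tau) + b^{(\mu)}_r(\tau)\,\sigma^r$]; then, 10
conjugate momenta should exist. These 20 extra variables of the Dirac proposal can be put
in the form: $x^{(\mu)}_{(\infty)}(\tau)$, $p^{(\mu)}_{(\infty)}$, $b^{(\mu)}_{(\infty)A}(\tau)$ [with $b^{(\mu)}_{(\infty)\tau} = l^{(\mu)}_{(\infty)}$ $\tau$-independent], $S^{(\mu)(\nu)}_{(\infty)}$, with Dirac
brackets implying the orthonormality constraints $b^{(\mu)}_{(\infty)A}\,{}^4\eta_{(\mu)(\nu)}\,b^{(\nu)}_{(\infty)B} = {}^4\eta_{AB}$ [so that $p^{(\mu)}_{(\infty)}$ and
$J^{(\mu)(\nu)}_{(\infty)} = x^{(\mu)}_{(\infty)} p^{(\nu)}_{(\infty)} - x^{(\nu)}_{(\infty)} p^{(\mu)}_{(\infty)} + S^{(\mu)(\nu)}_{(\infty)}$ satisfy a Poincaré algebra]. In analogy with Minkowski
parametrized theories restricted to spacelike hyperplanes, one expects to have 10 extra first class
constraints of the type

$p^{(\mu)}_{(\infty)} - P^{(\mu)}_{ADM} \approx 0$, $S^{(\mu)(\nu)}_{(\infty)} - S^{(\mu)(\nu)}_{ADM} \approx 0$,

with $P^{(\mu)}_{ADM}$, $S^{(\mu)(\nu)}_{ADM}$ related to the ADM Poincaré charges $P^A_{ADM}$, $J^{AB}_{ADM}$. The origin $x^{(\mu)}_{(\infty)}$ is
going to play the role of a decoupled observer with his parametrized clock.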

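Let us remark that if we replace $p^{(\mu)}_{(\infty)}$ and $S^{(\mu)(\nu)}_{(\infty)}$, whose Poisson algebra is the direct sum of an
Abelian algebra of translations and of a Lorentz algebra, with the new variables (with holonomic indices
with respect to $\Sigma_\tau$) $p^A_{(\infty)} = b^A_{(\infty)(\mu)}\,p^{(\mu)}_{(\infty)}$, $J^{AB}_{(\infty)} = b^A_{(\infty)(\mu)}\,b^B_{(\infty)(\nu)}\,S^{(\mu)(\nu)}_{(\infty)}$,
the Poisson brackets for $p^{(\mu)}_{(\infty)}$, $b^{(\mu)}_{(\infty)A}$, $S^{(\mu)(\nu)}_{(\infty)}$ imply that $p^A_{(\infty)}$, $J^{AB}_{(\infty)}$ satisfy a Poincaré algebra.
This implies that the Poincaré generators $P^A_{ADM}$, $J^{AB}_{ADM}$ define in the asymptotic Dirac rectangular
coordinates a momentum $P^{(\mu)}_{ADM}$ and only an ADM spin tensor $S^{(\mu)(\nu)}_{ADM}$ [to define an
angular momentum tensor $J^{(\mu)(\nu)}_{ADM}$ one should find a “center of mass of the gravitational field” $X^{(\mu)}_{ADM}$
(see Ref. [58] for the Klein-Gordon case) conjugate to $P^{(\mu)}_{ADM}$, so that $J^{(\mu)(\nu)}_{ADM} =
X^{(\mu)}_{ADM} P^{(\nu)}_{ADM} - X^{(\nu)}_{ADM} P^{(\mu)}_{ADM} + S^{(\mu)(\nu)}_{ADM}$].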

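The following splitting of the lapse and shift functions and the following set of boundary conditions
fulfill all the previous requirements [soldering with the lapse and shift functions on Minkowski
hyperplanes; absence of supertranslations (strictly speaking one gets $P^r_{ADM} \approx 0$ due to the parity
conditions); $r = |\vec\sigma|$]

${}^3g_{rs}(\tau,\vec\sigma) \to_{r \to \infty} (1 + \frac{M}{r})\,\delta_{rs} + {}^3h_{rs}(\tau,\vec\sigma) = (1 + \frac{M}{r})\,\delta_{rs} + o_4(r^{-3/2})$,
${}^3\tilde\Pi^{rs}(\tau,\vec\sigma) \to_{r \to \infty} {}^3k^{rs}(\tau,\vec\sigma) = o_3(r^{-5/2})$,
$N(\tau,\vec\sigma) = N_{(as)}(\tau,\vec\sigma) + n(\tau,\vec\sigma)$, $n(\tau,\vec\sigma) = O(r^{-(2+\epsilon)})$,
$N_r(\tau,\vec\sigma) = N_{(as)r}(\tau,\vec\sigma) + n_r(\tau,\vec\sigma)$, $n_r(\tau,\vec\sigma) = O(r^{-\epsilon})$,
$\Rightarrow {}^3e_{(a)r}(\tau,\vec\sigma) = (1 + \frac{M}{2r})\,\delta_{(a)r} + o_4(r^{-3/2})$,

with ${}^3h_{rs}(\tau,-\vec\sigma) = {}^3h_{rs}(\tau,\vec\sigma)$, ${}^3k^{rs}(\tau,-\vec\sigma) = -{}^3k^{rs}(\tau,\vec\sigma)$; here ${}^3\tilde\Pi^{rs}(\tau,\vec\sigma)$ is the momentum
conjugate to the 3-metric ${}^3g_{rs}(\tau,\vec\sigma)$ in ADM metric gravity.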

These boundary conditions identify the class of spacetimes of Christodoulou and Klainerman [92]
(they are near to Minkowski spacetime in a norm sense, contain gravitational radiation but evade
the singularity theorems, because they do not satisfy the hypothesis of conformal completion to
get the possibility to put control on the large time development of the solutions of Einstein’s equations).
These spacetimes also satisfy the rest-frame condition $P^r_{ADM} = 0$ (this requires $\tilde\lambda_{A\tau}(\tau) = 0$
like for Wigner hyperplanes in parametrized Minkowski theories) and have vanishing shift functions
(but non trivial lapse function).

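After the addition of the surface term, the resulting canonical and Dirac Hamiltonians of ADM
metric gravity are

$H_{(c)ADM} = \int d^3\sigma\,[(N_{(as)} + n)\,\tilde{\mathcal H} + (N_{(as)r} + n_r)\,{}^3\tilde{\mathcal H}^r](\tau,\vec\sigma) \mapsto$
$\hat H_{(c)ADM} = \int d^3\sigma\,[(N_{(as)} + n)\,\tilde{\mathcal H} + (N_{(as)r} + n_r)\,{}^3\tilde{\mathcal H}^r](\tau,\vec\sigma)$
$+ \tilde\lambda_A(\tau)\,P^A_{ADM} + \frac12 \tilde\lambda_{AB}(\tau)\,J^{AB}_{ADM} =$
$= \int d^3\sigma\,[n\,\tilde{\mathcal H} + n_r\,{}^3\tilde{\mathcal H}^r](\tau,\vec\sigma) + \tilde\lambda_A(\tau)\,\hat P^A_{ADM} + \frac12 \tilde\lambda_{AB}(\tau)\,\hat J^{AB}_{ADM}$
$\approx \tilde\lambda_A(\tau)\,\hat P^A_{ADM} + \frac12 \tilde\lambda_{AB}(\tau)\,\hat J^{AB}_{ADM}$,

with the “weak conserved improper charges” $\hat P^A_{ADM}$, $\hat J^{AB}_{ADM}$ [they are volume integrals differing
from the strong charges by terms proportional to integrals of the constraints]. The previous splitting
implies replacing the variables $N(\tau,\vec\sigma)$, $N_r(\tau,\vec\sigma)$ with the ones $\tilde\lambda_A(\tau)$, $\tilde\lambda_{AB}(\tau) = -\tilde\lambda_{BA}(\tau)$,
$n(\tau,\vec\sigma)$, $n_r(\tau,\vec\sigma)$ [with conjugate momenta $\tilde\pi^A(\tau)$, $\tilde\pi^{AB}(\tau)$, $\tilde\pi^n(\tau,\vec\sigma)$, $\tilde\pi^r_{\vec n}(\tau,\vec\sigma)$] in the
ADM theory.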

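With these assumptions one has the following form of the line element (also its form in tetrad
gravity is given)

$ds^2 = \epsilon\,\big([N_{(as)} + n]^2 - [N_{(as)r} + n_r]\,{}^3e^r_{(a)}\,{}^3e^s_{(a)}\,[N_{(as)s} + n_s]\big)(d\tau)^2$
$- 2\epsilon\,[N_{(as)r} + n_r]\,d\tau\,d\sigma^r - \epsilon\,{}^3e_{(a)r}\,{}^3e_{(a)s}\,d\sigma^r d\sigma^s$.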
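
As a consistency check on this 3+1 form of the line element, one can assemble the 4-metric at a single point from a total lapse, a total shift and a cotriad, and verify the standard ADM relation $\det {}^4g = -N^2 \det {}^3g$. The sketch below is illustrative only: `lapse` stands for $N_{(as)} + n$, the shift is supplied with an upper index for convenience, and all names and sample values are assumptions, not taken from the text.

```python
def three_metric(cotriad):
    """g_rs = sum_a e_(a)r e_(a)s, with cotriad[a][r] = e_(a)r."""
    return [[sum(cotriad[a][r] * cotriad[a][s] for a in range(3))
             for s in range(3)] for r in range(3)]


def determinant(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(minor)
    return total


def four_metric(lapse, shift_up, cotriad, eps=1.0):
    """4-metric in adapted coordinates (tau, sigma^r) with signature eps (+ - - -)."""
    g3 = three_metric(cotriad)
    shift_down = [sum(g3[r][s] * shift_up[s] for s in range(3)) for r in range(3)]
    shift_sq = sum(shift_down[r] * shift_up[r] for r in range(3))
    g4 = [[0.0] * 4 for _ in range(4)]
    g4[0][0] = eps * (lapse ** 2 - shift_sq)
    for r in range(3):
        g4[0][r + 1] = g4[r + 1][0] = -eps * shift_down[r]
        for s in range(3):
            g4[r + 1][s + 1] = -eps * g3[r][s]
    return g4


# Arbitrary sample values at one point (illustrative only).
cotriad = [[1.2, 0.1, 0.0], [0.3, 0.9, -0.2], [0.0, 0.4, 1.5]]
lapse, shift_up = 1.3, [0.2, -0.1, 0.05]
g4 = four_metric(lapse, shift_up, cotriad)
print(determinant(g4), -lapse ** 2 * determinant(three_metric(cotriad)))
```

The two printed numbers agree up to rounding for either sign convention $\epsilon = \pm 1$, since an overall sign flip of a 4 x 4 matrix does not change its determinant.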

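The final suggestion of Dirac is to modify ADM metric gravity in the following way:

i) add the 10 new primary constraints $p^A_{(\infty)} - \hat P^A_{ADM} \approx 0$, $J^{AB}_{(\infty)} - \hat J^{AB}_{ADM} \approx 0$, where $p^A_{(\infty)} =
b^A_{(\infty)(\mu)}\,p^{(\mu)}_{(\infty)}$, $J^{AB}_{(\infty)} = b^A_{(\infty)(\mu)}\,b^B_{(\infty)(\nu)}\,S^{(\mu)(\nu)}_{(\infty)}$ [remember that $p^A_{(\infty)}$ and $J^{AB}_{(\infty)}$ satisfy a Poincaré
algebra];

ii) consider $\tilde\lambda_A(\tau)$, $\tilde\lambda_{AB}(\tau)$ as Dirac multipliers for these 10 new primary constraints, and not as
configurational (arbitrary gauge) variables coming from the lapse and shift functions [so that there
are no conjugate (vanishing) momenta $\tilde\pi^A(\tau)$, $\tilde\pi^{AB}(\tau)$ and no associated Dirac multipliers $\zeta_A(\tau)$,
$\zeta_{AB}(\tau)$], in the assumed Dirac Hamiltonian [it is finite and differentiable]

$H_{(D)ADM} = \int d^3\sigma\,[n\,\tilde{\mathcal H} + n_r\,{}^3\tilde{\mathcal H}^r + \lambda_n \tilde\pi^n + \lambda^{\vec n}_r \tilde\pi^r_{\vec n}](\tau,\vec\sigma)$
$- \tilde\lambda_A(\tau)\,[p^A_{(\infty)} - \hat P^A_{ADM}] - \frac12 \tilde\lambda_{AB}(\tau)\,[J^{AB}_{(\infty)} - \hat J^{AB}_{ADM}] \approx 0$.

The reduced phase space is still the ADM one: on the ADM variables there are only the secondary
first class constraints $\tilde{\mathcal H}(\tau,\vec\sigma) \approx 0$, ${}^3\tilde{\mathcal H}^r(\tau,\vec\sigma) \approx 0$ [generators of proper gauge transformations],
because the other first class constraints $p^A_{(\infty)} - \hat P^A_{ADM} \approx 0$, $J^{AB}_{(\infty)} - \hat J^{AB}_{ADM} \approx 0$ generate
improper gauge transformations but eliminate 10 of the extra 20 variables.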

In this modified ADM metric gravity, one has restricted the 3+1 splittings of $M^4$ to foliations
whose leaves $\Sigma_\tau$ tend to Minkowski spacelike hyperplanes asymptotically at spatial infinity
in a direction independent way. Therefore, these $\Sigma_\tau$ should be determined by the 10 degrees
of freedom $x^{(\mu)}_{(\infty)}(\tau)$, $b^{(\mu)}_{(\infty)A}(\tau)$, like it happens for flat spacelike hyperplanes: this means
that it must be possible to define a “parallel transport” of the asymptotic tetrads $b^{(\mu)}_{(\infty)A}(\tau)$ to
get well defined tetrads in each point of $\Sigma_\tau$. While it is not yet clear whether this can be
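done for $\tilde\lambda_{AB}(\tau) \neq 0$, there is a solution for $\tilde\lambda_{AB}(\tau) = 0$. This case corresponds to going to the
Wigner-like hypersurfaces [the analogue of the Minkowski Wigner hyperplanes with the asymptotic
normal $l^{(\mu)}_{(\infty)}$ parallel to $\hat P^{(\mu)}_{ADM}$]. Following the same procedure defined for Minkowski
spacetime, one gets $S^{rs}_{(\infty)} \equiv \hat J^{rs}_{ADM}$ [see Ref. [55] for the definition of $\bar S^{AB}_{(\infty)}$], $\tilde\lambda_{AB}(\tau) = 0$ and
$-\tilde\lambda_A(\tau)\,[p^A_{(\infty)} - \hat P^A_{ADM}] = -\tilde\lambda_\tau(\tau)\,[\epsilon_{(\infty)} - \hat P^\tau_{ADM}] + \tilde\lambda_r(\tau)\,\hat P^r_{ADM}$ [$\epsilon_{(\infty)} = \sqrt{p^2_{(\infty)}}$], so that the final form
of these four surviving constraints is ($\hat P^r_{ADM} \approx 0$ implies $\hat P^\tau_{ADM} \approx M_{ADM} = \sqrt{\hat P^2_{ADM}}$; $M_{ADM}$
is the ADM mass of the universe)

$\epsilon_{(\infty)} - \hat P^\tau_{ADM} \approx 0$, $\hat P^r_{ADM} \approx 0$.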


On this subclass of foliations [whose leaves $\Sigma^{(WSW)}_\tau$ will be called Wigner-Sen-Witten hypersurfaces;
they define the intrinsic asymptotic rest frame of the gravitational field] one can introduce a
parallel transport by using the interpretation of Ref. [93] of the Witten spinorial method of demonstrating
the positivity of the ADM energy [94]. Let us consider the Sen-Witten connection [95, 94]
restricted to $\Sigma^{(WSW)}_\tau$ (it depends on the trace of the extrinsic curvature of $\Sigma^{(WSW)}_\tau$) and the
spinorial Sen-Witten equation associated with it. As shown in Ref. [96], this spinorial equation
can be rephrased as an equation whose solution determines (in a surface dependent dynamical
way) a tetrad in each point of $\Sigma^{(WSW)}_\tau$ once it is given at spatial infinity (again this requires a
direction independent limit). Therefore, at spatial infinity there is a privileged congruence of timelike
observers, which replaces the concept of “fixed stars” in the study of the precessional effects
of gravitomagnetism on gyroscopes and whose connection with the definition of post-Newtonian
coordinates has still to be explored.


On the Wigner-Sen-Witten hypersurfaces the spatial indices have become spin-1 Wigner indices
[they transform with Wigner rotations under asymptotic Lorentz transformations]. As said for
parametrized theories in Minkowski spacetime, in this special gauge 3 degrees of freedom of the
gravitational field [an internal 3-center-of-mass variable
$\vec\sigma_{ADM}[{}^3g, {}^3\tilde\Pi]$ inside the Wigner-Sen-Witten hypersurface] become gauge
variables, while $\tilde x^{(\mu)}_{(\infty)}$ [the canonical non covariant variable replacing
$x^{(\mu)}_{(\infty)}$] becomes a decoupled observer with his “point particle clock” [4, 5] near
spatial infinity. Since the positivity theorems for the ADM energy imply that one has only timelike
or lightlike orbits of the asymptotic Poincaré group, the restriction to universes with timelike ADM
4-momentum allows one to define the Møller radius
$\rho_{AMD} = \sqrt{-\hat W^2_{ADM}}/\hat P^2_{ADM}$ in terms of the asymptotic Poincaré Casimirs
$\hat P^2_{ADM}$, $\hat W^2_{ADM}$.
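
As a brief worked illustration of the Møller radius (a sketch, not taken from the source): assuming
units $c = 1$, the metric signature $(+,-,-,-)$ and the standard definition of the Pauli-Lubanski
four-vector, the Casimir $W^2$ reduces in the rest frame to $-P^2\vec S^2$, where $\vec S$ denotes
the rest-frame spin (the symbols $\hat{\vec S}_{ADM}$ and $\hat J_{ADM}$ are introduced here only for
illustration). The radius is then the ratio of spin to invariant mass:

```latex
\begin{aligned}
\hat W^\mu_{ADM} &= \tfrac12\,\epsilon^{\mu\nu\rho\sigma}\,\hat P_{ADM\,\nu}\,\hat J_{ADM\,\rho\sigma},
\qquad
\hat W^2_{ADM} = -\hat P^2_{ADM}\,\hat{\vec S}{}^2_{ADM},\\[4pt]
\rho_{AMD} &= \frac{\sqrt{-\hat W^2_{ADM}}}{\hat P^2_{ADM}}
            = \frac{|\hat{\vec S}_{ADM}|}{\sqrt{\hat P^2_{ADM}}}
            = \frac{|\hat{\vec S}_{ADM}|}{\hat M_{ADM}} .
\end{aligned}
```

The expression is well defined only for timelike $\hat P^\mu_{ADM}$ ($\hat P^2_{ADM} > 0$), which
is why the restriction to universes with timelike ADM 4-momentum is needed.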


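By going from $\tilde x^{(\mu)}_{(\infty)}$ and $p^{(\mu)}_{(\infty)}$ to the canonical basis

$$T_{(\infty)} = \frac{p_{(\infty)(\mu)}\tilde x^{(\mu)}_{(\infty)}}{\epsilon_{(\infty)}} = \frac{p_{(\infty)(\mu)} x^{(\mu)}_{(\infty)}}{\epsilon_{(\infty)}}, \qquad \epsilon_{(\infty)},$$

$$z^{(i)}_{(\infty)} = \epsilon_{(\infty)}\Big(\tilde x^{(i)}_{(\infty)} - \frac{p^{(i)}_{(\infty)}}{p^{(o)}_{(\infty)}}\,\tilde x^{(o)}_{(\infty)}\Big), \qquad k^{(i)}_{(\infty)} = \frac{p^{(i)}_{(\infty)}}{\epsilon_{(\infty)}} = u^{(i)}(p^{(\rho)}_{(\infty)}),$$

like in the flat case one finds that the final reduction requires the gauge-fixings
$T_{(\infty)} - \tau \approx 0$ and $\sigma^r_{ADM} \approx 0$, where $\sigma^r = \sigma^r_{ADM}$ is
a variable representing the “internal center of mass” of the 3-metric of the slice $\Sigma_\tau$ of
the asymptotically flat spacetime $M^4$. Since $\{T_{(\infty)}, \epsilon_{(\infty)}\} = -\epsilon$,
with the gauge fixing $T_{(\infty)} - \tau \approx 0$ one gets
$\tilde\lambda_\tau(\tau) \approx \epsilon$, and the final Dirac Hamiltonian is
$H_D = \hat M_{ADM} + \tilde\lambda_r(\tau)\hat P^r_{ADM}$, with $\hat M_{ADM}$ the natural physical
Hamiltonian to reintroduce an evolution in the “mathematical” $T_{(\infty)} \equiv \tau$: namely in
the rest-frame time identified with the parameter $\tau$ labelling the leaves $\Sigma_\tau$ of the
foliation of $M^4$. Physical times (atomic clocks, ephemeris time...) must be put in a local 1-1
correspondence with this “mathematical” time. This point of view excludes any Wheeler-DeWitt
interpretation of an internal time (like the extrinsic York one or the WKB times), which is used
in closed universes of the Einstein-Wheeler type.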


All this construction holds also in our formulation of tetrad gravity (since it uses the ADM
action) and in its canonically reduced form in the 3-orthogonal gauges. The final physical
Hamiltonian of tetrad gravity for the physical gravitational field is the reduced volume form of the
ADM energy $\hat P^\tau_{ADM}[r_{\bar a}, \pi_{\bar a}, \phi(r_{\bar a}, \pi_{\bar a})]$, with the
conformal factor $\phi$ solution of the reduced Lichnerowicz equation in the 3-orthogonal gauge with
$\rho(\tau,\vec\sigma) \approx 0$. The Hamilton-Dirac equations generated
by this Hamiltonian for $r_{\bar a}$, $\pi_{\bar a}$ generate the pair of second order equations in
normal form for $r_{\bar a}$ hidden in the Einstein equations in this particular gauge.

Let us compare the standard generally covariant formulation of gravity based on the Hilbert
action with its invariance under Diff $M^4$ with the ADM Hamiltonian formulation.

Regarding the 10 Einstein equations of the standard approach, the Bianchi identities imply
that four equations are linearly dependent on the other six ones and their gradients. Moreover,
the four combinations of Einstein’s equations projectable to phase space (where they become the
secondary first class superhamiltonian and supermomentum constraints of canonical metric gravity)
are independent from the accelerations, being restrictions on the Cauchy data. As a consequence
the Einstein equations have solutions in which the ten components of the 4-metric depend on
only two truly dynamical degrees of freedom (defining the physical gravitational field) and on eight
undetermined degrees of freedom. This transition from the ten components of the tensor
${}^4g_{\mu\nu}$ in some atlas of $M^4$ to the 2 (deterministic) + 8 (undetermined) degrees of
freedom breaks general covariance, because these quantities are neither tensors nor invariants under
diffeomorphisms (their functional form is atlas dependent).
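
The counting behind this 2 + 8 split can be made explicit with the usual Dirac-Bergmann bookkeeping
(a sketch; the symbols $N_{conf}$, $N_{fc}$ and $N_{phys}$ are introduced here only for
illustration). In canonical metric gravity the 10 components of the 4-metric and their conjugate
momenta span a 20-dimensional phase space, and the 8 first class constraints (the 4 primary ones,
i.e. the vanishing momenta of the lapse and shift functions, plus the 4 secondary superhamiltonian
and supermomentum constraints) each eliminate one canonical pair:

```latex
\begin{aligned}
\dim(\text{phase space}) &= 2N_{conf} = 2 \times 10 = 20,\\
\dim(\text{reduced phase space}) &= 2N_{conf} - 2N_{fc} = 20 - 2 \times 8 = 4,\\
N_{phys} &= \tfrac12 \times 4 = 2 \quad\Longrightarrow\quad 10 = 2\ (\text{deterministic}) + 8\ (\text{undetermined}).
\end{aligned}
```

The 4 surviving phase-space dimensions are the two canonical pairs of Dirac observables of the
gravitational field.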

Since the Hilbert action is invariant under Diff $M^4$, one usually says that a “dynamical
gravitational field” is a 4-geometry over $M^4$, namely an equivalence class of spacetimes
$(M^4, {}^4g)$, solution of Einstein’s equations, modulo Diff $M^4$. See, however, the
interpretational problems about what is observable in general relativity, for instance in
Refs. [7, 8], in particular the facts that, at least before the restriction to the solutions of
Einstein’s equations, i) scalars under Diff $M^4$ like ${}^4R$ are not Dirac’s observables but gauge
dependent quantities; ii) the functional form of ${}^4g_{\mu\nu}$ in terms of the physical
gravitational field and, therefore, the angle and distance properties of material bodies and the
standard procedures of defining measures of length and time based on the line element $ds^2$ are
gauge dependent.

Instead, in the ADM formalism with the extra notion of 3+1 splittings of $M^4$, the (tetrad)
metric ADM action (differing from the Hilbert one by a surface term) is quasi-invariant under
the (14) 8 types of gauge transformations which are the pull-back of the Hamiltonian group
$\mathcal G$ of gauge transformations, whose generators are the first class constraints of the
theory. The Hamiltonian group $\mathcal G$ has a subgroup (whose generators are the supermomentum and
superhamiltonian constraints) formed by the diffeomorphisms of $M^4$ adapted to its 3+1 splittings,
Diff $M^{3+1}$ [it is different from Diff $M^4$]. Moreover, the Poisson algebra of the supermomentum
and superhamiltonian constraints reflects the embeddability in $M^4$ of the foliation associated
with the 3+1 splitting [97].

Now in tetrad gravity the interpretation of the 14 gauge transformations and of their gauge
fixings (it is independent from the presence of matter) is the following [a tetrad in a point of
$\Sigma_\tau$ is a local observer]:

i) the gauge fixings of the gauge boost parameters associated with the 3 boost constraints and of
the gauge angles associated with the 3 rotation constraints are equivalent to choosing the
congruence of timelike observers to be used as a standard of non rotation;

ii) the gauge fixings of the 3 gauge parameters associated with the passive space diffeomorphisms
[Diff $\Sigma_\tau$; change of coordinate charts] are equivalent to a fixation of 3 standards of
length by means of a choice of a coordinate system on $\Sigma_\tau$ [the measuring apparatus (the
“rods”) should be defined in terms of Dirac’s observables for some kind of matter, after its
introduction into the theory];

iii) according to constraint theory the choice of 3-coordinates on $\Sigma_\tau$ induces the gauge
fixings of the 3 shift functions [i.e. of ${}^4g_{oi}$], whose gauge nature is connected with the
“conventionality of simultaneity” [98] [therefore, the gauge fixings are equivalent to a choice of
synchronization of clocks and, as a consequence, to a statement about the isotropy or anisotropy of
the velocity of light in that gauge];

iv) the gauge fixing on the momentum $\rho(\tau,\vec\sigma)$ conjugate to the conformal factor of
the 3-metric [this gauge variable is the source of the gauge dependence of 4-tensors and of the
scalars under Diff $M^4$, together with the gradients of the lapse and shift functions] is a
nonlocal statement about the extrinsic curvature of the leaves $\Sigma_\tau$ of the given 3+1
splitting of $M^4$; since the superhamiltonian constraint produces normal deformations of
$\Sigma_\tau$ [97] and, therefore, transforms a 3+1 splitting of $M^4$
into another one (the ADM formulation is independent from the choice of the 3+1 splitting), 
this gauge fixing is equivalent to the choice of a particular 3+1 splitting; 

v) the previous gauge fixing induces the gauge fixing of the lapse function (which determines the
packing of the leaves $\Sigma_\tau$ in the chosen 3+1 splitting) and, therefore, is equivalent to
the fixation of a standard of proper time [again “clocks” should be built with the Dirac observables
of some kind of matter].
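
The tally of gauge freedoms in points i)-v) can be checked with a few lines of Python. This is an
illustrative bookkeeping sketch only: the function name `physical_canonical_pairs` and the
dictionary keys are hypothetical labels, and taking the 16 cotetrad components as configuration
variables is an assumption based on the standard tetrad formulation, not something stated in this
passage.

```python
# Illustrative bookkeeping only; all names here are hypothetical labels.

def physical_canonical_pairs(n_config: int, n_first_class: int) -> int:
    """Each first class constraint removes one canonical pair
    (the constraint itself plus its conjugate gauge variable)."""
    pairs = n_config - n_first_class
    if pairs < 0:
        raise ValueError("more first class constraints than configuration variables")
    return pairs


# Gauge freedoms of tetrad gravity, as listed in points i)-v).
TETRAD_GAUGE_FREEDOMS = {
    "boosts": 3,                                  # point i)
    "rotations": 3,                               # point i)
    "space diffeomorphisms": 3,                   # point ii)
    "shift functions": 3,                         # point iii)
    "superhamiltonian (normal deformations)": 1,  # point iv)
    "lapse function": 1,                          # point v)
}

n_tetrad_constraints = sum(TETRAD_GAUGE_FREEDOMS.values())

# Assumption: 16 cotetrad components for tetrad gravity, 10 metric components for metric gravity.
tetrad_pairs = physical_canonical_pairs(16, n_tetrad_constraints)
metric_pairs = physical_canonical_pairs(10, 8)

print(n_tetrad_constraints, tetrad_pairs, metric_pairs)
```

Under these assumptions both formulations are left with the same number of canonical pairs of Dirac
observables for the gravitational field, which is why the 14 gauge fixings of tetrad gravity and the
8 of metric gravity describe the same physical degrees of freedom.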

In the Hamiltonian formalism it is natural to define a “Hamiltonian kinematical gravitational
field” as the equivalence class of spacetimes modulo the Hamiltonian group $\mathcal G$, and
different members of the equivalence class have in general different 4-Riemann tensors [these
equivalence classes are connected with the conformal 3-geometries of the Lichnerowicz-York approach
and contain different gauge-related 4-geometries]. Then, a “Hamiltonian dynamical gravitational
field” is defined as a Hamiltonian kinematical gravitational field which is a solution of the
Hamilton-Dirac equations generated by the weak ADM energy $\hat P^\tau_{ADM}$. Since the Hilbert and
ADM actions, even if they have different local symmetries and invariances, both generate the same
Einstein equations, the equivalence classes of the “Hamiltonian dynamical gravitational fields” and
of the standard “dynamical gravitational fields” (a 4-geometry solution of Einstein’s equations)
coincide. Indeed, on the solutions of Einstein’s equations the gauge transformations generated by
the superhamiltonian constraint (normal deformations of $\Sigma_\tau$) and those generated by the
canonical momenta of the lapse and shift functions, together with the $\Sigma_\tau$ diffeomorphisms
generated by the supermomentum constraints, are restricted by the Jacobi equations associated to
Einstein’s equations to be those Noether symmetries of the ADM action which are also dynamical
symmetries of the Hamilton equations, and therefore they are a subset of the spacetime
diffeomorphisms Diff $M^4$ (all of which are dynamical symmetries of Einstein’s equations).

The 3-orthogonal gauges of tetrad gravity are the equivalent of the Coulomb gauge in classical
electrodynamics (like the harmonic gauge is the equivalent of the Lorentz gauge). Only after a
complete gauge fixing do the 4-tensors and the scalars under Diff $M^4$ become measurable quantities
(like the electromagnetic vector potential in the Coulomb gauge): an experimental laboratory does
correspond by definition to a completely fixed gauge. At this stage the proposal of Komar [99] and
Bergmann [80] becomes acceptable: identifying the points of a spacetime $(M^4, {}^4g)$, solution of
Einstein’s equations in absence of matter, in a way invariant under spacetime diffeomorphisms, by
using four bilinears and trilinears in the Weyl tensors, scalar under Diff $M^4$ and called
“individuating fields” (see also Refs. [7, 8]), which do not depend on the lapse and shift functions
(but only on the gauge variables corresponding to the 3-coordinates on $\Sigma_\tau$ and to the
momentum conjugate to the conformal factor of the 3-metric, so that these fields carry the
information on the choice of the 3-coordinates and of a generalized extrinsic time), to build
“physical 4-coordinates” (in each completely fixed gauge they depend only on the two canonical pairs
of Dirac’s observables of the gravitational field), justifying a posteriori the standard measurement
theory presented in all textbooks on general relativity, which presupposes the individuation of
spacetime points.

Our approach breaks the general covariance of general relativity completely by going to the
special 3-orthogonal gauges. But this is done in a way naturally associated with theories with
first class constraints: the global Shanmugadhasan canonical transformations (when they exist)
correspond to privileged Darboux charts for presymplectic manifolds defined by the first class
constraints. Therefore, the gauges identified by these canonical transformations should have a
special (till now unexplored) role also in generally covariant theories, in which traditionally
one looks for observables invariant under diffeomorphisms rather than for non generally covariant
Dirac observables.

Let us remember that Bergmann [80] made the following critique of general covariance: it would
be desirable to restrict the group of coordinate transformations (spacetime diffeomorphisms) in
such a way that it could contain an invariant subgroup describing the coordinate transformations
that change the frame of reference of an outside observer (these transformations could be
called Lorentz transformations; see also the comments in Ref. [100] on the asymptotic behaviour
of coordinate transformations); the remaining coordinate transformations would be like the gauge
transformations of electromagnetism. This is what we have done. In this way “preferred”
coordinate systems will emerge (the WSW hypersurfaces with their preferred congruences of timelike
observers whose 4-velocity becomes asymptotically normal to $\Sigma_\tau$ at spatial infinity),
which, as said by Bergmann, are not “flat”: while the inertial coordinates are determined
experimentally by the observation of trajectories of force-free bodies, these intrinsic coordinates
can be determined only by much more elaborate experiments (probably with gyroscopes), since they
depend, at least, on the inhomogeneities of the ambient gravitational fields. See also Ref. [101]
for other criticisms of general covariance: very often to get physical results one uses preferred
coordinates not merely for calculational convenience, but also for understanding (this fact has been
formalized as the “principle of restricted covariance”).

Since in the 3-orthogonal gauges we have the physical canonical basis $r_{\bar a}$,
$\pi_{\bar a}$, it is possible, but only in absence of matter, to define “void spacetimes” as the
equivalence class of spacetimes “without gravitational field”, whose members in the 3-orthogonal
gauges are obtained by adding by hand the second class constraints
$r_{\bar a}(\tau,\vec\sigma) \approx 0$, $\pi_{\bar a}(\tau,\vec\sigma) \approx 0$ [one gets
$\phi(\tau,\vec\sigma) = 1$ as the relevant solution of the reduced Lichnerowicz equation] and, in
particular, their Poincaré charges vanish (this corresponds to the exceptional $p^{(\mu)} = 0$ orbit
of the Poincaré group and shows the peculiarity of these solutions with zero ADM mass). It is
expected that the void spacetimes can be defined in a gauge-independent way by adding to the ADM
action the requirement that the leaves $\Sigma_\tau$ of the 3+1 splitting be 3-conformally flat,
namely that the Cotton-York 3-conformal tensor vanishes. The members of this equivalence class (the
extension to general relativity of the Galilean non inertial coordinate systems with their Newtonian
inertial forces) are gauge equivalent to Minkowski spacetime with Cartesian coordinates and it is
expected that they describe pure acceleration effects without physical gravitational field (no tidal
effects).
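
For reference, a sketch of the objects entering this definition of void spacetimes. The first line
restates the conditions imposed in the 3-orthogonal gauges; the second gives one common convention
for the Cotton-York tensor of a 3-metric ${}^3g_{rs}$. Index placement and normalization vary in the
literature, and the symbols ${}^3\nabla$, ${}^3R_{rs}$, ${}^3C^{rs}$ are assumptions about notation,
not taken from the source. In three dimensions the vanishing of this tensor is equivalent to local
conformal flatness.

```latex
\begin{aligned}
& r_{\bar a}(\tau,\vec\sigma) \approx 0, \qquad \pi_{\bar a}(\tau,\vec\sigma) \approx 0
  \quad\Longrightarrow\quad \phi(\tau,\vec\sigma) = 1,\\[4pt]
& {}^3C^{rs} = \epsilon^{ruv}\; {}^3\nabla_u \Big( {}^3R_v{}^{s} - \tfrac14\, \delta_v^{s}\; {}^3R \Big) = 0
  \quad\Longleftrightarrow\quad {}^3g_{rs}\ \text{locally conformally flat}.
\end{aligned}
```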

See Ref. [102] for the $c \to \infty$ contraction of the ADM action of metric gravity: a theory with
26 independent fields (most of them describe inertial forces) and with general Galileo covariance
has been obtained. This formulation of Newton gravity should be the natural nonrelativistic limit
of Einstein’s general relativity in the framework of singular Lagrangians; however, its connection
with the post-Newtonian approximations has still to be explored.

If we add [72] to the tetrad ADM action the action for N scalar particles with positive energy
in the form of Ref. [55] [where it was given on arbitrary Minkowski spacelike hypersurfaces], the
only constraints which are modified are the superhamiltonian one, which gets a dependence on
the matter energy density $\mathcal M(\tau,\vec\sigma)$, and the 3 space diffeomorphism ones, which
get a dependence on the matter momentum density $\mathcal M_r(\tau,\vec\sigma)$. The canonical
reduction and the determination of the Dirac observables can be done like in absence of matter.
However, the reduced Lichnerowicz equation for the conformal factor of the 3-metric in the
3-orthogonal gauge and with $\rho(\tau,\vec\sigma) \approx 0$ acquires now an extra dependence on
$\mathcal M(\tau,\vec\sigma)$ and $\mathcal M_r(\tau,\vec\sigma)$.

Since, as a preliminary result, we are interested in identifying explicitly the instantaneous
action-at-a-distance (Newton-like and gravitomagnetic) potentials among particles hidden in tetrad
gravity (like the Coulomb potential is hidden in the electromagnetic gauge potential), we shall make
the strong approximation of neglecting the (tidal) effects of the physical gravitational field by
putting $r_{\bar a}(\tau,\vec\sigma) \approx 0$, $\pi_{\bar a}(\tau,\vec\sigma) \approx 0$, even if
it is not strictly consistent with the Hamilton-Dirac equation (extremely weak gravitational
fields). If, furthermore, we develop the conformal factor $\phi(\tau,\vec\sigma)$ in a formal series
in the Newton constant $G$ [$\phi = 1 + \sum_{n=1}^{\infty} G^n \phi_n$], one can find a solution
$\phi = 1 + G\phi_1$ at order $G$ (post-Minkowskian approximation) of the reduced
Lichnerowicz equation where we put — 0. However, due to a self-energy divergence in ^ 

evaluated at the positions $\vec\eta_i(\tau)$ of the particles, one needs to rescale the bare masses
to physical ones, $m_i \mapsto \phi^{-2}(\tau,\vec\eta_i(\tau))\, m_i^{(phys)}$, and to make a
regularization of the type defined in Refs. [103]. Then, the regularized solution for $\phi$ can be
put in the reduced form of the ADM energy, which becomes [$\vec\kappa_i(\tau)$ are the particle
momenta conjugate to $\vec\eta_i(\tau)$; $\vec n_{ij} = [\vec\eta_i - \vec\eta_j]/|\vec\eta_i - \vec\eta_j|$]




^ADM 

^ .ra,irsj. 



One sees the Newton-like and the gravitomagnetic (in the sense of York) potentials (both of them
need regularization) at the post-Minkowskian level (order $G$ but exact in $c$) emerging from the
tetrad ADM version of Einstein general relativity when we ignore the tidal effects. For $G = 0$ we
recover N free scalar particles on the Wigner hyperplane in Minkowski spacetime, as required by
deparametrization. For $c \to \infty$, we get the post-Newtonian Hamiltonian


rr — n V' ^ G ^3 




3/Cx (r ) •% (r) -SRi (r) -na {T)Kj{Tynii (r) 'r > 7 r_^ 


which is of the type of the ones implied by the results of Refs. [103, 104] [the differences are 
probably connected with the use of different coordinate systems and with the fact that one has 
essential singularities on the particle worldlines and the need of regularization]. 

The main open problems now under investigation are: i) the linearization of the theory in the
3-orthogonal gauges in presence of matter, to find the 3-orthogonal gauge description of
gravitational waves and to go beyond the previous instantaneous post-Minkowskian approximation at
least in the 2-body case relevant for the motion of binaries; ii) the replacement of scalar
particles with spinning ones to identify the precessional effects (like the Lense-Thirring one) of
gravitomagnetism; iii) the coupling to perfect fluids for the simulation of rotating stars and for
the comparison with the post-Newtonian approximations; iv) the coupling of tetrad gravity to the
electromagnetic field, to fermion fields and then to the standard model, trying to make the
reduction to Dirac’s observables in all these cases and to study their post-Minkowskian
approximations; v) the quantization of tetrad gravity in the 3-orthogonal gauge with
$\rho(\tau,\vec\sigma) \approx 0$ (namely after a complete breaking of general covariance): for each
perturbative (in $G$) solution of the reduced Lichnerowicz equation one defines a Schroedinger
equation in $\tau$ for a wave functional $\Psi[\tau; r_{\bar a}]$ with the associated quantized ADM
energy $\hat P^\tau_{ADM}$ as Hamiltonian; no problem of physical scalar product is present, but
only ordering problems in the Hamiltonian; moreover, one has the Møller radius as an ultraviolet
cutoff. Also a comparison with “loop quantum gravity” [105], which respects general covariance but
only for fixed lapse and shift functions, has still to be done.

Therefore, a well defined classical stage for a unified description of the four interactions is 
emerging, even if many aspects have only been clarified at a heuristic level so that a big effort from 
both mathematical and theoretical physicists is still needed. It will be exciting to see whether in 
the next years some reasonable quantization picture will develop from this classical framework. 


References 

[1] P.A.M.Dirac, Can.J.Math. 2, 129 (1950); “Lectures on Quantum Mechanics”, Belfer Graduate
School of Science, Monographs Series (Yeshiva University, New York, N.Y., 1964).

[2] J.L.Anderson and P.G.Bergmann, Phys.Rev. 83, 1018 (1951). 

P.G.Bergmann and J.Goldberg, Phys.Rev. 98, 531 (1955).

[3] S.Weinberg, “The Theory of Fields” , 2 volumes (Cambridge Univ.Press, Cambridge, 1995 and 
1996). 

[4] C.J.Isham, “Canonical Quantum Gravity and the Problem of Time”, in “Integrable Sys- 
tems, Quantum Groups and Quantum Field Theories”, eds.L.A.Ibort and M.A.Rodriguez, 
Salamanca 1993 (Kluwer, London, 1993); “Conceptual and Geometrical Problems in Quan- 
tum Gravity”, in “Recent Aspects of Quantum Fields”, Schladming 1991, eds. H.Mitter and 
H.Gausterer (Springer, Berlin, 1991); “Prima Facie Questions in Quantum Gravity” and 
“Canonical Quantum Gravity and the Question of Time”, in “Canonical Gravity: From
Classical to Quantum”, eds. J.Ehlers and H.Friedrich (Springer, Berlin, 1994). 



[5] K.Kuchar, “Time and Interpretations of Quantum Gravity”, in “Proc.4th Canadian Conf.
on General Relativity and Relativistic Astrophysics”, eds. G.Kunstatter, D.Vincent and
J.Williams (World Scientific, Singapore, 1992).

[6] J. Butterfield and C.J.Isham, “Space-Time and the Philosophical Challenge of Quantum Grav- 
ity”, Imperial-TP-98-99-45 (gr-qc/9903072). 

[7] J.Stachel, in “General Relativity and Gravitation”, GR11, Stockholm 1986, ed. M.A.H.MacCallum
(Cambridge Univ. Press, Cambridge, 1987); “The Meaning of General Covariance”,
in “Philosophical Problems of the Internal and External Worlds”, Essays in the Philosophy of
A.Grünbaum, eds. J.Earman, A.I.Janis, G.J.Massey and N.Rescher (Pittsburgh Univ.Press,
Pittsburgh, 1993).

[8] C.Rovelli, Class. Quantum Grav. 8, 297 and 317 (1991). 

[9] J.Polchinski, “String Theory”, 2 volumes (Cambridge Univ. Press, Cambridge, 1998). 

[10] L.Lusanna, “Solving Gauss’ Laws and Searching Dirac Observables for the Four Interactions”,
talk at the “Second Conf. on Constrained Dynamics and Quantum Gravity”,
S.Margherita Ligure 1996, eds. V.De Alfaro, J.E.Nelson, G.Bandelloni, A.Blasi, M.Cavaglia
and A.T.Filippov, Nucl.Phys. (Proc.Suppl.) B57, 13 (1997) (HEP-TH/9702114). “Unified
Description and Canonical Reduction to Dirac’s Observables of the Four Interactions”, talk at
the Int.Workshop “New non Perturbative Methods and Quantization on the Light Cone”, Les
Houches School 1997, eds. P.Grange, H.C.Pauli, A.Neveu, S.Pinsky and A.Werner (Springer,
Berlin, 1998) (HEP-TH/9705154). “The Pseudoclassical Relativistic Quark Model in the
Rest-Frame Wigner-Covariant Gauge”, talk at the Euroconference QCD97, ed. S.Narison,
Montpellier 1997, Nucl.Phys. (Proc.Suppl.) B64, 306 (1998).

[11] T.Levi-Civita, Prace Mat.Fiz. 17, 1 (1906); T.Levi-Civita and U.Amaldi, “Lezioni di Mecca- 
nica Razionale”, Vol.II, Part 2 (Zanichelli, Bologna, 1927). 

[12] A.Lichnerowicz, C.R.Acad.Sci.Paris, Ser. A, 280, 523 (1975). W.Tulczyjew, Symposia
Math. 14, 247 (1974). N.Woodhouse, “Geometric Quantization” (Clarendon, Oxford, 1980).
J.Sniatycki, Ann.Inst.H.Poincare 20, 365 (1984). G.Marmo, N.Mukunda and J.Samuel,
Riv.Nuovo Cimento 6, 1 (1983). M.J.Bergvelt and E.A.De Kerf, Physica 139A, 101 and
125 (1986). B.A.Dubrovin, M.Giordano, G.Marmo and A.Simoni, Int.J.Mod.Phys. 8, 4055
(1993).

[13] M.J.Gotay, J.M.Nester and G.Hinds, J.Math.Phys. 19, 2388 (1978). M.J.Gotay and
J.M.Nester, Ann.Inst.Henri Poincare A30, 129 (1979) and A32, 1 (1980). M.J.Gotay and
J.Sniatycki, Commun.Math.Phys. 82, 377 (1981). M.J.Gotay, Proc.Am.Math.Soc. 84, 111
(1982); J.Math.Phys. 27, 2051 (1986).

[14] M.Henneaux and C.Teitelboim, “Quantization of Gauge Systems” (Princeton Univ. Press, 
Princeton, 1992). 

[15] R.Sugano, Y.Kagraoka and T.Kimura, Int.J.Mod.Phys. 7, 61 (1992). 

[16] J.Sniatycki, in “Non-Linear Partial Differential Operators and Quantization Procedures”,
Clausthal 1981, Lecture Notes Math. 1037 (Springer, Berlin, 1983). J.Sniatycki and
A.Weinstein, Lett.Math.Phys. 7, 155 (1983).

[17] P.G.Bergmann, Phys.Rev. 144, 1078 (1966). K.Kuchar, J.Math.Phys. 13, 758 (1972).
A.Komar, Phys.Rev. D18, 1881, 1887 and 3017 (1978); D19, 2908 (1979). D.Dominici,
J.Gomis, G.Longhi and J.M.Pons, J.Math.Phys. 25, 2439 (1984).

[18] L.Lusanna, Phys.Rep. 185, 1 (1990). 

[19] L.Lusanna, Riv. Nuovo Cimento 14, n.3, 1 (1991). 



[20] L.Lusanna, J.Math.Phys. 31, 2126 (1990).

[21] L.Lusanna, J.Math.Phys. 31, 428 (1990). 

[22] L.Lusanna, Int.J.Mod.Phys. A8, 4193 (1993). 

[23] M.Chaichian, D.Louis Martinez and L.Lusanna, Ann.Phys.(N.Y.)232, 40 (1994). 

[24] E.Noether, Nachr.Ges.Wiss.Gottingen, Math.Phys.Kl.H. 2, 235 (1918); English translation
in Transp.Theory Stat.Phys. 1, 183 (1971). J.D.Logan, “Invariant Variational Principles”
(Academic, New York, 1979). N.P.Konopleva and V.N.Popov, “Gauge Fields” (Harwood,
New York, 1981). B.M.Barbashov and V.V.Nesterenko, Fortschr.Phys. 31, 535 (1983).

[25] I.A.Batalin and G.A.Vilkovisky, Nucl.Phys. B234, 106 (1984).

[26] S.Shanmugadhasan, J.Math.Phys. 14, 677 (1973). 

[27] J.A.Schouten and W.V.D.Kulk, “Pfaff’s Problem and Its Generalizations” (Clarendon, Ox- 
ford, 1949). 

[28] S.Lie, “Theorie der Transformation Gruppe”, Vol. II (B.G.Teubner, Leipzig, 1890).
A.R.Forsyth, “Theory of Differential Equations”, Vol. V, Ch. IX (Dover, New York, 1959).
L.P.Eisenhart, “Continuous Groups of Transformations” (Dover, New York, 1961). R.O.Fulp
and J.A.Marlin, Pacific J.Math. 67, 373 (1976); Rep.Math.Phys. 18, 295 (1980).

[29] L.D.Faddeev and V.N.Popov, Phys.Lett. B25, 30 (1967).

[30] M.Henneaux, Phys.Rep. 126, 1 (1985). 

[31] J.M.Souriau, “Structure des systèmes dynamiques” (Dunod, Paris, 1970). B.Kostant,
“Quantization and Unitary Representations”, Lecture Notes Math. 170 (Springer, Berlin,
1970). J.E.Marsden and A.Weinstein, Rep.Math.Phys. 5, 121 (1974).

[32] G.Longhi and L.Lusanna, Phys.Rev. D34, 3707 (1986). 

[33] A.Lucenti, L.Lusanna and M.Pauri, J.Phys. A31, 1633 (1998). 

[34] J.M.Arms, J.E.Marsden and V.Moncrief, Commun.Math.Phys. 78, 455 (1981). J.M.Arms,
Acta Phys.Pol. B17, 499 (1986). L.Bos and M.J.Gotay, J.Diff.Geom. 24, 181 (1986).

[35] P.A.M.Dirac, Can.J.Phys. 33, 650 (1955). 

[36] R.Casalbuoni, Nuovo Cimento 33A, 115 and 389 (1976). F.A.Berezin and M.S.Marinov,
Ann.Phys.(N.Y.) 104, 336 (1977). A.Barducci, R.Casalbuoni and L.Lusanna, Nuovo Cim.Lett.
19, 581 (1977); Nucl.Phys. B124, 93 (1981) and B180[FS2], 141 (1981).

[37] A.Barducci, R.Casalbuoni and L.Lusanna, Nuovo Cim. 35A, 377 (1976).

[38] A.Barducci, R.Casalbuoni, D.Dominici and L.Lusanna, Phys.Lett. 100B, 126 (1981).

[39] A.Barducci and L.Lusanna, Nuovo Cim. 77A, 39 (1983).

[40] A.Barducci and L.Lusanna, J.Phys. A16, 1993 (1983). 

[41] Ph.Droz Vincent, Lett.Nuovo Cim. 1, 839 (1969); Phys.Scr. 2, 129 (1970); Rep.Math.Phys.8, 
79 (1975). I.T.Todorov, Report Comm. JINR E2-10125,Dubna 1976 (unpublished); 
Ann.Inst.H.Poincare 28A, 207 (1978). A.Komar, Phys.Rev. D18, 1881 and 1887 (1978). 

[42] A.Barducci, R.Casalbuoni and L.Lusanna, Nuovo Cim. 54A, 340 (1979).



[43] H.Sazdjian, Ann.Phys.(N.Y.) 136, 136 (1981); Phys.Rev. D33, 3401 (1986); J.Math.Phys. 28,
2618 and 1988 (1987), 29, 1620 (1987); Ann.Phys.(N.Y.) 191, 52 (1989); in Proc.Int.Symp.
“Extended Objects and Bound Systems”, eds. O.Hara, S.Ishida and S.Naka (World Scientific,
Singapore, 1992). J.Bijtebier and J.Broekaert, Nuovo Cim. A105, 351 and 625 (1992); in
Proc.Int.Symp. “Extended Objects and Bound Systems”, eds. O.Hara, S.Ishida and S.Naka
(World Scientific, Singapore, 1992).

[44] H.W.Crater and P.Van Alstine, J.Math.Phys. 23, 1697 (1982); Ann.Phys.(N.Y) 148, 57 
(1983); Phys.Rev.Lett. 53, 1577 (1984); Phys.Rev. D30, 2585 (1984); D34, 1932 (1986); 
D36, 3007 (1987); D37, 1982 (1988); J.Math.Phys. 31, 1998 (1990); Phys.Rev. D46, 766 
(1992). H.W.Crater, R.L. Becker, C.Y.Wong and P.Van Alstine, Phys.Rev. D46, 5117 (1992); 
in Proc.Int.Symp. “Extended Objects and Bound Systems”, eds. O.Hara, S.Ishida and S.Naka 
(World Scientific, Singapore, 1992). H.W.Crater and D.Yang, J.Math.Phys. 32, 2374 (1991). 

[45] L.Lusanna, Nuovo Cim. 64A, 65 (1981). 

[46] G.Longhi, L.Lusanna and J.M.Pons, J.Math.Phys. 30, 1893 (1989). 

[47] F.Colomo, G.Longhi and L.Lusanna, Int.J.Mod.Phys. A5, 3347 (1990); Mod.Phys.Letters 
A5, 17 (1990). F.Colomo and L.Lusanna, Int.J.Mod.Phys. A7, 1705 and 4107 (1992). 

[48] L.Lusanna, Int.J.Mod.Phys. A10, 3531 and 3675 (1995).

[49] H.Cendra, A.Ibort and J.Marsden, J.Geom.Phys. 4, 183 (1987). A.P.Balachandran, G.Marmo, 
B.S.Skagerstam and A.Stern, “Classical Topology and Quantum States” (World Scientific, 
Singapore, 1991). 

[50] V.Moncrief, J.Math.Phys. 20, 579 (1979). M.Cantor, Bull.Am.Math.Soc. 5, 235 (1981). 

[51] L.Lusanna and P.Valtancoli, Int.J.Mod.Phys. A12, 4769 (1997) (HEP-TH/9606078) and 
Int.J.Mod.Phys. A12, 4797 (1997). (HEP-TH/9606079). 

[52] L.Lusanna and P.Valtancoli, Int.J.Mod.Phys. A13, 4605 (1998) (HEP-TH/9707072). 

[53] K.Kuchar, J.Math.Phys. 17, 777, 792, 801 (1976); 18, 1589 (1977). 

[54] P.A.M.Dirac, Rev.Mod.Phys. 21 (1949) 392. 

[55] L.Lusanna, Int.J.Mod.Phys. A12, 645 (1997). 

[56] M.Pauri and M.Prosperi, J.Math.Phys. 16, 1503 (1975). 

[57] L.Lusanna and M.Materassi, “The Canonical Decomposition in Collective and Relative Vari- 
ables of a Klein-Gordon Field in the Rest-Frame Wigner-Covariant Instant Form” , Firenze 
Univ.preprint (HEP-TH/9904202). 

[58] G.Longhi and M.Materassi, J.Math.Phys. 40, 480 (1999) (HEP-TH/9803128); “Collective 
and Relative Variables for a Classical Klein-Gordon Field”, Firenze Univ.preprint (HEP- 
TH/9890024), to appear in Int.J.Mod.Phys. A. 

[59] W.G.Dixon, J.Math.Phys. 8, 1591 (1967). “Extended Objects in General Relativity: their
Description and Motion”, in “Isolated Gravitating Systems in General Relativity”, ed. J.Ehlers
(North-Holland, Amsterdam, 1979).

[60] D.Alba and L.Lusanna, Int.J.Mod.Phys. A13, 2791 (1998) (HEP-TH/9705155). 

[61] D.Alba and L.Lusanna, Int.J.Mod.Phys. A13, 3275 (1998) (HEP-TH/9705156). 

[62] F.Bigazzi and L.Lusanna, Int.J.Mod.Phys. A14, 1429 (1999) (HEP-TH/9807P52). 

[63] F.Bigazzi and L.Lusanna, Int.J.Mod.Phys. A14, 1877 (1999) (HEP-TH/9807054).



[64] C.Lämmerzahl, J.Math.Phys. 34, 3918 (1993).

[65] I.Herbst, Commun.Math.Phys. 53, 285 (1977); 55, 316 (1977).

B. and L. Durand, Phys.Rev. D28, 396 (1983); erratum Phys.Rev. D50, 6642 (1994). 
J.J.Basdevant and S.Boukraa, Z.Phys. C28, 413 (1985). 

A.Martin and S.M.Roy, Phys.Lett. B233, 407 (1989). 

A.LeYaouanc, L. Oliver and J.C.Raynal, Ann.Phys.(N.Y.) 239, 243 (1995). 

W.Lucha and F.F.Schoberl, Phys.Rev. D50, 5443 (1994). 

[66] G.Leibbrandt, “Non-Covariant Gauges”, ch.9 (World Scientific, Singapore, 1994). 

[67] C.Møller, Ann.Inst.H.Poincare 11, 251 (1949); “The Theory of Relativity” (Oxford
Univ.Press, Oxford, 1957).

[68] G.Veneziano, “Quantum Strings and the Constants of Nature”, in “The Challenging Ques- 
tions”, ed.A.Zichichi, the Subnuclear Series n.27 (Plenum Press, New York, 1990). 

[69] L.Lusanna and S.Russo, “Tetrad Gravity I): A New Formulation”, Firenze Univ. preprint 
1998 (GR-QC/9807073). 

[70] L.Lusanna and S.Russo, “Tetrad Gravity II): Dirac’s Observables”, Firenze Univ. preprint 
1998 (GR-QC/9807074). 

[71] R.DePietri and L.Lusanna, “Tetrad Gravity III): Asymptotic Poincare Charges, the Physical 
Hamiltonian and Void Spacetimes”, in preparation. 

[72] R.DePietri, L.Lusanna and M.Vallisneri, “Tetrad Gravity IV): The N-body Problem”, in 
preparation. 

[73] R.Arnowitt, S.Deser and C.W.Misner, Phys.Rev. 117, 1595 (1960); in “Gravitation: an In- 
troduction to Current Research”, ed.L. Witten (Wiley, New York, 1962). 

[74] T.Regge and C.Teitelboim, Ann.Phys.(N.Y.) 88, 286 (1974). R.Beig and Ó Murchadha,
Ann.Phys.(N.Y.) 174, 463 (1987). L.Andersson, J.Geom.Phys. 4, 289 (1987).

[75] A.Ashtekar, “Asymptotic Structure of the Gravitational Field at Spatial Infinity”, in “General
Relativity and Gravitation”, Vol. 2, ed. A.Held (Plenum, New York, 1980). A.Ashtekar and
R.O.Hansen, J.Math.Phys. 19, 1542 (1978). A.Ashtekar and A.Magnon, J.Math.Phys. 25,
2682 (1984). A.Ashtekar and J.D.Romano, Class.Quantum Grav. 9, 1069 (1992).

[76] R.M.Wald, “General Relativity” (Chicago Univ.Press, Chicago, 1984). 

[77] R.Geroch, J.Math.Phys. 9, 1739 (1968); 11, 343 (1970). 

[78] V.Moncrief, J.Math.Phys. 16, 1556 (1975). 

[79] Y.Choquet-Bruhat, A.Fischer and J.E.Marsden, “Maximal Hypersurfaces and Positivity of 
Mass”, LXVII E.Fermi Summer School of Physics “Isolated Gravitating Systems in General 
Relativity”, ed.J.Ehlers (North-Holland, Amsterdam, 1979). 

[80] P.G.Bergmann, Rev.Mod.Phys. 33, 510 (1961). 

[81] H.Weyl, Z.Physik 56, 330 (1929). J.Schwinger, Phys.Rev. 130, 1253 (1963). T.W.B.Kibble, 
J.Math.Phys. 4, 1433 (1963). S.Deser and C.J.Isham, Phys.Rev. D14, 2505 (1976). J.E.Nelson 
and C.Teitelboim, Ann.Phys.(N.Y.) 116, 86 (1978). M.Pilati, Nucl.Phys. B132, 138 (1978). 
J.E,Nelsonand T.Regge, Ann.Phys.(N.Y.) 166, 234 (1986); Int.J.Mod.Phys. A4, 2021 (1989). 
J.M.Charap and J.E.Nelson, J.Phys. A16, 1661 and 3355 (1983). Class.Quantum Grav. 3, 
1061 (1986). 

J.M.Charap, “The Constraints in Vierbein General Relativity”, in “Constraint’s Theory 
and Relativistic Dynamics”, eds. G.Longhi and L.Lusanna (World Scientific, Singapore, 



Dirac-Bergmann Constrained Dynamics 


517 


1987). M.Hemieaux, Gen.Rel.Grav. 9, 1031 (1978). M.Henneaux, Phys.Rev. D27, 986 (1983). 
J.M.Charap, M.Henneaux and J.E.Nelson, Class.Quantum Grav. 5, 1405 (1988). M.Henneaux, 

J. E.Nelson and C.Schonblond, Phys.Rev. D39, 434 (1989). 

[82] R.T.Jantzen, P.Caxini and D.Bini, Ann.Phys.(N.Y.) 215, 1 (1992). 

[83] I.Ciufolini and J. A. Wheeler, “Gravitation and Inertia” (Princeton Univ.Press, Princeton, 
1995). 

[84] J.L.Synge, “Relativity: the General Theory” (North-Holland, Amsterdam, 1960). B.S.De 
Witt, Phys.Rev. 162, 1195 (1967); “The Spacetime Approach to Quantum Field Theory”, 
in “Relativity, Groups and Topology II”, Les Houches 1983, eds. B.S.DeWitt and R.Stora 
(North-Holland, Amsterdam, 1984). 

[85] B.S.De Witt, Phys.Rev. 160, 1113 (1967). 

[86] C.J.Isham and K.Kuchar, Ann.Phys.(N.Y.) 164, 288 and 316 (1984). 

K. Kuchar, Found.Phys. 16, 193 (1986). 

[87] A.Lichnerowicz, J.Math.Pure Appl. 23, 37 (1944). 

Y.Choquet-Bruhat, C.R.Acad.Sci.Paris 226, 1071 (1948); J.Rat.MecRAnal. 5, 951 (1956); 
“The Cauchy Problem” in “Gravitation: An Introduction to Current Research” , ed.L. Witten 
(Wiley, New York, 1962). 

[88] J.W.York jr, Phys.Rev.Lett. 26, 1656 (1971); 28, 1082 (1972). J.Math.Phys. 13, 125 (1972); 
14, 456 (1972). Ann.Ins.H.Poincare XXI, 318 (1974). 

N.O’Murchadha and J.W.York jr, J.Math.Phys. 14, 1551 (1972). Phys.Rev. DIO, 428 (1974). 

[89] J.W.York jr., “Kinematics and Dynamics of General Relativity”, in “Sources of Gravita- 
tional Radiation”, Battelle-Seattle Workshop 1978, ed.L.L.Smarr (Cambridge Univ.Press, 
Cambridge, 1979). 

[90] A.Qadir and J. A. Wheeler, “York’s Cosmic Time Versus Proper Time”, in “Prom SU(3) to 
Gravity” , Y.Ne’eman’s festschrift, eds. E.Gotsma and G.Tauber (Cambridge Univ.Press, Cam- 
bridge, 1985). 

[91] P.A.M.Dirac, Canad.J.Math. 3, 1 (1951). 

[92] D.Christodoulou and S.Klainerman, “The Global Nonlinear Stability of the Minkowski Space” 
(Princeton Univ. Press, Princeton, 1993). 

[93] A.Ashtekar and G.T.Horowitz, J.Math.Phys. 25, 1473 (1984). 

[94] E. Witten, Commun.Math.Phys. 80, 381 (1981). 

[95] A.Sen, J.Math.Phys. 22, 1781 (1981); Phys.Lett. 119B, 89 (1982). 

[96] J.Prauendiener, Class. Quantum Grav. 8, 1881 (1991). 

[97] C.Teitelboim, “The Hamiltonian Structure of Space-Time”, in “General Relativity and 
Gravitation”, ed.A.Held, Vol.I (Plenum, New York, 1980). A.S.Hojman, K.Kuchar and 
C.Teitelboim, Ann.Phys. (N.Y.) 96, 88 (1971). 

[98] P.Havas, Gen.Rel.Grav. 19, 435 (1987). R. Anderson, I.Vetharaniam and G.E.Stedman, 
Phys.Rep. 295, 93 (1998). 

[99] A.Komar, Phys.Rev. Ill, 1182 (1958). P.G.Bergmann and A.Komar, Phys.Rev.Lett. 4, 432 
(1960). 

[100] L.Landau and E.Lifschitz, “The Classical Theory of Fields” (Addison- Wesley, Cambridge, 
1951). 



518 


Luca, Lusanna 


[101] G.F.R.Ellis and D.R.Matravers, Gen.Rel.Grav. 27, 777 (1995). R.Zalaletdinov, R.Tavakol 
and G.F.R.ElIis, Gen.Rel.Grav. 28, 1251 (1996). 

[102] R.DePietri, L.Lusanna and M.Pauri, Class.Quantum Grav, 12, 219 (1995). 

[103] A.Einstein, B.Hofl6nan and L.Infeld, Ann.Math. 39, 66 (1938). A. Einstein and L.Infeld, 
Ann.Math. 41, 797 (1940); Canad.J.Math. 1, 209 (1949). L.Infeld, Rev.Mod.Phys. 29, 398 
(1957). 

[104] H.A.Lorentz and L.Droste, Amst.Akad.Versl. 26, 392 (1917). A.Eddington and G.L.Claxke, 
Proc.Roy.Soc.London A166, 465 (1938). V.Fock, J.Phys. (U.S.S.R.) 1, 81 (1939). 
A.Papapetrou, Proc.Phys.Soc. (London) 64, 57 (1951). 

[105] A.Ashtekar, Phys.Rev.Lett. 57, 2244 (1986); “New Perspectives in Canonical Grav- 
ity” (Bibliopolis, Naples, 1988); “Lectures on Non-Perturbative Canonical Gravity” 
(World Scientific, Singapore, 1991); “Quantum Mechanics of Riemannian Geome- 
try”, http://vishnu.nirvana.phys.psu.edu/ riem_qm/riem_qm.html. C.Rovelli and L.Smolin, 
Nucl.Phys. B331, 80 (1990); B442, 593 (1995). C.Rovelli, ’’Loop Quantum Gravity”, Living 
Reviews in Relativity http://www.livingreviews.org/Articles/Volumel/1998-lrovelli. 



Part D : Extension Of QFT Frontiers 

20. Supersymmetry And Particle Physics by R.N.Mohapatra 

21. Supersymmetry In Field Theory by N.Sakai 

22. Conformal Field Theory: A Bridge Over Troubled Waters by Werner Nahm 

23. Superstring Theory - An Overview by John H Schwarz 

24. Recent Developments In String Theory by J.Maharana 

25. Yang-Mills Theory And Matrix String Theory by L.Bonora 




20. Supersymmetry and Particle Physics 


R. N. Mohapatra * 

Department of Physics, University of Maryland, 

College Park, MD 20742, USA 


Abstract 

A pedagogical overview of the recent developments in the area of supersymmetry and its 
applications to particle physics is presented. 


1 Introduction 

All elementary particles in nature fall into two classes: bosons and fermions. The bosons have 
integral spin whereas the fermions have half-odd integral spin. The different statistics obeyed by 
the bosons and fermions provide a fundamental distinction between them. It was due to this 
basic dissimilarity that physicists until the mid-seventies thought it impossible to have a symmetry 
which could link the two fundamental species of particles. Thus it was feared that no matter how 
unified the ultimate theory of matter and forces became, the bosons and fermions would forever 
remain distinct. 

It therefore caused great exhilaration and anticipation when theorists in the mid-seventies 
announced that local field theories with supersymmetry, a symmetry that unites the fermions and 
bosons, can not only be written down but also exhibit much more interesting properties 
than the existing field theories without such symmetries. It is the purpose of this article to provide 
a pedagogical overview of some of these astounding developments in supersymmetric field theories 
and their applications in resolving several important issues in particle physics. 

Supersymmetry was introduced in the early 1970s in two dimensional field theories by 
Gervais and Sakita and in the context of four dimensional ones by Golfand, Likhtman, Akulov, 
Volkov, Wess and Zumino, and has become one of the most active areas of research in the eighties 
and nineties. As the millennium comes to a close, in addition to the obvious fact that it provides 
the hope of a unified understanding of the two known forms of matter, the bosons and fermions, 
the versatility of supersymmetry as a tool to understand many unsolved problems of physics seems 
to be continuously expanding. Some of the results that will be touched on in this review are: (i) the 
significant improvement in the singularity structure of local field theories and the application of 
this result to understanding the disparate scales of nature, such as the Fermi scale characterising 
weak interactions and the Planck scale characterising the gravitational interactions; (ii) the possibility 
of unifying gravity with the rest of the forces of nature by making supersymmetry a local 
rather than a global symmetry; (iii) finally, the novel prospect of understanding nonperturbative 
properties of field theories, which was considered a practical impossibility in nonsupersymmetric 
field theories. 

Let us hasten to remind the reader that the “traders of supersymmetric merchandise” 
are fully cognisant of the fact that there is not even an iota of supersymmetry in the observed 
spectrum of particles. This fact, instead of being discouraging, is accepted as a challenge that in 
itself might more clearly elucidate the nature of supersymmetry and its various links with other 
known aspects of physics such as gravity or weak forces. We will therefore spend a full section or 
two on the various ways that supersymmetry breaking is supposed to emerge. This is a rapidly 
growing field and our discussion of this aspect will therefore necessarily reflect the tentativeness of 
this field. 

* Email: rmohapat@katherine.physics.umd.edu. 

This review is organized as follows: in section 2, the introductory ideas of supersymmetric 
field theories are presented; in section 3, techniques for writing down simple supersymmetric 
field theories are given; section 4 deals with some properties of N=1 superfield theories; section 5 
considers the possibility of local supersymmetry and how that brings gravity into the same picture 
with the other interactions of nature, providing the hope for an ultimate unification of all forces 
including gravity; section 6 discusses the breaking of supersymmetry; section 7 reviews some current 
speculations on the origin of supersymmetry breaking; in section 8, we discuss the application of 
supersymmetry to particle physics models, i.e. the minimal supersymmetric extension of the standard 
model (MSSM); section 9 discusses the need to go beyond the MSSM; section 10 discusses the 
supersymmetric left-right model, which cures several of the problems of the MSSM and provides a way to 
include neutrino masses in particle physics; section 11 presents the motivation for supersymmetry 
based on the hypothesis of grand unification of all forces of nature; section 12 discusses possible 
applications of strongly coupled super-Yang-Mills theories to build composite models of quarks and 
leptons; section 13 notes the connection between string theory and supersymmetry; and section 14 
gives the conclusions. 

This article is meant to be a pedagogical overview of the field; therefore there are not too many 
details; many of the discussions are very telegraphic so as to only convey the spirit of the activity 
in the field and the reader should take this article more as an “appetizer” rather than a “full course 
dinner” and I will consider my efforts immensely rewarded if I manage to whet a few appetites. 
There are many excellent books and reviews for those who want to go deeper. 


2 Superspace and superfields 

There are many ways to introduce supersymmetric field theories and we refer to several texts [1, 2] 
and reviews [3, 4] for a more thorough discussion. Here we will content ourselves with a brief outline 
noting a few salient points. 

Since supersymmetry transforms a boson to a fermion and vice versa, an irreducible representation 
of the supersymmetry “group” contains both fermions and bosons. Therefore in a 
supersymmetric theory, all known particles are accompanied by a superpartner which is a fermion 
if the known particle is a boson and vice versa. For instance, the electron ($e$) supermultiplet will 
contain its superpartner $\tilde{e}$ (called the selectron), which has spin zero. The photon ($\gamma$) supermultiplet 
will contain its superpartner $\tilde{\gamma}$, known as the photino, which is a Majorana field of spin 
1/2. 

We will adopt the notation that the superpartner of a particle will be denoted by the same 
symbol as the particle with a ‘tilde’ as above. The name of the superpartner of any known fermion 
will have an extra letter ‘s’ at the beginning, and the names of superpartners of known bosons 
will end with the letters ‘-ino’. Furthermore, while supersymmetry does not commute with the 
Lorentz transformations, it commutes with all internal symmetries; as a result, all non-Lorentzian 
quantum numbers for both the fermion and boson in the same supermultiplet are the same. For 
example, all particles in a supersymmetry multiplet have the same electric charge, same isospin, 
same color quantum numbers etc. 

A convenient mathematical way to discuss supersymmetry is to extend space-time to a superspace, 
where one augments the normal space-time with anticommuting Grassmann 
coordinates which, due to Lorentz covariance, can be chosen to form a Majorana spinor. Such spinors 
have four independent real components or two independent complex components. In what follows 
we will use the notation of complex two-component spinors denoted by $\theta$ and its complex conjugate 
$\bar\theta$. Thus we have exactly the same number of fermionic (or Grassmann) coordinates as 
the number of space-time coordinates. Models based on this extended space-time are called 
N=1 supersymmetric theories. From this nomenclature it is clear that there can be higher (or 
extended) supersymmetric theories where for each space-time coordinate there is a larger number 
of fermionic variables. 

The two-component spinors transform as two dimensional representations of the SL(2, C) 
group. This observation has the hidden implication that any function of $\theta$ and $\bar\theta$ will maintain 
Lorentz covariance (and invariance) of the field theories constructed out of them, as long as we use 
operators that transform properly under the SL(2, C) group.^1 

Operator valued functions in the superspace will be called superfields and denoted by $\Phi(x, \theta, \bar\theta)$. 

The Grassmann property of the $\theta$ coordinates (i.e., $\theta_1\theta_2 = -\theta_2\theta_1$ and $(\theta_i)^2 = 0$) then implies that 
an expansion of the superfield $\Phi$ in the $\theta$ coordinates terminates after a few terms. The coefficients 
of such an expansion will be functions of $x$ only, and are therefore fields of the ordinary field theory. 
Those fields will create and destroy the known particles and their superpartners. The superfields 
can of course have any spin but for the description of the particles of the standard model, it is 
enough to consider only those superfields with spin zero. 

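To see the detailed field theory content of the general superfields, let us expand a general 
superfield $\Phi(x, \theta, \bar\theta)$ in the Grassmann variables: 

$$\Phi(x, \theta, \bar\theta) = \phi + \theta\psi + \bar\theta\bar\chi + \theta^2 M + \bar\theta^2 N + \theta\sigma^\mu\bar\theta\, V_\mu + \theta^2\bar\theta\bar\lambda + \bar\theta^2\theta\xi + \theta^2\bar\theta^2 D \qquad (1)$$ 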

where $\sigma^\mu$ are the usual Pauli matrices for the space components and the unit matrix for the zeroth 
component. Note that the fermionic and bosonic fields in the above equation are complex. As a 
result, simple counting enables one to see that there are 16 real bosonic and the same number of real 
fermionic fields. This equality in the number of the bosonic and fermionic fields will be a constant 
feature of supersymmetric field theories and plays a crucial role in the improved divergence structure 
of the supersymmetric field theories alluded to above. 
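The counting above can be checked mechanically. The following is a minimal sketch, assuming only that the four Grassmann coordinates square to zero, so that the nonvanishing monomials are exactly the subsets of the four generators; the generator labels and function names are illustrative choices and do not come from the text.

```python
from itertools import combinations

# The four anticommuting coordinates of N=1 superspace (labels are illustrative).
GENERATORS = ("theta1", "theta2", "thetabar1", "thetabar2")


def monomials(generators=GENERATORS):
    """Return every nonvanishing monomial as a tuple of distinct generators.

    Each generator squares to zero, so a monomial can contain a given
    generator at most once; the monomials are therefore the subsets.
    """
    return [combo
            for degree in range(len(generators) + 1)
            for combo in combinations(generators, degree)]


def count_real_components(generators=GENERATORS):
    """Count real bosonic and fermionic components of a complex scalar superfield.

    Coefficients of even-degree monomials are bosonic, those of odd-degree
    monomials are fermionic, and each complex coefficient is two real ones.
    """
    even = sum(1 for m in monomials(generators) if len(m) % 2 == 0)
    odd = sum(1 for m in monomials(generators) if len(m) % 2 == 1)
    return 2 * even, 2 * odd


if __name__ == "__main__":
    print(len(monomials()))         # number of terms in the expansion
    print(count_real_components())  # (real bosonic, real fermionic)
```

With four generators there are 16 monomials, 8 of even and 8 of odd degree, which reproduces the 16 real bosonic and 16 real fermionic fields quoted above.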

The set of fields in Eq. (1) provides a representation multiplet of N=1 supersymmetry. We 
will see below however that this is not an irreducible representation. To see this let us define the 
operation of supersymmetry transformation. Supersymmetry is a translation in the fermionic part 
of the superspace, i.e. under supersymmetry, $\theta \to \theta + \epsilon$ and similarly for $\bar\theta$, where $\epsilon$ is a constant 
spinor for the case of global supersymmetry (and a function of space-time for the case of local 
supersymmetry). A key ingredient of the supersymmetry transformation is that under the 
transformation described above, the bosonic space coordinates transform as $x^\mu \to x^\mu + i(\theta\sigma^\mu\bar\epsilon - \epsilon\sigma^\mu\bar\theta)$. 
It is then easy to verify that the generators of supersymmetry are: 

$$Q = \frac{\partial}{\partial\theta} - i\sigma^\mu\bar\theta\,\partial_\mu \qquad (2)$$ 

and similarly for $\bar{Q}$. The supersymmetry algebra can therefore be written as 

$$\{Q, \bar{Q}\} = \sigma^\mu P_\mu \qquad (3)$$ 

In order to construct irreducible representations of supersymmetry, we have to find operators 
that anticommute with $Q$ and $\bar{Q}$. There is a pair of such operators called $D$ and $\bar{D}$, where 
$D = \partial/\partial\theta + i\sigma^\mu\bar\theta\,\partial_\mu$ and similarly for $\bar{D}$. One may therefore impose constraints on the representations 
such as the $\Phi$ described above by demanding that $\bar{D}\Phi = 0$ or $D\Phi = 0$. These constraints relate the 
different component fields in the expansion of $\Phi$ discussed above. One may also impose constraints 
such as $D^2\Phi = 0$ etc. to get other irreducible representations. 
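As an illustration of how such a constraint cuts down the component content, the sketch below solves the chirality constraint. It assumes Wess-Bagger-type conventions, $\bar{D}_{\dot\alpha} = -\partial/\partial\bar\theta^{\dot\alpha} - i\theta^\alpha\sigma^\mu_{\alpha\dot\alpha}\partial_\mu$; the shifted coordinate $y^\mu$, the auxiliary field $F$ and the factor $\sqrt{2}$ are notation introduced here for the example and are not fixed by the text above.

```latex
% Assumption: y^\mu = x^\mu + i\theta\sigma^\mu\bar\theta, chosen so that \bar{D}_{\dot\alpha} y^\mu = 0.
% Since also \bar{D}_{\dot\alpha}\theta = 0, any function of (y, \theta) alone is chiral:
\bar{D}_{\dot\alpha}\Phi = 0
\quad\Longrightarrow\quad
\Phi = \phi(y) + \sqrt{2}\,\theta\psi(y) + \theta^2 F(y) .

% Component count: \phi and F are complex scalars (2 + 2 real bosonic components),
% \psi is a two-component complex spinor (4 real fermionic components).
```

The constrained superfield thus carries 4 real bosonic and 4 real fermionic components instead of the 16 + 16 of the general superfield, with bosons and fermions still matched in number.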

Using the above operators, we can deduce three kinds of spin zero superfields which form 
irreducible representations of supersymmetry: (i) the chiral and anti-chiral superfields, which 
result when it is required that either $\bar{D}$ or $D$ annihilate the field $\Phi$; (ii) vector superfields, which 
result when it is demanded that $\Phi = \Phi^\dagger$; and (iii) finally, linear multiplets, which result when one requires 
that $D^2$ or $\bar{D}^2$ annihilate the $\Phi$ field. Again for our purpose, we only need the first two kinds of 
multiplets. The chiral (and anti-chiral) multiplets contain a spin zero and a spin half particle and 
will be used to describe matter as well as Higgs fields. On the other hand, the vector multiplets 
contain a spin one and a spin half field and will be used to describe gauge fields. It turns out that 
real vector superfields allow transformations which can be identified with the gauge transformations, as 
is needed if they are to describe gauge fields. 

^1 Since we visualize the $\theta$ and $\bar\theta$ as extra “space” variables, we can define a “calculus” on these variables. This 
is done by defining the integrals $\int d\theta = 0$; $\int \theta\, d\theta = 1$ and similarly for $\bar\theta$. Differentiation is as usual except for the 
anticommuting nature of the differential operator. 

For completeness, let us introduce the gauge transformations of the chiral and the vector (gauge) 
superfields for the abelian example: 

$$\Phi \to e^{-g\Lambda}\,\Phi, \qquad V \to V + \Lambda + \Lambda^\dagger \qquad (4)$$ 

Note that $\Lambda$ is a chiral superfield and therefore has not only the familiar gauge parameter of QED, 
which is a real function of space-time, but also a fermionic field and another real field (Im $C(x)$, 
where $C(x)$ is the scalar component of the superfield $\Lambda$). These extra functions can be used to 
set the scalar, $\theta$, $\bar\theta$, $\theta^2$ and $\bar\theta^2$ components of $V$ to zero. This is called the Wess-Zumino gauge, in 
which the vector superfield looks like 

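$$V = \theta\sigma^\mu\bar\theta\, A_\mu + \theta^2\bar\theta\bar\lambda + \bar\theta^2\theta\lambda + \theta^2\bar\theta^2 D \qquad (5)$$ 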


3 The Lagrangian for supersymmetric field theories 


In order to write down the action for a supersymmetric field theory, let us start by considering 
generic chiral fields denoted by $\Phi(x, \theta)$ with component fields given by $(\phi, \psi)$ and gauge fields 
denoted by $V(x, \theta, \bar\theta)$ with component gauge and gaugino fields given by $(A_\mu, \lambda)$. The action in 
the superfield notation is 

$$S = \int d^4x \int d^2\theta\, d^2\bar\theta\; \Phi^\dagger e^{gV} \Phi + \int d^4x \int d^2\theta\, \left[ W(\Phi) + W^\alpha(V) W_\alpha(V) + \text{h.c.} \right]. \qquad (6)$$ 


In the above equation, the first term gives the gauge invariant kinetic energy term for the matter 
fields $\Phi$; $W(\Phi)$ is a holomorphic function of $\Phi$ and is called the superpotential; it leads to the 
Higgs potential of the usual gauge field theories. We wish to emphasize the holomorphy of the 
superpotential, which is a major difference from the potentials in nonsupersymmetric field theories. 
This fact has a number of interesting implications that we discuss below. 

Secondly, $W^\alpha(V) = \bar{D}^2 D^\alpha V$, where $D = \partial_\theta + i\sigma^\mu\bar\theta\,\partial_\mu$ as defined above, and the term involving 
$W^\alpha(V)$ leads to the gauge invariant kinetic energy term for the gauge fields as well as for the 
gaugino fields. In terms of the component fields the Lagrangian can be written as 


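$$\mathcal{L} = \mathcal{L}_g + \mathcal{L}_{matter} + \mathcal{L}_Y - V(\phi) \qquad (7)$$ 

where 

$$\mathcal{L}_{matter} = |D_\mu\phi|^2 + i\bar\psi\gamma^\mu D_\mu\psi$$ 
$$\mathcal{L}_Y = \sqrt{2}\, g\, \bar\lambda\psi\phi^\dagger + \psi_a\psi_b W_{ab}$$ 
$$V(\phi) = |W_a|^2 + \frac{1}{2} D_a D_a \qquad (8)$$ 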

where $D_\mu$ stands for the covariant derivative with respect to the gauge group and $D_a$ stands for 
the so-called D-term and is given by $D_a = g\phi^\dagger T_a \phi$ ($g$ is the gauge coupling constant and $T_a$ 
are the generators of the gauge group). $W_a$ and $W_{ab}$ are the first and second derivatives of the 
superpotential $W$ with respect to the fields $\phi_a$, where the index $a$ 
stands for different matter fields in the model. 

Several new features of a supersymmetric field theory (as compared to a non-supersymmetric 
one) are evident from the above equation. First, note that the Yukawa couplings and the first 
contribution to the Higgs potential arise from the same function $W(\Phi)$ and therefore their parameters 
are related. The second term in the Higgs potential involves only the gauge coupling. Thus 
the number of parameters in a SUSY field theory is expected to be smaller than in the usual gauge 
theories. Secondly, there is a new kind of Yukawa coupling involving the gaugino, the matter fermion 
$\psi$ and the superpartner of the matter field, with the Yukawa coupling given by the gauge coupling 
constant $g$. These generic features have many important implications for the phenomenology of 
these models. 

To see one immediate phenomenological implication of supersymmetry, consider a superpotential 
$W(\Phi) = m\Phi^2$. Using Eq. (8), it is very easy to see that the scalar field $\phi$ and the fermion field 
$\psi$ both have the same mass. This is again a generic prediction of SUSY models, i.e. the particle 
and its superpartner have the same mass in the supersymmetric limit. This is clearly against 
observations for all known particles, implying that in a realistic model, supersymmetry must be 
broken. As in the case of other global symmetries, supersymmetry breaking can be explicit or 
spontaneous. In the latter case, the analogs of the Nambu-Goldstone theorem and the Higgs-Kibble 
mechanism imply that if supersymmetry is global and is spontaneously broken (i.e., $Q_{SUSY}|0\rangle \neq 0$), 
then there must exist a massless fermion in the particle spectrum (to be called the Goldstino) and 
it will obey low energy theorems, the analog of Adler’s zeros for the pion. Since no particle with 
these properties is known to exist in nature, we have to assume that supersymmetry is either explicitly 
broken or, more elegantly, that supersymmetry is a local symmetry which is spontaneously broken. 
In the latter case, the analog of the Higgs-Kibble mechanism for local symmetries leads to the 
conclusion that the Goldstino becomes the longitudinal mode of the gauge particle corresponding 
to the local supersymmetry.^2 We will discuss this further in a subsequent section. 
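The equal-mass statement for the quadratic superpotential can be made explicit in a few lines. The sketch below assumes a particular normalization of the component Lagrangian (a factor $\tfrac{1}{2}$ in the fermion bilinear and a real, positive $m$); these choices are assumptions made for the example, and other normalizations rescale both masses together without changing their equality.

```latex
% Assumed normalization (not fixed by the text):
%   \mathcal{L} \supset -\tfrac{1}{2}\, W_{ab}\,\psi_a\psi_b + \text{h.c.} - |W_a|^2
W(\Phi) = m\Phi^2
\quad\Longrightarrow\quad
W_\phi = 2m\phi, \qquad W_{\phi\phi} = 2m .

% Scalar: V = |W_\phi|^2 = 4m^2 |\phi|^2, so the complex scalar has
m_\phi^2 = |W_{\phi\phi}|^2 = 4m^2 .

% Fermion: -\tfrac{1}{2} W_{\phi\phi}\,\psi\psi + \text{h.c.} = -\tfrac{1}{2}(2m)\,\psi\psi + \text{h.c.}, so
m_\psi = |W_{\phi\phi}| = 2m = m_\phi .
```

Both masses are controlled by the single quantity $W_{\phi\phi}$, which is why the degeneracy holds for any superpotential evaluated at a supersymmetric minimum and not only for this example.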

4 Some properties of exact N=1 supersymmetric theories 

In this section, we review some of the new properties of supersymmetric field theories. We will 
only state the results without giving the detailed mathematical proofs and we refer the reader to 
the original literature for the proofs. 


IVa. Non-renormalization theorem: 


It is well known that the only successful way to get useful information from ordinary local 
field theories is to work in the limit of weak couplings (typically, $\lambda^2/4\pi < 1$, where $\lambda$ is an 
interaction strength in the theory) and use the perturbative solutions of the field equations. The 
singularity of the products of field operators then leads to divergences when one goes beyond the 
tree level approximation (i.e. includes Feynman diagrams that contain loops). This leads to 
a division of field theories into two classes: one class, known as renormalizable field theories, is 
where the number of operators that emerge from loop calculations with divergences is finite to 
all orders in perturbation theory, and the other class, known as nonrenormalizable theories, 
consists of theories where the number of operators that are accompanied by infinities proliferates 
without limit. In physics applications (such as in particle physics) one chooses 
renormalizable field theories since in these theories, all infinities can be absorbed by redefining a 
finite number of observable parameters which then become the input parameters of the theory. 
They receive infinite renormalization in each loop order and if it is assumed that the theory is an 
“effective” theory valid below some scale, the infinity is replaced by a dependence on the cutoff 
scale $\Lambda$ below which the “effective” theory holds. Their values therefore cannot be chosen at the 
tree level. In such theories, however, there are other physical quantities that can be predicted in 
terms of these input parameters. This provides a way to experimentally test whether a given field 
theory describes natural phenomena. The most illustrious example of this kind of theory is 
quantum electrodynamics, where the input parameters are the electron mass and charge, and most 
recently the standard model of electroweak interactions, which has more input parameters. In these 
models, the higher order corrections have actually been experimentally measured (e.g. $g - 2$ of 
muons and electrons), supporting their status as fundamental theories of nature. 

^2 Recall that the Higgs-Kibble mechanism for ordinary continuous local symmetries refers to the phenomenon that if 
the symmetry is spontaneously broken, the associated massless Nambu-Goldstone boson becomes the longitudinal 
mode of the vector gauge boson (which in the beginning had only two states of polarization and was massless like 
the photon) to make it a massive gauge boson. 

While the above argument might suggest that nonrenormalizable theories have nothing 
to do with reality, this need not be the case. For instance, a nonrenormalizable 
theory may describe phenomena observed in nature, a well known example being 
Einstein’s theory of general relativity. In such cases, one may hope that they are “effective” field 
theories which are “low” energy manifestations of some deeper theory. One should not therefore 
discard a theory as uninteresting just because it is apparently nonrenormalizable. 

As hinted earlier, supersymmetric theories, by virtue of the extra symmetries in them, have a much 
“softer” divergence structure. A very important property of supersymmetric field theories is their 
ultraviolet behavior. One particular aspect of this softer divergence structure is the extremely 
important property that in the exact supersymmetric limit, the parameters of the superpotential 
$W(\Phi)$ do not receive any (finite or infinite) corrections from Feynman diagrams involving 
loops. In other words, if the value of a superpotential parameter is fixed at the classical level, it 
remains unchanged to all orders in perturbation theory. Furthermore, if $W(\Phi)$ is a polynomial 
of order less than or equal to three, then there is only one divergent counter term in the theory, 
which is absorbed by a redefinition of the kinetic energy term in the Lagrangian. Thus the theory is 
renormalizable in the conventional sense but with only one class of infinities. Note of course that if 
$W(\Phi)$ is at most a cubic polynomial in $\Phi$, then the corresponding field theory is a renormalizable 
field theory. This is known as the non-renormalization theorem [5]. This makes supersymmetric 
field theories far more appealing as theories of nature since the input parameters of the theory do 
not have to be redefined at every loop order and every time we encounter a threshold for new physics. 

This observation was recognized as the key to solving the Higgs mass problem of the standard 
model, as follows: the radiative corrections to the Higgs mass in the standard model are quadratically 
divergent and admit the Planck scale as a natural cutoff if there is no new physics up to 
that level. Since the Higgs mass is directly proportional to the mass of the $W$-boson, the loop 
corrections would push the $W$-boson mass to the Planck scale, destabilizing the standard model. 
On the other hand, in the supersymmetric version of the standard model (to be called MSSM), in 
the limit of exact supersymmetry, there are no radiative corrections to any mass parameter and 
therefore none to the Higgs boson mass, which is a parameter of the superpotential. Thus if the world 
were supersymmetric at all energy scales, the weak scale stability problem would be easily 
solved. However, since supersymmetry must be a broken symmetry, one has to ensure that the 
terms in the Hamiltonian that break supersymmetry do not spoil the non-renormalization theorem 
in a way that infinities creep into the self mass correction to the Higgs boson. This is precisely what 
happens if the effective supersymmetry breaking terms are “soft”. We will discuss the soft breaking of 
supersymmetry in the next section. 
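The cancellation of quadratic divergences described above can be illustrated at one loop. The sketch below is schematic and rests on assumptions not made in the text: a Dirac fermion $f$ and a complex scalar $S$ coupled to the Higgs field $H$ through $\mathcal{L} \supset -\lambda_f H \bar{f} f - \lambda_S |H|^2 |S|^2$, a hard momentum cutoff $\Lambda$, and only the leading $\Lambda^2$ pieces kept. The symbols $\lambda_f$, $\lambda_S$, $f$ and $S$ are introduced purely for the example.

```latex
% Leading cutoff dependence of the one-loop Higgs mass shift:
\delta m_H^2\Big|_{\text{fermion loop}} \simeq -\frac{|\lambda_f|^2}{8\pi^2}\,\Lambda^2 ,
\qquad
\delta m_H^2\Big|_{\text{scalar loop}} \simeq +\frac{\lambda_S}{16\pi^2}\,\Lambda^2 .

% Supersymmetry supplies two complex scalars for each Dirac fermion and
% relates the couplings, \lambda_S = |\lambda_f|^2, so that
\delta m_H^2 \simeq \left( 2\cdot\frac{\lambda_S}{16\pi^2} - \frac{|\lambda_f|^2}{8\pi^2} \right)\Lambda^2 = 0 .
```

The relative sign comes from the closed fermion loop, and the coupling relation is the one enforced by a common superpotential, so the $\Lambda^2$ terms cancel without tuning. Soft breaking terms leave this cancellation intact and contribute at most logarithmically.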

As a final addendum to the discussion on the non-renormalization theorem, we note that if $W(\Phi)$ 
is a polynomial involving fourth or higher order terms, then the corresponding field theory is 
nonrenormalizable. However, something very interesting happens in this case too: 
the parameters of the superpotential (all of them) remain unaffected by the loop corrections. 
However, there are still an infinite number of divergent counter terms, which in the superfield language 
involve terms necessarily of the form $\int d^4\theta$ (i.e. integrals over the full superspace). In the ordinary 
field theory language, all counter terms involve space-time derivatives of the fields. 


IVb. Effective field theory of composite states in N = 1 super-Yang-Mills theories: 


The fundamental theory of strong interactions is now known to be described by an unbroken 
non-abelian gauge theory known as Quantum Chromodynamics (QCD). It is an $SU(3)$ gauge theory 
under which quarks and antiquarks transform as $3$ and $\bar{3}$ dimensional representations. The 
baryons (protons and neutrons) and mesons observed in nature are assumed to be color (i.e. 
$SU(3)_c$) singlet bound states of this underlying theory, formed when the interactions become strong. 
In order to understand the dynamics of nuclear forces and the systems of baryons and mesons, 
one needs an effective low energy Lagrangian involving the mesons and the baryons. It has been 
argued by Weinberg and others that one way to arrive at such effective Lagrangians is to use the 
fact that the underlying QCD theory is invariant under the chiral symmetry, i.e. $SU(3)_L \times SU(3)_R$, 
and the spectrum of pseudoscalar mesons implies that this symmetry is spontaneously broken by 
the ground state of the theory. This assumption is very helpful in understanding the dynamics of 
mesons and baryons. While there are various plausible arguments that can be given in favor of 
the spontaneous breaking of chiral symmetry in QCD, it has never been satisfactorily established. 
Furthermore, the effective Lagrangians written down on the basis of symmetry arguments are at 
best approximate. The question then arises as to whether there exist any gauge theories that 
resemble QCD whose low energy Lagrangian can be written down exactly and, if so, whether questions 
such as whether spontaneous breaking of chiral symmetry actually occurs can be answered. 

It has been recently shown [6] that for N=1 supersymmetric Yang-Mills theories, an exact 
effective Lagrangian can be written down for the composite color singlet states. This raises the 
hope that approaching QCD from this angle may provide answers to some of the above questions 
and a better understanding of the dynamics of baryons and mesons. 

It may be instructive to give an example of this class of models. Consider an $N = 1$ $SU(N_c)$ 
gauge theory with left and right handed “quarks” (denoted by $Q$ and $Q^c$) transforming as $N_c$ and 
$N_c^*$ representations of the group. Consider $N_f$ families of such representations with $N_f < N_c$. This 
model has the symmetry $G = SU(N_f)_L \times SU(N_f)_R \times U(1)_A \times U(1)_B \times U(1)_R$ under which the 
matter fields transform as: $Q$: $(N_f, 1, 1, 1, (N_f - N_c)/N_f)$; $Q^c$: $(1, N_f^*, 1, -1, (N_f - N_c)/N_f)$. Here the first four 
groups are the familiar hadronic symmetry groups, i.e. chiral $SU(3)_L \times SU(3)_R$ (for three quark flavors), 
axial $U(1)$ and baryon number symmetry, whereas the last, $U(1)_R$, is typical of supersymmetry, under 
which not only the fields but also the supersymmetry coordinate $\theta$ transform nontrivially. It is the 
presence of this extra symmetry that helps to restrict the form of the composite field theory below 
the scale $\Lambda$ of the nonabelian gauge theory, defined as the mass below which the coupling becomes 
strong and composites form. The argument is that whatever the form of the composite field theory 
is, it must respect the above symmetry (in much the same fashion that one constructs the pion 
Lagrangian in ordinary QCD). Furthermore, since we are working in the supersymmetric limit, 
the form of the effective Lagrangian must be supersymmetric, i.e. it must have a superpotential 
that is a holomorphic function of the composite fields, which themselves must be $SU(N_c)$ invariant. 
These restrictions, i.e. invariance under $G$ and holomorphy, are satisfied for the case of $N_f < N_c$ by only 
one function of the only $SU(N_c)$ invariant composite field $Q^c Q$, i.e. 


$$W_{eff} = C_{N_c,N_f} \left( \frac{\Lambda^{3N_c - N_f}}{\det Q^c Q} \right)^{1/(N_c - N_f)} \qquad (9)$$


where $C_{N_c,N_f}$ are constants which depend on the details of renormalization[7]. It is very interesting
that the effective composite particle theory has a unique superpotential. There could of course be
an arbitrary form for the Kahler potential (i.e. terms of the form $\Phi^\dagger\Phi$). Since the ground state
of the theory is determined by the superpotential of the theory, this form is extremely helpful in
understanding the nature of the composite particle theory and has been used in the literature to
build composite models of quarks and leptons[8].

When one applies the above considerations to the case where $N_f = N_c$, one finds that the
resulting composite field theory breaks the chiral symmetry $SU(N_f)_L \times SU(N_f)_R$ down to its
vector sum group $SU(N_f)_V$, exactly as in the case of QCD. Thus one of the key properties of the
known strong interactions, i.e. spontaneous breakdown of chiral symmetries, is explained by this
specific supersymmetric model. It must be cautioned that we do not know whether these properties
will survive in the case of real QCD, which does not have supersymmetry.


IVc. Vanishing cosmological constant





Observations of various cosmological phenomena seem to indicate that the universe has
either zero or an extremely tiny cosmological constant. (The recent type Ia supernova observations
seem to imply a cosmological constant $\Lambda \sim (10^{-3}\ \mathrm{eV})^4$.) On the other hand, spontaneously broken
field theories, which seem to be necessary to describe weak and electromagnetic interactions, imply
a large cosmological constant of order $(246\ \mathrm{GeV})^4$, which is more than 50 orders of magnitude larger than the observed value.
Similarly, QCD phase transitions also seem to imply large cosmological constants. How does one
reconcile these theories with observations? Here supersymmetry provides a way out, although no
realistic construction of such theories has been made.
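The size of the mismatch quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the two scales are the round numbers used in the text, not fitted values.

```python
import math

EV_PER_GEV = 1.0e9  # 1 GeV = 10^9 eV


def vacuum_energy_ratio(high_scale_gev: float, low_scale_ev: float) -> float:
    """Ratio of two vacuum energy densities, each written as (mass scale)^4."""
    scale_ratio = high_scale_gev * EV_PER_GEV / low_scale_ev
    return scale_ratio ** 4


# Electroweak vev (246 GeV) versus the 10^-3 eV scale suggested by supernova data.
ratio = vacuum_energy_ratio(246.0, 1.0e-3)
orders_of_magnitude = math.log10(ratio)
print(f"mismatch: about 10^{orders_of_magnitude:.1f}")
```

Because the energy density scales as the fourth power of the mass scale, a ratio of scales of about $10^{14}$ becomes a mismatch of well over fifty orders of magnitude.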

The reason for this hope is that the cosmological constant in a field theory is the
value of the potential in the ground state. In supersymmetric field theories, the
potential is given by

$$V(\phi) = \sum_a |F_a|^2 + \frac{1}{2} \sum_\alpha D_\alpha^2 \qquad (10)$$

Note that both terms are non-negative, so in a supersymmetric ground state, where all the $F$ and $D$ terms vanish, the minimum of this potential
always corresponds to $V(\phi) = 0$. Thus unbroken supersymmetry implies a vanishing cosmological
constant almost trivially.

Of course, as we will discuss, in real life, supersymmetry is always broken; so the question then is 
how these properties get modified. As we will discuss, most of the power of the nonrenormalization 
theorem is maintained even in the presence of a certain class of supersymmetry breaking terms 
known as “soft” breaking terms. As far as the form of the low energy Lagrangian in the SUSY 
Yang-Mills theories goes, however, the situation is model dependent. If supersymmetry is broken 
spontaneously, then one can use the effective field theory approach. In other cases, the situation 
is unknown. Finally, the vanishing of cosmological constant can not be maintained in a natural 
manner once supersymmetry is broken. 


5 Local supersymmetry and Gravity 

Local symmetries seem to have played a key role in our understanding of the nature of the fundamental
interactions, e.g. $SU(2)_L \times U(1)_Y$ for the case of electroweak interactions and $SU(3)_c$ for
the case of strong interactions. There also exists a formulation of gravity as a local symmetry
of translation and Lorentz invariance. It is therefore natural to ask: if supersymmetry is to
play a role in understanding the puzzles of the standard model (as for instance the Higgs mass
alluded to before), could it be a local symmetry? It actually turns out that this question is more
than a matter of mere curiosity, as was realized in the late seventies [9].

As is well known, corresponding to every local symmetry there is a gauge field. In the case
of supersymmetry, the corresponding gauge field must have spin 3/2. One must therefore look for
a complete supersymmetric theory involving the spin 3/2 field. A very exciting aspect of local
supersymmetry is that the supermultiplet containing the above spin 3/2 field (to be called the gravitino
henceforth) contains the graviton (which has spin 2) as its superpartner. Of course one could
have thought that maybe, instead of a spin 2 particle, one could use a spin 1 boson to complete
the gravitino supermultiplet, and we would then not have gravity in the theory. To see why
one must necessarily have gravity in the gauge multiplet of local supersymmetry, note that the
supersymmetry algebra involves the translation operator on the right hand side. Therefore, if
we gauge supersymmetry, we must at least gauge translations. As we just noted, the gauge field
corresponding to local translations is the vierbein $e^a_\mu$. Thus local supersymmetry automatically
leads to a supersymmetric theory that includes gravity. In fact elementary arguments can be
used to infer that, unlike the usual gauge symmetries, the coupling constant corresponding to local
supersymmetry has mass dimension $-1$, which leads to a natural identification with $\sqrt{G} = \kappa = M_{Pl}^{-1}$.
To see this explicitly, let us first recall how the gauge covariant derivative is written down for an
ordinary local symmetry, say a $U(1)$ symmetry. Suppose there is a scalar field $\phi$ which has charge
one under the local symmetry. Then under the local symmetry transformation, the change in
the field $\phi$ is given by

$$\delta\phi = i\alpha(x)\phi \qquad (11)$$

where $\alpha$ is the infinitesimal parameter of the gauge transformation. The gauge covariant derivative
is then given by

$$D_\mu\phi(x) = \partial_\mu\phi - igA_\mu\phi \qquad (12)$$

Here $g$ is the gauge coupling constant. Note that since $\phi$, $A_\mu$ and $\partial_\mu$ all have mass dimension 1,
simple dimensional counting tells us that $g$ is dimensionless.

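Now let us apply similar considerations to local supersymmetry. Under a local supersymmetry
transformation, the members of a supermultiplet $(\phi, \psi, F)$ transform into each other. Schematically, we can
write this as follows:

$$\delta\phi = \epsilon\psi, \qquad \delta\psi = \epsilon\sigma^\mu\partial_\mu\phi + \epsilon F, \qquad \delta F = \epsilon\sigma^\mu\partial_\mu\psi \qquad (13)$$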


If we want to write the covariant derivative for the scalar field for the case of local supersymmetry,
then following the $U(1)$ case, we would write

$$D_\mu\phi = \partial_\mu\phi - i\kappa\,\psi\,\psi_\mu \qquad (14)$$

where $\psi_\mu$ is the gravitino field. Now in order to carry out the dimensional counting, note that all
fermions have mass dimension 3/2, from which it follows that $\kappa$, the gauge coupling of local supersymmetry,
has dimension of inverse mass, since both terms in the covariant derivative must have
the same mass dimension. And as we just noted, $\kappa = \sqrt{G}$ where $G$ is Newton’s constant.
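The dimensional counting used above for the ordinary gauge coupling and for the local supersymmetry coupling can be written out as a small bookkeeping exercise. The following is a minimal sketch; the dictionary keys and the helper name are hypothetical, and the only physics input is the standard mass dimensions of fields in four dimensions.

```python
from fractions import Fraction

# Mass dimensions in four spacetime dimensions (natural units).
DIM = {
    "scalar": Fraction(1),       # phi
    "vector": Fraction(1),       # A_mu
    "fermion": Fraction(3, 2),   # psi and the gravitino psi_mu
    "derivative": Fraction(1),   # partial_mu
}


def coupling_dimension(target_dim, field_dims):
    """Dimension a coupling must carry so that (coupling x fields) has target_dim."""
    return target_dim - sum(field_dims, Fraction(0))


# Every term in a covariant derivative of phi must match [partial_mu phi] = 2.
target = DIM["derivative"] + DIM["scalar"]

dim_g = coupling_dimension(target, [DIM["vector"], DIM["scalar"]])        # g A_mu phi
dim_kappa = coupling_dimension(target, [DIM["fermion"], DIM["fermion"]])  # kappa psi psi_mu

print("[g] =", dim_g, " [kappa] =", dim_kappa)
```

The gauge coupling comes out dimensionless while the gravitino coupling carries dimension of inverse mass, which is what allows its identification with the inverse Planck mass.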

One can write down the generalization of the Lagrangian invariant under local supersymmetry [9].
An important point to note is that local supersymmetry brings in a new massless particle with spin
3/2, the gravitino, which couples to all kinds of matter by virtue of the fact that it is the superpartner
of the graviton, and the graviton of course couples to all matter. However, since the gravitino coupling
is very weak, i.e. of order $\sqrt{G}$, its presence will not be felt very “strongly” by experiments. Nonetheless
its presence gives rise to many new kinds of physical as well as astrophysical phenomena and is a
very active area of investigation.

In the limit of exact supersymmetry, the gravitino, like the graviton, is massless. But it becomes
massive once supersymmetry is broken (see later).


6 Breaking of supersymmetry 

If supersymmetry were an exact symmetry of nature, we would have had equal mass superpartners
for all known quarks, leptons and gauge bosons, and they should have been discovered in
collider as well as many other experiments. However, no superpartner of any known particle has been
experimentally discovered. Therefore, like many other symmetries of physics, supersymmetry must
be broken.

There are two ways to break a symmetry of nature: (i) by explicit nonsymmetric terms in the 
Lagrangian, as in the case of quark masses breaking global SU(2) or SU(3) symmetries among 
baryons and mesons; or (ii) by the vacuum state not being invariant under the symmetry. The 
latter mechanism is known as the Nambu-Goldstone (NG) way of breaking the symmetry and is 
always accompanied by the prediction of a massless state in the spectrum of states in the theory 
with the same quantum numbers as the broken generator of the symmetry. Since the generator of 
supersymmetry is a fermionic operator, the NG particle in this case must be a fermion. It is called 
a goldstino in the literature. Just as in the case of the NG bosons, the goldstino (denoted by the 
symbol $\chi$) coupling to particles of a supermultiplet must obey fermionic shift invariance, i.e.
$\chi \to \chi + \epsilon$ where $\epsilon$ is a constant Grassmann number. If the scale of supersymmetry breaking is given
by $\Lambda$, then a typical coupling of the goldstino will be a derivative coupling suppressed by inverse powers of $\Lambda$.

Again since no massless fermion has been discovered to date with properties identifiable with 
the gravitino, it is believed that there must be some mechanism that makes the gravitino massive. 
One simple way out is to invoke the analog of the Higgs mechanism for local SUSY which can turn 
the Goldstino into the longitudinal mode of the massive spin 3/2 gravitino. This way both the 
massless gravitino and the massless goldstino disappear from the spectrum of the theory. We will 
assume this to be a feature of the supersymmetric theories that we will consider. 

Now let us note a very important feature that is analogous to the Higgs mechanism
of ordinary local symmetries. In the case of the usual local symmetries, say the $U(1)$ symmetry
discussed above, if there is spontaneous breaking of the $U(1)$ local symmetry, then the mass
of the gauge field becomes $g\langle\phi\rangle = gv$, where $v$ is the scale of the symmetry breaking. In exact
analogy, the mass of the gravitino will be $\kappa\Lambda^2$, since $\kappa$ is the analog of the gauge coupling $g$ and
$\Lambda^2$ is the analog of $\langle\phi\rangle$. Thus we see that if one could measure the mass of the gravitino, one
would be able to determine the scale of supersymmetry breaking.
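To get a feeling for the numbers, the relation between the gravitino mass and the supersymmetry breaking scale can be evaluated directly. This is an order-of-magnitude sketch only: the function names are hypothetical, the value of the Planck mass ($G^{-1/2} \approx 1.22 \times 10^{19}$ GeV) is an assumed input, and numerical factors of order one are ignored.

```python
import math

M_PLANCK_GEV = 1.22e19  # assumed value of M_Pl = G^(-1/2) in GeV


def gravitino_mass_gev(susy_breaking_scale_gev, m_planck_gev=M_PLANCK_GEV):
    """m_{3/2} ~ kappa * Lambda^2 with kappa = 1 / M_Pl (order of magnitude only)."""
    return susy_breaking_scale_gev ** 2 / m_planck_gev


def susy_breaking_scale_gev(m_gravitino_gev, m_planck_gev=M_PLANCK_GEV):
    """Invert the relation: Lambda ~ sqrt(m_{3/2} * M_Pl)."""
    return math.sqrt(m_gravitino_gev * m_planck_gev)


for scale in (1.0e5, 1.0e11):
    mass = gravitino_mass_gev(scale)
    print(f"Lambda = {scale:.1e} GeV  ->  m_3/2 ~ {mass:.2e} GeV")
```

A breaking scale near $10^{11}$ GeV gives a gravitino mass around the TeV scale, whereas a breaking scale near 100 TeV gives a gravitino mass around an eV.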

The practical way to implement spontaneous breaking of supersymmetry is to follow the analogy
with the spontaneous breaking of ordinary bosonic symmetries. Suppose one wants to break a $U(1)$
symmetry spontaneously. The way to do this is to consider a bosonic field $\phi$ (or a composite bilinear
of fermions) which carries nonzero charge under the $U(1)$ symmetry and arrange the theory such
that $\langle\phi\rangle \neq 0$. In the case of supersymmetry, we have to look for bosonic fields that carry the
supersymmetry charge (we need bosonic fields since we do not want to break Lorentz invariance)
and give them nonzero vevs. Two examples of bosonic fields with nonzero SUSY charge are
the F-component of a chiral superfield and the D-component of a gauge supermultiplet. If
one chooses a superpotential such that either of these fields can acquire a vev, then supersymmetry
will be spontaneously broken. Alternatively, if one adds a linear D-term to the theory, that will
lead to a nonvanishing value for the D-term in the ground state and hence to the breakdown of
supersymmetry.

A simple example of F-type supersymmetry breaking is given by the choice of a superpotential
of the form:

$$W = \mu^2 z \qquad (15)$$

Note that since the F-term is obtained by taking the derivative $\partial W/\partial z$, we see that $F_z = \mu^2 \neq 0$.
Thus supersymmetry is broken. In this case the Goldstino field is the fermionic component of the
superfield $z$, i.e. $\psi_z$, which is easily seen to be massless.
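As a brief worked check of this example, the steps can be written out explicitly. The sketch assumes a canonical kinetic term for $z$ and no gauge interactions, so that only the F-term contributes to the potential; signs and complex conjugations on $F_z$ are convention dependent and are suppressed.

```latex
\begin{align*}
F_z &= \frac{\partial W}{\partial z} = \mu^2, \\
V &= |F_z|^2 = \mu^4 > 0 \quad \text{for all } z, \\
\frac{\partial^2 W}{\partial z^2} &= 0 \;\Rightarrow\; m_{\psi_z} = 0 .
\end{align*}
```

Since the potential is strictly positive everywhere, there is no supersymmetric ground state, and the vanishing second derivative of $W$ is what makes the fermion $\psi_z$ massless, as expected for the Goldstino.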

Let us now focus on the other way of breaking supersymmetry, i.e. by adding explicit supersymmetry
breaking terms to the Lagrangian. If we want to maintain the good divergence structure of
the theory subsequent to the addition of the SUSY breaking terms, new constraints must be satisfied
by those terms. We will call the allowed terms of this restricted variety the soft SUSY breaking
terms. Let us enumerate them below:

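1. $m_a^2 \phi_a^\dagger \phi_a$, where $\phi_a$ is the bosonic component of the chiral superfield $\Phi_a$;

2. $m \int d^2\theta\, \theta^2 \left( A W^{(3)}(\Phi) + B W^{(2)}(\Phi) \right) + h.c.$, where $W^{(2)}$ and $W^{(3)}$ are the second and third
order polynomials in the superpotential;

3. $\frac{1}{2} m_\lambda \lambda^T C^{-1} \lambda$, where $\lambda$ is the gaugino field.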

It can be shown that the soft breaking terms only introduce finite loop corrections to the parameters 
of the superpotential. Since all the soft breaking terms require couplings with positive mass 
dimension, the loop corrections to the Higgs mass will depend on this mass and we must keep 
these masses less than a TeV so that the weak scale remains stabilized. This has the interesting 
implication that superpartners of the known particles must have masses in the range of 100 GeV 
to a TeV and are accessible to the ongoing and proposed collider experiments such as the Tevatron 
at Fermilab and LHC at CERN. For a recent survey of the experimental situation, see Ref. [10]. 





7 Origin of supersymmetry breaking 

An extremely important question in the application of supersymmetry to particle physics focusses 
on the origin of the soft breaking terms. Their detailed pattern depends sensitively on the par- 
ticular way that supersymmetry is broken. Thus by experimentally unravelling the pattern of the 
superpartner masses (i.e. squarks, sleptons, gauginos etc), one can throw light on the nature and 
the origin of supersymmetry breaking. 

The usual strategy employed in implementing supersymmetry breaking is to assume that SUSY 
is broken in a hidden sector that does not involve any of the matter or forces of the standard model 
(which we call the visible sector) and this SUSY breaking is transmitted to the visible sector via 
some intermediary, to be called the messenger sector.

There are generally two ways to set up the hidden sector: a less ambitious one where one writes
an effective Lagrangian (or superpotential) in terms of a certain set of hidden sector fields that
lead to supersymmetry breaking in the ground state and another more ambitious one where the 
SUSY breaking arises from the dynamics of the hidden sector interactions. In implementing the 
second method, one uses the fact that in nonabelian N=1 gauge models with matter, the composite 
field theory can be written down exactly in many cases, as discussed above and if the resulting 
field theory has a vacuum that breaks supersymmetry, that is taken as a prototype hidden sector. 

For our purpose we will use the simpler schemes of the first kind. As far as the messenger
sector goes, there are three possibilities, as already referred to earlier: (i) gravity mediated [11]; (ii)
gauge mediated [12] and (iii) anomalous $U(1)$ mediated [13]. Below we give examples of each class.

This section is made somewhat more technical than the rest of the article on purpose so that
anyone “wishing to get his/her feet wet” in the supersymmetry game can actually proceed to that
endeavour right after reading this article.

VIIa. Gravity mediated SUSY breaking

The scenario that uses gravity to transmit the supersymmetry breaking is one of the earliest 
hidden sector scenarios for SUSY breaking and forms much of the basis for the discussion in 
current supersymmetry phenomenology. In order to discuss these models one needs to know the 
supergravity couplings to matter. An essential feature of supergravity coupling is the generalized 
kinetic energy term in gravity coupled theories called the Kahler potential, K. We will use a 
function G of the Kahler potential. It is a hermitean operator and a function of the matter fields 
in the theory and their complex conjugates. The effect of supergravity coupling in the matter and 
the gauge sector of the theory is given in terms of G and its derivatives as follows: 

$$L(z) = G_{zz^*} |\partial_\mu z|^2 + e^{-G} \left[ G_z G_{z^*} G^{-1}_{zz^*} + 3 \right] \qquad (16)$$

where $z$ is the bosonic component of a typical chiral field (e.g. we would have $z = \tilde q$ etc.) and
$G = 3\ln(-K/3) - \ln|W(z)|^2$. A subscript denotes a derivative with respect to that field. The
simplest choice for the Kahler potential $K$ is $K = -3e^{-|z|^2/(3M_{Pl}^2)}$, which normalizes the kinetic energy
term properly. Using this, one can write the effective potential for supergravity coupled theories
as:

$$U(z, z^*) = e^{|z|^2/M_{Pl}^2} \left[ \left| W_z + \frac{z^*}{M_{Pl}^2} W \right|^2 - \frac{3}{M_{Pl}^2} |W|^2 \right] \qquad (17)$$

The gravitino mass is given in terms of $G$ as (in units with $M_{Pl} = 1$):

$$m_{3/2} = e^{-G/2} \qquad (18)$$

A popular scenario suggested by Polonyi is based upon the following hidden sector consisting of a
gauge singlet field, denoted by $z$, and the superpotential $W_H$ given by:

$$W_H = \mu^2 (z + \beta) \qquad (19)$$





where $\mu$ and $\beta$ are mass parameters to be fixed by various physical considerations. It is clear
that this superpotential leads to an F-term that is always non-vanishing and therefore breaks
supersymmetry. Requiring the cosmological constant to vanish fixes $\beta = (2 - \sqrt{3})M_{Pl}$. Given this
potential and the choice of the Kahler potential discussed earlier, supergravity calculus predicts
a universal soft breaking parameter $m_0$ given by $m_0 \sim \mu^2/M_{Pl}$. Requiring $m_0$ to be in the TeV
range implies that $\mu \sim 10^{11}$ GeV. The complete potential to zeroth order in $M_{Pl}^{-1}$ in this model is
given by:

$$V(\phi_a) = \left[ \sum_a \left| \frac{\partial W}{\partial \phi_a} \right|^2 + V_D \right] + \left[ m_0^2 \sum_a \phi_a^* \phi_a + m_0 \left( A W^{(3)} + B W^{(2)} + h.c. \right) \right] \qquad (20)$$

where $W^{(3)}$ and $W^{(2)}$ denote the dimension three and two terms in the superpotential respectively. The
values of the parameters $A$ and $B$ at $M_{Pl}$ are related to each other in this example as $B = A - 1$.
The gaugino masses in these models arise out of a separate term in the Lagrangian depending on
a new function $f(z)$ of the hidden sector singlet fields $z$:

$$\int d^4x\, d^2\theta\, f(z) W^\alpha W_\alpha + h.c. \qquad (21)$$

For a suitable choice of $f(z)$, the gaugino masses come out to be of order $m_{3/2}$, i.e. of
order $m_0$, i.e. the electroweak scale. Furthermore, in order to avoid undesirable color and electric
charge breaking by the SUSY models, one must require that $m_0^2 > 0$.
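The condition $\beta = (2 - \sqrt{3})M_{Pl}$ quoted for the Polonyi superpotential can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard supergravity potential $V = e^{|z|^2}\left(|W_z + z^* W|^2 - 3|W|^2\right)$ for a canonically normalized field, restricted to real $z$, in units where the Planck-scale mass appearing in the potential is set to 1. The location of the minimum, $z = \sqrt{3} - 1$, is an input taken from the standard treatment of this model and is not stated in the text; the function name is hypothetical.

```python
import math


def polonyi_potential(z, beta, mu=1.0):
    """Supergravity potential for W = mu^2 (z + beta), real z, Planck units.

    V = exp(z^2) * ((W_z + z W)^2 - 3 W^2), assuming a canonical kinetic term.
    """
    w = mu ** 2 * (z + beta)
    dw = mu ** 2 * (1.0 + z * (z + beta))  # W_z + z* W for real z
    return math.exp(z * z) * (dw * dw - 3.0 * w * w)


beta = 2.0 - math.sqrt(3.0)   # value quoted in the text
z_min = math.sqrt(3.0) - 1.0  # assumed location of the minimum

# Value, slope and curvature of V at the candidate minimum (finite differences).
h = 1.0e-4
v_min = polonyi_potential(z_min, beta)
v_plus = polonyi_potential(z_min + h, beta)
v_minus = polonyi_potential(z_min - h, beta)
slope = (v_plus - v_minus) / (2.0 * h)
curvature = (v_plus - 2.0 * v_min + v_minus) / h ** 2

print(f"V = {v_min:.2e}, dV/dz = {slope:.2e}, d2V/dz2 = {curvature:.2f}")
```

With this choice of $\beta$ the potential vanishes at a stationary point with positive curvature, while the F-term there is nonzero, so supersymmetry is broken with zero vacuum energy.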

It is important to point out that the super-Higgs mechanism operates at the Planck scale.
Therefore all parameters derived at the tree level of this model need to be extrapolated to the
electroweak scale. After the soft-breaking Lagrangian is extrapolated to the weak scale, it will
look like:


$$\mathcal{L}^{SB} = \sum_i m_i^2 \phi_i^\dagger \phi_i + m \sum_{i,j,k} A_{ijk} \phi_i \phi_j \phi_k + \ldots \qquad (22)$$

We will see later that this extrapolation arising from the radiative corrections to the theory even- 
tually leads to an understanding of the origin of the weak scale. 

VIIb. Gauge mediated SUSY breaking [12]

This mechanism for SUSY breaking has recently been quite popular in the literature and
involves different hidden as well as messenger sectors. In particular, it proposes to use the known
gauge forces as the messengers of supersymmetry breaking. As an example, consider a unified
hidden messenger sector toy model of the following kind, consisting of the fields $\Phi_{1,2}$ and $\bar\Phi_{1,2}$,
which have standard model gauge quantum numbers, and a singlet field $S$, with the following
superpotential:

$$W = \lambda S (M_0^2 - \bar\Phi_1 \Phi_1) + M_1 (\bar\Phi_1 \Phi_2 + \Phi_1 \bar\Phi_2) + M_2 \bar\Phi_1 \Phi_1 \qquad (23)$$

The F-terms of this model are given by:

$$F_S = \lambda (M_0^2 - \bar\Phi_1 \Phi_1); \qquad F_{\Phi_2} = M_1 \bar\Phi_1; \qquad F_{\bar\Phi_2} = M_1 \Phi_1 \qquad (24)$$

It is easy to see from the above equations that for $M_1 > M_0, M_2$, the minimum of the potential
corresponds to all $\Phi$'s having zero vev and $F_S = \lambda M_0^2$, thus breaking supersymmetry. The same
superpotential responsible for SUSY breaking also transmits the SUSY breaking information to
the visible sector. While the spirit of this model is similar to the original papers on the subject,
this unified construction is different and has its own characteristic predictions.





The SUSY breaking is transmitted to the visible sector via one and two loop diagrams. The
gaugino masses arise from the one loop diagram where a gaugino decomposes into the messenger
scalar $\phi_1$ and its fermionic partner, and the loop is completed as $\phi_1$ and $\bar\phi_1$ mix through the $F_S$ SUSY breaking term and the
fermionic partners mix via the mass term $M_2$. The squark and slepton masses arise from the two
loop diagram where the squark-squark-gauge boson-gauge boson coupling begins the first loop and
one of the gauge bosons couples to the two $\phi_1$'s and the other to the two $\bar\phi_1$'s, which in turn mix via
the F-term for $S$ to complete the two loop diagram. This is only one typical diagram and there
are many more which contribute at the same order. It is then easy to see that their magnitudes
are given by:


$$m_\lambda \simeq \frac{\alpha}{4\pi} \frac{\langle F_S \rangle}{M_2}; \qquad m^2_{\tilde q} \simeq \left( \frac{\alpha}{4\pi} \right)^2 \left( \frac{\langle F_S \rangle}{M_2} \right)^2 \qquad (25)$$


The first point to notice is that the gaugino and squark masses are roughly of the same order,
and requiring the squark masses to be around 100 GeV, we get $F_S/M_2 \sim 100$ TeV. Of course,
$\langle F_S \rangle$ and $M_2$ need not be of the same order, in which case the numerics will be different.

A distinguishing feature of this approach is that due to the low scale of SUSY breaking, the
gravitino mass is always in the milli-eV to kilo-eV range and the gravitino is therefore always the LSP. Thus
these models cannot provide a candidate for the cold dark matter of the universe, which was always
considered an added attraction of supersymmetric models. The attractive property of these models
is that they lead naturally to near degeneracy of the squarks and sleptons, thus alleviating the FCNC
problem of the MSSM, and they have therefore been the focus of intense scrutiny during the past year.
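The estimate of the messenger scale quoted above for 100 GeV superpartners follows from inverting a one-loop relation of the form $m \sim (\alpha/4\pi)\, F_S/M_2$. The sketch below does this for a few representative gauge couplings. The function name is hypothetical, the coupling values are round illustrative numbers, and factors of order one (group theory factors, number of messengers) are ignored.

```python
import math


def messenger_scale_tev(sparticle_mass_gev, alpha):
    """Invert m ~ (alpha / 4 pi) * F_S / M_2 for the ratio F_S / M_2, in TeV."""
    loop_factor = alpha / (4.0 * math.pi)
    return sparticle_mass_gev / loop_factor / 1000.0


# Illustrative gauge couplings near the weak scale (assumed round values).
couplings = {
    "electromagnetic, alpha ~ 1/128": 1.0 / 128.0,
    "weak, alpha_2 ~ 1/30": 1.0 / 30.0,
    "strong, alpha_3 ~ 0.12": 0.12,
}

for label, alpha in couplings.items():
    scale = messenger_scale_tev(100.0, alpha)
    print(f"{label}: F_S / M_2 ~ {scale:.0f} TeV")
```

Depending on which coupling dominates, the ratio comes out between roughly ten and a few hundred TeV, consistent with the order of magnitude quoted in the text.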

This class of models, however, suffers from the fact that the messenger sector is too ad hoc and
practically an arbitrary number of models can be constructed just by varying the messenger sector.
Nevertheless it is a very interesting class of models and should be tested experimentally.


VIIc. Anomalous U(1) mediated supersymmetry breaking

This class of models owes its origin to string models, which after compactification can
often leave anomalous $U(1)$ gauge groups[14]. Since the original string model is anomaly free, the
anomaly cancellation must take place via the Green-Schwarz mechanism as follows. Consider a
$U(1)$ gauge theory with a single chiral fermion that carries a $U(1)$ quantum number. This theory
has an anomaly. Therefore, under a gauge transformation with parameter $\alpha$, the low energy Lagrangian is not
invariant and changes as:

$$L \to L + \frac{\alpha}{4\pi} F\tilde F \qquad (26)$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and $\tilde F$ is the dual of $F_{\mu\nu}$. The last term is the anomaly term. To
restore gauge invariance, we can add to the Lagrangian the Green-Schwarz term and rewrite the
effective Lagrangian as

$$L' = L + \frac{a}{M} F\tilde F \qquad (27)$$

where under the gauge transformation $a \to a - M\alpha/4\pi$. In order to obtain the supersymmetric
version of the Green-Schwarz term, we have to add a dilaton field to the axion $a$ to make a complex
chiral superfield. Let us denote the dilaton field by $\phi$ and the complex chiral field containing it as
$S = \phi + ia$. The gauge invariant action containing $S$ and the gauge superfield $V$ has terms of
the following form:

$$A = \int d^4\theta\, \ln(S + S^\dagger - V) + \int d^2\theta\, S W^\alpha W_\alpha + \text{matter field parts} \qquad (28)$$

It is clear that in order to get a gauge field Lagrangian out of this, the dilaton $S$ must have a
vev, with the identification $\langle S \rangle = g^{-2}$, and it is a fundamental unanswered question in
superstring theory as to how this vev arises. If we assume that this vev has been generated, then
one can see that the first term in the Lagrangian, when expanded around the dilaton vev, leads
to a term linear in $V$, which is nothing but a linear Fayet-Iliopoulos D-term. Combining this
with other matter field terms with non-zero $U(1)$ charge, one can then write the D-term of the
Lagrangian. As an example that can lead to realistic model building, we take two fields $\phi_\pm$ with equal
and opposite $U(1)$ charges $\pm 1$ in addition to the squark and slepton fields. The D-term can then
be written as:

$$V_D = \frac{g^2}{2} \left( n_q |\tilde Q|^2 + n_l |\tilde L|^2 + |\phi_+|^2 - |\phi_-|^2 + \zeta \right)^2 \qquad (29)$$

This term when minimized does not break supersymmetry. However, if we add to the superpotential
a term of the form $m\phi_+\phi_-$, then there is another term in the low energy effective potential, which
leads to the combined potential:

$$V = V_D + m^2 (|\phi_+|^2 + |\phi_-|^2) \qquad (30)$$

The minimum of this potential corresponds to:

$$\langle \phi_+ \rangle = 0; \qquad \langle \phi_- \rangle = \left( \zeta - \frac{m^2}{g^2} \right)^{1/2}; \qquad F_{\phi_+} \simeq m M_{Pl}\, \epsilon \qquad (31)$$

where we have assumed that $\zeta = \epsilon^2 M_{Pl}^2$. This then leads to nonzero squark masses $m^2_{\tilde Q} \simeq n_q m^2$.
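The minimization described above can be verified numerically along the direction $\phi_+ = 0$ with vanishing squark and slepton vevs. The sketch below assumes a D-term of the form $\frac{g^2}{2}(\ldots + |\phi_+|^2 - |\phi_-|^2 + \zeta)^2$ plus the mass term $m^2|\phi_-|^2$, and uses arbitrary illustrative parameter values; the function names and the ternary search routine are hypothetical helpers, not part of any library.

```python
import math


def potential(x_minus, g, m, zeta):
    """V along phi_+ = 0 with zero squark/slepton vevs; x_minus = |phi_-|^2."""
    return 0.5 * g ** 2 * (zeta - x_minus) ** 2 + m ** 2 * x_minus


def argmin_convex(f, lo, hi, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


g, m, zeta = 0.7, 0.1, 1.0  # illustrative values in arbitrary units

x_numeric = argmin_convex(lambda x: potential(x, g, m, zeta), 0.0, 2.0 * zeta)
x_analytic = zeta - m ** 2 / g ** 2   # <phi_->^2 from the analytic minimization

d_term = g ** 2 * (zeta - x_numeric)  # g^2 times the D-term vev; equals m^2
f_plus = m * math.sqrt(x_numeric)     # F-term of phi_+, from W = m phi_+ phi_-

print(f"|phi_-|^2: numeric {x_numeric:.6f}, analytic {x_analytic:.6f}")
print(f"squark mass^2 per unit charge: {d_term:.6f} (m^2 = {m ** 2:.6f})")
```

At the minimum both the D-term and the F-term of $\phi_+$ are nonzero, and a scalar of $U(1)$ charge $n_q$ receives a mass-squared $n_q m^2$ from the D-term, in line with the squark mass quoted in the text.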

Thus supersymmetry is broken and superpartners pick up mass. In the simplest model it turns 
out that the gaugino masses may be too low and one must seek ways around this. However, the 
A and B-terms are also likely to be small in this model and that may provide certain advantages. 
On the whole, this approach has great potential for model building and has not been thoroughly 
exploited[13]- for instance, it can be used to solve the FCNC problems, SUSY CP problem, to 
study the fermion mass hierarchies etc. It is beyond the scope of this review to enter into those 
areas. 

8 Supersymmetric standard model 

We will now apply the discussions of the previous sections to construct the supersymmetric extension
of the standard model so that the goal of stabilizing the Higgs mass is indeed realized in practice. 
Before we do that let us briefly summarize the salient features of the successful standard model. 

The standard model of electroweak and strong interactions based on the local symmetry group
$SU(3)_c \times SU(2)_L \times U(1)_Y$ is one of the major milestones of twentieth century theoretical physics.
It not only provides a complete description of all known phenomena in particle physics in terms of 
a renormalizable field theory but it reaffirms the crucial role played by the local symmetry in the 
domain of weak and strong interactions and extends its success from Quantum Electrodynamics 
to Quantum Flavor Dynamics for weak interactions and Quantum Chromodynamics for strong 
interactions. The field content of the standard model is given in table I for the sake of completeness 
of this review. 

Table I. The particle content of the standard model.

Field                                        gauge transformation
Quarks $Q_L$                                 (3, 2, 1/3)
Righthanded up quarks $u_R$                  (3, 1, 4/3)
Righthanded down quarks $d_R$                (3, 1, -2/3)
Lefthanded leptons $L$                       (1, 2, -1)
Righthanded leptons $e_R$                    (1, 1, -2)
Higgs boson $H$                              (1, 2, +1)
Color gauge fields $G_a$                     (8, 1, 0)
Weak gauge fields $W^\pm$, $Z$, $\gamma$     (1, 3+1, 0)





After spontaneous symmetry breaking, the $W^\pm$ and the $Z$ fields acquire mass, as do the fermions.
The form of the neutral current interactions as well as the quantum corrections to this theory have 
been confirmed. The only particle of the standard model yet to be discovered is the Higgs boson and 
supersymmetry provides one class of extensions of the model where the Higgs boson has definite 
and testable properties. 

There are however many unsolved problems in the standard model that for a long time had 
made many people suspect that there is an arena (or many arenas) of rich new physics beyond 
the standard model that would not only solve the puzzles of the standard model but also provide 
deeper insight into the prospects for unification of matter and forces. 

The two puzzles that supersymmetry addresses successfully are the so called Higgs mass problem 
discussed already and the problem of understanding the gauge symmetry breaking. To discuss this,
we start with a presentation of the particle contents of the model in detail as given in the table 
below. 

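Table II. The particle content of the supersymmetric standard model. For matter and
Higgs fields, we have shown the left-chiral fields only. The right-chiral fields will have a conjugate
representation under the gauge group.

Superfield                 Particles               Superpartners                          gauge transformation
Quarks $Q$                 $(u, d)$                $(\tilde u, \tilde d)$                 (3, 2, 1/3)
Antiquarks $U^c$           $u^c$                   $\tilde u^c$                           (3*, 1, -4/3)
Antiquarks $D^c$           $d^c$                   $\tilde d^c$                           (3*, 1, 2/3)
Leptons $L$                $(\nu, e)$              $(\tilde\nu, \tilde e)$                (1, 2, -1)
Antileptons $E^c$          $e^c$                   $\tilde e^c$                           (1, 1, 2)
Higgs Boson $H_u$          $(H_u^+, H_u^0)$        $(\tilde H_u^+, \tilde H_u^0)$         (1, 2, +1)
Higgs Boson $H_d$          $(H_d^0, H_d^-)$        $(\tilde H_d^0, \tilde H_d^-)$         (1, 2, -1)
Color Gauge Fields $G_a$   $G_a$                   $\tilde G_a$                           (8, 1, 0)
Weak Gauge Fields          $W^\pm$, $Z$            $\tilde W^\pm$, $\tilde Z$
Photon                     $\gamma$                $\tilde\gamma$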


First note that an important difference between the standard model and its supersymmetric
version, apart from the presence of the superpartners, is the presence of a second Higgs doublet.
This is required both to give masses to quarks and leptons and to make the model anomaly
free. The gauge interaction part of the model is easily written down following the rules laid out in
[2]. In the weak eigenstate basis, the weak interaction Lagrangian for the quarks and leptons is exactly
the same as in the standard model. As far as the weak interactions of the squarks and the sleptons
are concerned, the generation mixing angles are very different from those in the corresponding
fermion sector due to supersymmetry breaking. This has the phenomenological implication that
the gaugino-fermion-sfermion interaction changes generation, leading to potentially large flavor
changing neutral current effects such as $K^0$-$\bar K^0$ mixing, $\mu \to e\gamma$ decay etc., unless the sfermions
of different generations are chosen to be very close in mass.

Let us now proceed to a discussion of the superpotential of the model. It consists of two parts: 

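$$W = W_1 + W_2, \qquad (32)$$

where

$$W_1 = h_\ell^{ij} E^c_i L_j H_d + h_d^{ij} Q_i D^c_j H_d + h_u^{ij} Q_i U^c_j H_u + \mu H_u H_d \qquad (33)$$

$$W_2 = \lambda_{ijk} L_i L_j E^c_k + \lambda'_{ijk} L_i Q_j D^c_k + \lambda''_{ijk} U^c_i D^c_j D^c_k,$$

$i, j, k$ being generation indices. We first note that the terms in $W_1$ conserve baryon and lepton
number whereas those in $W_2$ do not. The latter are known as the R-parity breaking terms, where
R-parity is defined as

$$R = (-1)^{3(B-L)+2S} \qquad (34)$$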






where $S$ is the spin of the particle. It is interesting to note that the R-parity symmetry defined
above assigns even R-parity to the known particles of the standard model and odd R-parity to their
superpartners. This has the important experimental implication that for theories that conserve
R-parity, the superpartners of the particles of the standard model must always be produced in
pairs and the lightest superpartner must be a stable particle. This is generally called the LSP. If
the LSP turns out to be neutral, it can be thought of as the dark matter particle of the universe.
In the presence of R-parity breaking interactions, however, the LSP decays and the model has no
cold dark matter candidate.
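The statement that ordinary particles are R-even and their superpartners R-odd follows directly from the definition $R = (-1)^{3(B-L)+2S}$ and can be tabulated in a few lines. The sketch below is illustrative; the function name and the list of states are chosen for the example, with the standard baryon number, lepton number and spin assignments.

```python
from fractions import Fraction


def r_parity(baryon_number, lepton_number, spin):
    """R = (-1)^(3(B - L) + 2S); the exponent is an integer for physical states."""
    exponent = 3 * (baryon_number - lepton_number) + 2 * spin
    if exponent.denominator != 1:
        raise ValueError("3(B - L) + 2S must be an integer")
    return 1 if exponent.numerator % 2 == 0 else -1


ZERO, HALF, ONE, THIRD = Fraction(0), Fraction(1, 2), Fraction(1), Fraction(1, 3)

# (B, L, S) for some particles and their superpartners.
states = {
    "quark": (THIRD, ZERO, HALF),
    "squark": (THIRD, ZERO, ZERO),
    "electron": (ZERO, ONE, HALF),
    "selectron": (ZERO, ONE, ZERO),
    "photon": (ZERO, ZERO, ONE),
    "photino": (ZERO, ZERO, HALF),
    "Higgs": (ZERO, ZERO, ZERO),
    "Higgsino": (ZERO, ZERO, HALF),
}

for name, (b, l, s) in states.items():
    print(f"{name:10s} R = {r_parity(b, l, s):+d}")
```

Since a superpartner differs from its partner only by half a unit of spin, the exponent always shifts by one, which is why the two members of each pair carry opposite R-parity.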

We now embed this model into the minimal $N = 1$ supergravity model with a Polonyi type
hidden sector. As a result, we get a mass splitting of the squarks and sleptons from the quarks
and the leptons. We also get trilinear scalar interactions among the sfermions as follows:

$$\mathcal{L}^{SB} = m_{3/2} \left[ A_{e,ab} \tilde e^c_a \tilde L_b H_d + A_{d,ab} \tilde Q_a H_d \tilde d^c_b + A_{u,ab} \tilde Q_a H_u \tilde u^c_b \right] + \sum_{i = \text{scalars}} m_i^2 \phi_i^\dagger \phi_i + \ldots \qquad (35)$$

There will also be the corresponding terms involving the R-parity breaking interactions, which we 
omit here for simplicity. 

The solution of the Higgs mass problem in the MSSM comes about as follows: if its tree level
value is chosen to be of the order of the electroweak scale, any radiative correction to it will
only induce terms set by the supersymmetry breaking scale, suppressed by a loop factor. By choosing the supersymmetry breaking scale in the
TeV range as we did above, we can guarantee that to all orders in perturbation theory the Higgs
mass remains stable and near the weak scale. Crucial to this result is the
nonrenormalization theorem[5] discussed earlier.

The way supersymmetry solves the problem of weak symmetry breaking is as follows. In order 
to have a realistic model, first a mechanism for supersymmetry breaking following one of the three 
ways already described is introduced. As is clear from the discussion of supersymmetry breaking, 
the hidden sector supersymmetry breaking manifests itself as positive (mass)$^2$ terms for all scalar
fields at the scale of SUSY breaking. (Some or all of them may also be equal depending on other
details.) In order to study the theory at the weak scale, one must extrapolate all these parameters
using the renormalization group equations. The degree of extrapolation will of course depend
on the strength of the gauge and the Yukawa couplings of the various fields. In particular,
$m^2_{H_u}$ will have a strong extrapolation proportional to $h_t^2$ since $H_u$ couples to the top quark.
Since $h_t \simeq 1$, this can make $m^2_{H_u}(M_Z) < 0$, leading to spontaneous breakdown of the electroweak
symmetry. An approximate solution of the renormalization group equations gives

< (Mri = mi. iAs^sr) - (36) 

This is a very attractive feature of supersymmetric theories. 

Once the symmetry breaking has been implemented, constraints of supersymmetry provide one
prediction that can distinguish the MSSM from the standard model, i.e. the mass of the lightest
Higgs boson. It can be shown that the lightest Higgs boson mass-square is going to be of order
of the square of the weak scale. In fact, denoting the vev's of the two Higgs doublets as
$\langle H_u^0 \rangle = v_u$ and $\langle H_d^0 \rangle = v_d$, one can write:


(37) 

Defining $v_u/v_d = \tan\beta$, we can rewrite the above light Higgs mass formula as $m_h^2 = M_Z^2 \cos^2 2\beta$,
which implies that the tree level mass of the lightest Higgs boson is less than the Z mass. Once
radiative corrections are taken into account, $m_h$ increases above $M_Z$. However, it is now well
established that in a large class of supersymmetric models (which do not differ too much from the
MSSM), the Higgs mass is less than 130 GeV or so.
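As a quick numerical illustration of the tree-level bound, the following sketch evaluates m_h = M_Z |cos 2β| for a few values of tan β. The value used for M_Z and the sample values of tan β are assumptions made for illustration only; radiative corrections, which the text notes raise m_h above M_Z, are not included.

```python
import math

M_Z = 91.19  # GeV, approximate measured Z boson mass (assumed input)


def tree_level_higgs_mass(tan_beta, m_z=M_Z):
    """Tree-level lightest Higgs mass m_h = M_Z * |cos(2*beta)| in GeV."""
    beta = math.atan(tan_beta)
    return m_z * abs(math.cos(2.0 * beta))


for tan_beta in (1.0, 2.0, 10.0, 50.0):
    mass = tree_level_higgs_mass(tan_beta)
    print(f"tan(beta) = {tan_beta:5.1f}  ->  m_h(tree) = {mass:6.2f} GeV")
```

The mass vanishes at tan β = 1 and approaches, but never exceeds, M_Z as tan β grows.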





9 Why go beyond the MSSM?

Even though the MSSM solves two outstanding problems of the standard model, i.e. the stabilization
of the Higgs mass and the breaking of the electroweak symmetry, it brings in a lot of
undesirable consequences. They are:

(a) Presence of arbitrary baryon and lepton number violating couplings, i.e. the $\lambda$, $\lambda'$ and
$\lambda''$ couplings described above. In fact a combination of $\lambda'$ and $\lambda''$ couplings leads to proton decay.
Present lower limits on the proton lifetime then imply that $\lambda'\lambda'' < 10^{-25}$ for squark masses of order
of a TeV. Recall that a very attractive feature of the standard model is the automatic conservation
of baryon and lepton number. In this sense therefore the MSSM takes us one step backward from the
standard model. The presence of R-parity breaking terms[15] also makes it impossible to use the
LSP as the Cold Dark Matter of the universe since it is not stable and will therefore decay away
in the very early moments of the universe. In various grand unified theories, keeping the R-parity
violating terms under control provides a major constraint on model building.

(b) The different mixing matrices in the quark and squark sectors lead to an arbitrary amount of
flavor violation, manifesting itself in such phenomena as the $K_L - K_S$ mass difference etc. Present
experimental information, together with the fact that the standard model more or less accounts for the
observed magnitude of these processes, implies that there must be strong constraints on the mass
splittings among squarks. Detailed calculations indicate that one must have $\Delta m^2_{\tilde{q}}/m^2_{\tilde{q}} < 10^{-3}$ or
so. Again recall that this undoes another nice feature of the standard model where understanding
the suppression of flavor violation was much simpler.

(c) The presence of new couplings involving the superpartners allows for the existence of extra
CP phases. In particular the presence of the phase in the gluino mass leads to a large electric
dipole moment of the neutron unless this phase is assumed to be suppressed by two to three orders
of magnitude. This is generally referred to in the literature as the SUSY CP problem. In addition,
there is of course the famous strong CP problem, to which neither the standard model nor the MSSM
provides a solution.

(d) Finally there is a new naturalness problem that arises in the MSSM having to do with the
$\mu H_u H_d$ term in the superpotential (see Eq. (33)), the so-called $\mu$ term. The point is that adequate
electroweak symmetry breaking requires that $\mu$ be of order of a few hundred GeV at most. But
since this parameter is allowed in the supersymmetry limit, there is no reason for it to be smaller
than, say, the Planck mass. Although there are no divergent corrections to this parameter and in that sense
it is different from the Higgs mass conundrum of the standard model, nonetheless the fact that we
have to choose it to be small when a priori it could have been huge is not very satisfactory. It is
hoped that in future theories that go beyond the MSSM this problem, known as the $\mu$ problem,
will be addressed.

In order to cure these problems, one must seek new physics beyond the MSSM. Below, we 
discuss the example of the supersymmetric left-right model, which leads to automatic B and L 
conservation as well as solves the SUSY CP problem. These models also provide a solution to the 
strong CP problem without the need for an axion under certain circumstances. 

10 Supersymmetric Left-Right model 

The gauge group for this model[16] is $SU(2)_L \times SU(2)_R \times U(1)_{B-L} \times SU(3)_c$. The chiral
superfields for the left-handed and right-handed quarks are denoted by $Q \equiv (u, d)$
and $Q^c \equiv (u^c, d^c)$ respectively, and similarly the lepton superfields are given by $L \equiv (\nu, e)$ and
$L^c \equiv (\nu^c, e^c)$ [17]. The $Q$ and $L$ transform as left-handed doublets with the obvious values for
$B - L$, and the $Q^c$ and $L^c$ transform as the right-handed doublets with opposite $B - L$ values. The
symmetry breaking is achieved by the following set of Higgs superfields: $\phi_a(2, 2, 0, 1)$ $(a = 1, 2)$;
$\Delta(3, 1, +2, 1)$; $\bar{\Delta}(3, 1, -2, 1)$; $\Delta^c(1, 3, -2, 1)$ and $\bar{\Delta}^c(1, 3, +2, 1)$. Unlike in the MSSM, the allowed
terms in the superpotential are very limited in this case:

$$W = h_a Q \phi_a Q^c + h'_a L \phi_a L^c + f (L L \Delta + L^c L^c \Delta^c) + \mu_{ab} \mathrm{Tr}(\phi_a \phi_b) + M (\Delta \bar{\Delta} + \Delta^c \bar{\Delta}^c) \qquad (38)$$




It is clear from the above equation that this theory has no baryon or lepton number violating terms 
and it allows for a dark matter particle. 

The next question is how one breaks the $SU(2)_R$ symmetry so that the successes of the standard
model, including the observed predominant V-A structure of weak interactions at low energies,
are reproduced. Another question of naturalness that arises simultaneously is that since the
charged fermions and the neutrinos are treated completely symmetrically (quark-lepton symmetry)
in this model, how does one understand the smallness of the neutrino masses compared to the other
fermion masses?

It turns out that both the above problems of the LR model have a common solution. The
process of spontaneous breaking of the $SU(2)_R$ symmetry that suppresses the V+A currents at
low energies also solves the problem of ultralight neutrino masses. To see this let us specify the
Higgs sector of the model: we choose SU(2) triplets with $B - L = 2$ in left-right symmetric pairs
($\Delta_L \oplus \Delta_R$) and bidoublets ($\phi$) with $B - L = 0$ which are left-right symmetric generalizations of
the standard model Higgs doublet. The various components of the above Higgs fields are given as:


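$$\Delta = \begin{pmatrix} \Delta^+/\sqrt{2} & \Delta^{++} \\ \Delta^0 & -\Delta^+/\sqrt{2} \end{pmatrix}, \qquad \phi = \begin{pmatrix} \phi_1^0 & \phi_2^+ \\ \phi_1^- & \phi_2^0 \end{pmatrix} \qquad (39)$$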


The $SU(2)_R \times U(1)_{B-L}$ symmetry is broken by giving a large vev to the $\Delta^{c0}$ field (i.e. $\langle \Delta^{c0} \rangle = v_R$).
If $v_R \gg m_{W_L}$, then the V+A current effects at low energies are suppressed compared to the
V-A current effects observed in beta decay. At the same time the $L^c L^c \Delta^c$ coupling in
the superpotential for the left-right model leads to a mass for the right-handed neutrino which is
given by $m_N = f v_R$. The left-handed neutrinos at this stage are massless. In the two component
notation for the neutrino, this leads to the following mass matrix for the $\nu, N$ (where we have
denoted the left-handed neutrino by $\nu$ and the right-handed component by $N$):

$$M = \begin{pmatrix} 0 & h\kappa \\ h\kappa & f v_R \end{pmatrix}$$

By diagonalizing this 2x2 matrix, we get the light neutrino eigenvalue to be $m_\nu \simeq (h\kappa)^2 / (f v_R)$. Note
that typical charged fermion masses are given by $h\kappa$ etc. So since $v_R \gg \kappa$, the light neutrino
mass is automatically suppressed. This way of suppressing the neutrino masses is called the seesaw
mechanism [18]. Thus in one stroke, one explains the smallness of the neutrino mass as well as the
suppression of the V+A currents.
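The seesaw suppression can be made concrete with a small numerical sketch. It treats the 2x2 symmetric mass matrix with a vanishing left-handed entry, an off-diagonal Dirac mass and a heavy right-handed mass on the diagonal; the numerical values (a 1 GeV Dirac mass and a 10^10 GeV right-handed mass) are purely illustrative assumptions, and the helper name is hypothetical.

```python
import math


def seesaw_masses(m_dirac, m_heavy):
    """Absolute eigenvalues of [[0, m_dirac], [m_dirac, m_heavy]].

    The eigenvalues are (M +/- sqrt(M^2 + 4 m^2)) / 2 and their product is -m^2,
    so the light one is computed as m^2 / heavy to avoid numerical cancellation.
    """
    root = math.hypot(m_heavy, 2.0 * m_dirac)  # sqrt(M^2 + 4 m^2)
    heavy = 0.5 * (m_heavy + root)
    light = m_dirac ** 2 / heavy
    return light, heavy


m_dirac = 1.0      # GeV, stands in for a typical charged fermion mass (assumed)
m_heavy = 1.0e10   # GeV, stands in for the right-handed neutrino mass (assumed)

light, heavy = seesaw_masses(m_dirac, m_heavy)
print(f"light eigenvalue  = {light:.3e} GeV")
print(f"seesaw estimate   = {m_dirac ** 2 / m_heavy:.3e} GeV")
print(f"heavy eigenvalue  = {heavy:.3e} GeV")
```

With these inputs the light state sits ten orders of magnitude below the Dirac mass, in agreement with the estimate m^2/M.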

Another attractive feature of the LR models is that constraints of parity symmetry (under
which $Q_L \leftrightarrow Q_R$, $\phi \leftrightarrow \phi^\dagger$ etc.) require the Yukawa couplings $h_a, h'_a$ to be hermitean. As a result,
if the vacuum of the theory is such that the $\langle \phi \rangle$ vevs are real, then the quark mass matrices are
hermitean. This means that the strong CP parameter $\Theta$, which is equal to Arg(Det $M_q$), vanishes
at the tree level. This point was emphasized in the late 70's as a possible way to solve the strong
CP problem without introducing the axion. The only problem that was noted with this suggestion
at that time was that without imposing extra symmetries, it was not possible to have the $\langle \phi \rangle$
be real. This problem gets cured automatically once the left-right model is made supersymmetric.

Furthermore parity symmetry also makes the gluino mass, $\mu$ and $B\mu$ terms all real, unlike the
situation in the MSSM. This solves the so-called supersymmetric CP problem of the MSSM.

This model however does not throw any light on either the $\mu$ problem or the FCNC problem.
They will perhaps have a solution of supersymmetric origin. Thus our point of view at the moment
is that the search for physics beyond the MSSM should start with the SUSYLR
model as the first step.


11 Grand unification and supersymmetry 

Soon after the discovery of the standard model, it became clear that embedding the model into
higher local symmetries may lead to two very distinct conceptual advantages: (i) they may provide
quark-lepton unification [19], giving a unified understanding of the a priori separate interactions
of the two different types of matter, and (ii) they can lead to a description of different forces in
terms of a single gauge coupling constant. How the unification of gauge couplings actually occurs
was discussed in a seminal paper by Georgi, Quinn and Weinberg [20]. They used the already
known fact that the coupling parameters in a theory depend on the mass scale and showed that
the gauge couplings of the standard model can indeed unify at a very high scale of order $10^{15}$
GeV or so. Although this scale might appear too far removed from the energy scales of interest in
particle physics then, it was actually a blessing in disguise: in GUT theories, obliteration of
the quark-lepton distinction manifests itself in the form of baryon instability such as proton decay,
and the rate of proton decay is inversely proportional to the 4th power of the grand unification
scale. Only for scales near $10^{15}$ GeV or so could the already known lower limits on proton lifetimes
be reconciled with theory. This provided a new impetus for experimental searches for
proton decay. The minimal grand unification model based on the SU(5) group suggested by Georgi
and Glashow made a very precise prediction for the proton lifetime $\tau_p$, between $1.6 \times 10^{30}$ yrs. and
$2.5 \times 10^{28}$ yrs. Attempts to observe proton decay at this level failed, ruling out the simple minimal
nonsupersymmetric SU(5) model. The primary decay mode of the proton in the nonsupersymmetric
SU(5) is $p \to e^+ + \pi^0$, which now has a lower limit on its lifetime from the Super-Kamiokande
experiment[21] of $\tau_p > 1.6 \times 10^{33}$ years assuming 100% branching ratio.

In fact the situation for nonsupersymmetric SU(5) was worse since it predicted a value for
$\sin^2\theta_W$ which is much lower than the experimentally observed one. This situation is depicted in
Fig. 1 as a lack of unification of the gauge couplings using the experimental value of $\sin^2\theta_W$ at
the weak scale.

A revival of interest in the idea of grand unification occurred after supersymmetry became part
of the phenomenology of particle physics in the early 80's. Two points were realized that led to this.
The first point, already emphasized, is that a theoretical understanding of the large hierarchy between
the weak scale and the GUT scale was possible only within the framework of supersymmetry as
discussed earlier. Secondly, on a more phenomenological level, measured values of $\sin^2\theta_W$ from
the accelerators coupled with the observed values for $\alpha_{strong}$ and $\alpha_{em}$ could be reconciled with the
unification of gauge couplings only if the superpartners were included in the evolution of the gauge
couplings and the supersymmetry breaking scale was assumed to be near the weak scale, which
was independently motivated anyway [22] (see Fig. 2).
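The contrast between the two spectra can be reproduced roughly with one-loop running. The sketch below uses the standard one-loop coefficients for the standard model and the MSSM (with the GUT-normalised hypercharge coupling) and approximate weak-scale inputs; the input values, the choice of 2 x 10^16 GeV as comparison scale, and the simplification of running the MSSM spectrum all the way down to M_Z are assumptions of this illustration, not numbers taken from the text.

```python
import math

M_Z = 91.19  # GeV, approximate Z mass (assumed input)

# Approximate inverse couplings at M_Z for U(1)_Y (GUT-normalised), SU(2), SU(3).
ALPHA_INV_MZ = (59.0, 29.6, 8.5)

# One-loop beta-function coefficients b_i in d(alpha_i^-1)/d(ln mu) = -b_i / (2 pi).
B_SM = (41.0 / 10.0, -19.0 / 6.0, -7.0)
B_MSSM = (33.0 / 5.0, 1.0, -3.0)


def run_inverse_couplings(scale, b_coefficients):
    """One-loop inverse gauge couplings at `scale` (GeV), starting from M_Z."""
    t = math.log(scale / M_Z) / (2.0 * math.pi)
    return [a_inv - b * t for a_inv, b in zip(ALPHA_INV_MZ, b_coefficients)]


def spread(values):
    """Largest difference among the three inverse couplings."""
    return max(values) - min(values)


scale = 2.0e16  # GeV, comparison scale (assumed)
for label, coefficients in (("SM", B_SM), ("MSSM", B_MSSM)):
    couplings = run_inverse_couplings(scale, coefficients)
    formatted = ", ".join(f"{c:6.2f}" for c in couplings)
    print(f"{label:5s} alpha_i^-1 = [{formatted}]  spread = {spread(couplings):.2f}")
```

In this approximation the three inverse couplings of the supersymmetric spectrum agree to within a fraction of a unit, while those of the standard model spectrum differ by several units.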

This has led to considerations of many grand unified gauge groups, notable among them being
models based on the groups SU(5) and SO(10) [23]. At present, this is an active area of research
in particle theory. We do not enter into the details of this area except to remark that the discovery
of a nonzero neutrino mass has put the SO(10) models at a clear advantage over the SU(5) one.


12 Strongly coupled supersymmetric gauge theories and composite
models of quarks and leptons

As mentioned earlier in this article, supersymmetry has helped improve our understanding of 
the nonperturbative dynamics of nonabelian gauge theories. In particular, there has developed a 
paradigm concerning the nature and interaction of the low energy composite states in such models. 
This has revived the hope that earlier ideas concerning the possible substructure of quarks and 
leptons may perhaps be given a more solid field theoretical foundation and new understanding of
quark lepton physics may be gleaned. It ought to be emphasized that while the above paradigm
cannot be rigorously proved, it passes an impressive number of consistency tests far beyond
't Hooft's anomaly matching condition that was the main dynamical input of the original set of
composite models[24].

To illustrate the general procedure adopted in this approach, consider a model based on the
gauge group $SU(N_c)$ with $N_f$ flavors of preons, $F$ transforming as the $N_c$- and $F^c$ as the $\bar{N}_c$-dimensional
representation of the group. If $N_f > N_c$, then the composite states are given by
$M = F F^c$ and $B = F^{N_c}$ (plus the right chiral product). The former in the familiar QCD
language are meson-like states and the latter are baryon-like states. The effective superpotential
for such a composite theory will be given by allowed $SU(N_c)$ invariant products of the preonic
fields that lead to products of above states. Thus, the dynamics of the effective theory is really
quite constrained.

This paradigm has inspired a number of attempts to build composite models for quarks and 
leptons. However to date no compelling scenario has emerged, although the potential for model 
building remains good. 

13 Supersymmetry and string theory 

While in this review we have stayed within the domain of field theories, supersymmetry also plays
an essential role in the arena of string theories. To briefly describe this role, we note that string
theories posit that at extremely small distances, the description of nature in terms of particles
breaks down and the fundamental variables become strings embedded in spacetime. The strings
were originally thought to have a size of order of the Planck length ($10^{-33}$ cm) although recent

thinking in the subject allows much larger length scales (even up to $10^{-17}$ cm).

Most of the interesting results in string theories derive from the property that these theories 
have conformal invariance which has as its subgroups scale invariance and the Lorentz invariance. 
Since string theories are defined in two dimensions (known as world sheet), the conformal invariance 
is infinite dimensional. This puts an enormous number of constraints on string theories. One of the
first ones is that the theory is defined only in 26 dimensions if it is purely bosonic. Also it has a
tachyonic degree of freedom.

To connect it to particle physics it is postulated that the vibrational modes of the strings are to 
be identified with particles and the resulting Lagrangian should follow as a result of maintaining 
conformal invariance (i.e. setting the beta functions to zero). It is clear that for a bosonic theory the
vibrations can only be bosonic and thus cannot be realistic descriptions of nature. To generate
fermions, one incorporates fermionic string degrees of freedom, in the same number as the bosonic degrees.
(This was where supersymmetry was first discovered in the context of particle physics.) To get
fermionic particles (i.e. string states transforming as spinors under the Lorentz group), one has
to impose the so-called Ramond boundary conditions on the fermionic strings. In any case the
restrictions of conformal symmetry are now altered and one can have a situation with no tachyons 
and d=10. Further excitement in string theories came due to the discovery that the closed strings 
contain massless spin two states that can be identified with the graviton. 

What is important for our discussion here is that string models provide what can be called
a "derivation" of supersymmetry as a part of the field theories that result in the low energy limit of
string theories. Maintaining conformal invariance of string theories at the one loop level automatically
requires that the string theories lead to a particle spectrum which is supersymmetric. What is more:
the string theories indeed lead to local supersymmetry. Also they have all the ingredients to lead 
to a breakdown of supersymmetry although this has not yet been demonstrated and is part of the 
general program of determining the correct vacuum of the string theories. 

14 Conclusion 

The advent of supersymmetry has opened up a new era of understanding and progress in both field 
theories and particle physics. Although so far experimentally no evidence has appeared for new 
particles that would signal the reality of these ideas, its impact in terms of better understanding 
of many aspects of physics has been so overwhelming that it has won many converts and it will 
indeed be a cruel irony if after so much promise, nature reveals its disapproval of this new kind of 
symmetry. 

We have only scratched the surface of the whole mountain that supersymmetry is. Within the arena
of field theories with supersymmetry, we have discussed only N=1 supersymmetry whereas
there are many interesting constraints on field theories that emerge once we enlarge the number
of supersymmetries from 2 up to the maximum number of 8. For instance, the divergence properties
of field theories keep improving as we go from N=1 to N=2 and finally to N=4 where the field
theory is a finite field theory. The improvement in the N=2 case is that there are infinities only
from the gauge sector but not from the Yukawa sector. For the N=2 case, there is of course an
exact solution discovered by Seiberg and Witten that threw new light on many of the issues
in string theory as well as supersymmetric field theories. Another direction in which developments
have taken place is in going to higher dimensions while keeping N=1. It is known in this case
that if we want to keep field theories with spin $\leq 2$, then the maximum allowed dimension is
d=11. This 11-dimensional supergravity, thought for a long time to be of mere academic interest,
has been found to be the low energy limit of certain string theories in the strong coupling limit
and is poised to open up some very new ways to look at the question of unification: for instance
there may be unification of all interactions including gravity at one scale, the canonical scale of
grand unification of $2 \times 10^{16}$ GeV. Another very interesting supersymmetric theory is the N=1
10-dimensional super-Yang-Mills theory coupled to gravity. This theory is known to arise in the
low energy limit of the so-called heterotic string theories and forms the basis of much of what is
known as superstring phenomenology.

This work was supported by the National Science Foundation grant no. PHY-9802551.


References 

[1] J Bagger, J Wess: Supersymmetry and Supergravity, Princeton University Press (1983). 

[2] R N Mohapatra: Unification and Supersymmetry, Springer- Verlag, Second edition (1991). 

[3] H Haber, G Kane: Phys. Rep. 117, 76 (1984); R Arnowitt, A Chamseddine, P Nath:
Applied N=1 Supergravity, World Scientific (1984).

[4] H P Nilles: Phys. Rep. 110, 1 (1984). 

[5] M Grisaru, M Rocek, W Siegel: Nucl. Phys. B159, 429 (1979). 

[6] For a review of the recent developments, see K. Intriligator and N. Seiberg, hep-th/9509066;
Proceedings of TASI95, ed. K. T. Mahanthappa et al. (World Scientific, 1995).

[7] A. Davis, M. Dine and N. Seiberg, Phys. Lett. 125 B, 487 (1983). 

[8] A. Nelson and M. Strassler, hep-ph/9607362; M. Luty and R. N. Mohapatra, Phys. Lett. 
B396, 161 (1997) N. Haba and N. Okamura, DPNU-97-27 (1997). 

[9] For a review and references, see P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981). 

[10] Supersymmetry-96: Theoretical Perspectives and Experimental Outlook, ed. R N Mohapatra 
and A Rasin, North Holland (1997). 

[11] R Arnowitt, A Chamseddine, P Nath, Phys. Rev. Lett. 49, 970 (1982); R Barbieri, S Ferrara,
C Savoy, Phys. Lett. B119, 343 (1982).

[12] M. Dine and A.E. Nelson, Phys. Rev. D48, 1277 (1993); M. Dine, A.E. Nelson, Y. Nir, and
Y. Shirman, hep-ph/9507378; Phys. Rev. D53, 2658 (1996); A.E. Nelson, hep-ph/9511218;
M. Dine, A.E. Nelson, and Y. Shirman, Phys. Rev. D51, 1362 (1995).

[13] P Binetruy, E Dudas, Phys. Lett. B389, 503 (1996); G Dvali, A Pomarol, Phys. Rev. Lett.
77, 3728 (1996); R N Mohapatra, A Riotto, Phys. Rev. D55, 4262 (1997).

[14] M. Dine, N. Seiberg and E. Witten, Nucl. Phys. B289, 585 (1987); J. Atick, L. Dixon and
A. Sen, ibid. B292, 109 (1987).

[15] C. S. Aulakh and R. N. Mohapatra, Phys. Lett. 119B, 36 (1982); L. Hall and M. Suzuki, Nucl.
Phys. B231, 419 (1984); V Barger, G F Giudice, T Han, Phys. Rev. D40, 2987 (1989); For
a recent review, see G Bhattacharyya, Proceedings of SUSY'96, Nucl. Phys. (Proc. Suppl.),
52A, 83 (1996).





[16] J. C. Pati and A. Salam, Phys. Rev. D10, 275 (1974); R. N. Mohapatra and J. C. Pati, Phys.
Rev. D11, 566, 2558 (1975); G. Senjanovic and R. N. Mohapatra, Phys. Rev. D12, 1502
(1975).

[17] R. Kuchimanchi and R. N. Mohapatra, Phys. Rev. D48, 4352 (1993); Phys. Rev. Lett. 75,
3989 (1995); C. Aulakh, A. Melfo and G. Senjanovic, hep-ph/9707258; Z. Chacko and R. N.
Mohapatra, Phys. Rev. D58, 015001 (1998); C. Aulakh, K. Benakli and G. Senjanovic, Phys.
Rev. Lett. 79, 2188 (1997).

[18] M. Gell-Mann, P. Ramond and R. Slansky, in Supergravity, eds. D. Freedman et al. (North-
Holland, Amsterdam, 1980); T. Yanagida, in Proc. KEK workshop, 1979 (unpublished); R.N.
Mohapatra and G. Senjanovic, Phys. Rev. Lett. 44, 912 (1980).

[19] J. C. Pati and A. Salam, Phys. Rev. D10, 275 (1974); H. Georgi and S. L. Glashow, Phys.
Rev. Lett. 32, 438 (1974).

[20] H. Georgi, H. Quinn and S. Weinberg, Phys. Rev. Lett. 33, 451 (1974). 

[21] M. Shiozawa et al. (Super-Kamiokande collaboration), ICRR-Report-419-98-15.

[22] W. Marciano and G. Senjanovic, Phys. Rev. D25, 3092 (1982); U. Amaldi, W. de Boer and 
H. Furstenau, Phys. Lett. B260,447 (1991); P. Langacker and M. Luo, Phys. Rev. D44, 817 
(1991); J. Ellis, S. Kelly and D. Nanopoulos, Phys. Lett. B260, 131 (1991). 

[23] For a recent review, see R. N. Mohapatra, TASI97-Lectures on Supersymmetry, edited by J.
Bagger (World Scientific, 1998).

[24] J. C. Pati and A. Salam, Phys. Rev. D10, 275 (1974); Y. Chikashige, H. Akama and H.
Terazawa, Phys. Rev. D15, 480 (1977); O. W. Greenberg and J. Sucher, Phys. Lett. B99,
339 (1981); W. Buchmuller, R. D. Peccei and T. Yanagida, Phys. Lett. B124, 67 (1983); O.
W. Greenberg, R. N. Mohapatra and M. Yasue, Phys. Rev. Lett. 51, 1737 (1983); R. Barbieri,
A. Masiero and G. Veneziano, Phys. Lett. B128, 179 (1983).






Figure 1: This figure shows the lack of unification of gauge couplings with the standard model spectrum.
The couplings are plotted against the mass scale, and the values at the weak scale are the measured values from
LEP and SLC as well as other experiments. This is another way of stating the wrong prediction
of $\sin^2\theta_W$.



Figure 2: This figure shows the unification of gauge couplings with the supersymmetric model spectrum.
The couplings are plotted against the mass scale, and the values at the weak scale are the measured
values from LEP and SLC as well as other experiments.





21. Supersymmetry in Field Theory 


Norisuke Sakai * 

Department of Physics, Tokyo Institute of Technology 

Oh-okayama, Meguro, Tokyo 152-8551, Japan 


Abstract 

Supersymmetric theories are reviewed in the context of field theories. The gauge hierarchy
problem in attempting the unification of all fundamental interactions is the strongest
motivation for the modern development of supersymmetry. Starting from the general notion of
supersymmetry as a symmetry between bosons and fermions, we explain how supersymmetry
becomes a part of the space-time symmetry if we wish to maintain relativistic invariance.
The precise idea of supersymmetry is then introduced and the supersymmetric field theories
are formulated. There has been a significant breakthrough in the study of nonperturbative
effects in supersymmetric field theories using the holomorphy and symmetry arguments. Some
of these ideas and results are briefly reviewed.


1 Motivations for Supersymmetry 

1.1 Gauge Hierarchy 

1. Standard model

Many efforts have been devoted to the study of the fundamental constituents of matter and the
fundamental interactions between them. At present, the experimental efforts have reached
energy scales of several 1000 GeV in collisions between protons and/or antiprotons, and
of a few 100 GeV in collisions between electrons and positrons.

It has been found that all the available experimental data up to these energies can be more
or less adequately described by the so-called standard model. In the standard model, the
fundamental constituents of matter are quarks and leptons, and the three known fundamental
forces in nature, the strong, weak, and electromagnetic interactions, are described by a gauge
field theory with the $SU(3) \times SU(2) \times U(1)$ gauge group. The standard model succeeded
in describing the three fundamental interactions by a common unifying idea called the gauge
principle and gave many successful predictions. The most striking confirmation of the standard
model is the discovery of the weak bosons, W and Z, with a mass of the order of
$M_W \approx 100$ GeV. However, there are three different gauge coupling constants, one for each of the
gauge groups SU(3), SU(2), and U(1). In that sense, the three different strengths of the
three fundamental interactions are parametrized nicely, but are not quite unified. Moreover,
the standard model has many input parameters that can only be determined from
experimental measurements. There are also other conceptually unsatisfactory points.
For instance, the electric charge is found to be quantized in nature, but this phenomenon is
just an accident in the standard model.

2. Grand Unified Theories 

Because of quantum effects, the effective gauge coupling constants change logarithmically
as a function of energy scale. Then there is a possibility that the different gauge couplings


*E-mail: nsakai@th.phys.titech.ac.jp





for the three fundamental interactions can become the same strength at very high energies
$M_G$. This means that the three gauge interactions can be truly unified into a single gauge
group if we choose an appropriate simple gauge group. This idea was proposed by Georgi
and Glashow [1], and these models are called the grand unified theories. The grand unified
theories achieved at least two good points:

• Because the gauge group is simple, the electromagnetic charge is now quantized.

• Two couplings can meet at some point provided they run in the right direction. Since the
grand unified theory unifies all three couplings at high energies, it gives one constraint
for three couplings. Taking the two measurements of coupling constants at low energies
as inputs, one can then predict the third coupling. With the simplest possibility for the
unifying gauge group, this prediction was found to be not very far from the experimental
data. On the other hand, the unification energy $M_G$ is now very large compared to the
mass scale $M_W$ of the weak boson in the standard model [2]


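$$\frac{M_W^2}{M_G^2} \approx \left(\frac{10^2\ \mathrm{GeV}}{10^{16}\ \mathrm{GeV}}\right)^2 \approx 10^{-28} \qquad (1)$$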


3. Gravity 

Even if one does not accept the grand unified theories, one is sure to accept the existence of the
fourth fundamental force, the gravitational interaction. The mass scale of the gravitational
interaction is given by the Planck mass $M_{Pl}$

$$\frac{M_W^2}{M_{Pl}^2} \approx \left(\frac{10^2\ \mathrm{GeV}}{10^{19}\ \mathrm{GeV}}\right)^2 \approx 10^{-34} \qquad (2)$$

Now we have a problem of how to explain these extremely small ratios of the mass
squared $M_W^2$ to the fundamental mass squared $M_G^2$ or $M_{Pl}^2$ in eq.(1) or eq.(2). This problem
is called the gauge hierarchy problem.
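To get a feeling for the size of these ratios, the following sketch simply evaluates them. The weak scale of 100 GeV is the value quoted in the text; the Planck mass of about 1.22 x 10^19 GeV is the usual value, and the unification scale of 10^16 GeV is an assumed, typical choice that is not fixed by the discussion above.

```python
M_W = 1.0e2          # GeV, weak boson mass scale as quoted in the text
M_PLANCK = 1.22e19   # GeV, usual value of the Planck mass
M_GUT = 1.0e16       # GeV, assumed typical unification scale


def mass_squared_ratio(light, heavy):
    """Ratio of the squared light mass scale to the squared heavy mass scale."""
    return (light / heavy) ** 2


print(f"M_W^2 / M_G^2  = {mass_squared_ratio(M_W, M_GUT):.1e}")
print(f"M_W^2 / M_Pl^2 = {mass_squared_ratio(M_W, M_PLANCK):.1e}")
```

Both ratios are smaller than 10^-25, which is the fine tuning that the gauge hierarchy problem asks one to explain.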


1.2 Higgs Scalar 

Precisely speaking, when we say that we explain some phenomenon, we mean that it should be given a
symmetry reason. This principle is called the naturalness hypothesis [3], [4]. More precisely, the
system should acquire a higher symmetry as we let the small parameter go to zero. Examples
of the enhanced symmetry corresponding to a small mass parameter are

$m_{J=1/2} \to 0 \Leftrightarrow$ Chiral symmetry

$m_{J=1} \to 0 \Leftrightarrow$ Local gauge symmetry (3)

The mass scale $M_W$ of weak bosons originates from the vacuum expectation value $v$ of the
Higgs scalar field. The scale of $v$ in turn comes from the (negative) mass squared of the Higgs
scalar $\varphi$. Therefore we need to give symmetry reasons for the extremely small Higgs scalar mass
to explain the gauge hierarchy problem.

Classically the vanishing mass for scalar field does lead to an enhanced symmetry called scale 
invariance. However, it is well known that the scale invariance cannot be maintained quantum 
mechanically. 

Up to now three types of possible solutions have been proposed to explain the gauge hierarchy 
problem. 

1. Technicolor model 

We can postulate that there is no elementary Higgs scalar at all. The Higgs scalar in the
standard model has to be provided as a composite field at low energies. This option requires
nonperturbative physics already at energies of the order of TeV = $10^3$ GeV. It has been
rather difficult to construct realistic models that pass all the tests at low energies, especially
the absence of flavor-changing neutral currents. Models with a composite Higgs scalar are called
Technicolor models [5].

2. Supersymmetry 

Another option is to postulate a symmetry between Higgs scalar and a spinor field. Then we 
can postulate chiral symmetry for the spinor field to make it massless. The Higgs scalar also 
becomes massless because of the symmetry between the scalar and the spinor. This symmetry 
between scalar and spinor is called supersymmetry [6]. Supersymmetry as a possible solution 
of gauge hierarchy problem was proposed concretely in the context of supersymmetric grand 
unified theories [7] [8] [9] [10], although the use of supersymmetry has been advocated for 
electroweak interactions earlier [11]. Contrary to the Technicolor models, we can construct 
supersymmetric models that can be treated perturbatively up to extremely high energies 
along the spirit of the grand unified theories [12], [13]. 

Experimental progress in the precise measurement of coupling constants has made it possible to test 
the unification hypothesis precisely. More than 10 years after the initial proposal of supersymmetric 
grand unified theories, the experimental data from LEP confirmed that the nonsupersymmetric model 
does not give unification at a single point, whereas the supersymmetric model gives excellent 
agreement [14]. 

3. Large extra dimensions 

The most recent proposal starts from the observation that gravitational interactions have not been 
tested at distances below about a millimeter. Therefore one can consider the possibility that the 
fundamental scale of gravitational interactions is 1000 GeV. The observed weakness of the 
gravitational interaction in our world is explained by imagining extra dimensions compactified at a 
radius of order a millimeter or less [15]. Supersymmetry is not logically needed in this case, 
although it is often used to construct concrete models. 
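
The size of the extra dimensions quoted above can be checked with a rough estimate. The sketch below assumes the commonly used relation $M_{Pl}^2 \sim M_*^{n+2} R^n$ between the four-dimensional Planck mass, the fundamental scale $M_*$ and the radius $R$ of $n$ extra dimensions; this relation, the dropped factors of order one, and the numerical constants are assumptions of the sketch rather than statements from the text.

```python
# Order-of-magnitude sketch only. The relation M_Pl^2 ~ M_*^(n+2) R^n and the
# constants below are assumptions of this example, not taken from the text.
HBAR_C_GEV_M = 1.973269804e-16  # hbar*c in GeV*metre, converts 1/GeV to metres
M_PLANCK_GEV = 1.22e19          # four-dimensional Planck mass in GeV


def compactification_radius_m(n_extra, m_fundamental_gev=1000.0):
    """Radius R in metres solving M_Pl^2 = M_*^(n+2) * R^n (O(1) factors dropped)."""
    if n_extra < 1:
        raise ValueError("need at least one extra dimension")
    ratio = M_PLANCK_GEV ** 2 / m_fundamental_gev ** (n_extra + 2)
    r_in_inverse_gev = ratio ** (1.0 / n_extra)
    return r_in_inverse_gev * HBAR_C_GEV_M


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, compactification_radius_m(n))
```

Under these assumptions two extra dimensions give a radius in the millimeter range, while a single extra dimension would have to be of astronomical size.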

1.3 Symmetry Relating Different Statistics and Spin 

1.3.1 Symmetry Relating Different Statistics 

Supersymmetry can be defined as a symmetry relating bosons and fermions. Namely particles with 
different statistics are related by the supersymmetry. 

There is no significant constraint in formulating such a supersymmetry in nonrelativistic quan- 
tum theories. In fact the supersymmetry has been useful in several areas of nonrelativistic quantum 
theory such as condensed matter physics and nuclear physics. Let us mention two interesting ap- 
plications: 

1. Solid State Physics 

If one considers a spin system in random magnetic fields, the randomness of the magnetic 
field tends to disorder the spin system. It has been found that the critical behavior of the 
spin system in random magnetic fields in d dimensions is the same as that of the spin system 
without the random magnetic fields in d—2 dimensions. This phenomenon is sometimes called 
dimensional reduction. Parisi and Sourlas gave a beautiful explanation of this phenomenon 
by uncovering the underlying supersymmetry of the spin system in the random magnetic 
fields [16]. 

2. Nuclear Physics 

In certain complex nuclei, it is quite useful to use supersymmetry among quasi particle 
excitations to classify various nuclear energy levels. 





1.3.2 Symmetry Relating Different Spins 

We are mainly interested in supersymmetry as a fundamental symmetry principle. We have two 
other fundamental principles in modern physics: quantum theory and relativity. In nature, all 
bosons have integer spin and all fermions have half-odd integer spin. This fact can be explained 
if we employ relativistic quantum field theory. Therefore supersymmetry inevitably becomes a 
symmetry between particles with different spin if we want to maintain relativistic invari- 
ance. Since the spin is a quantum number associated with the rotation, we need to formulate 
supersymmetry as a symmetry that is nontrivially combined with the space-time symmetry such 
as rotations, translations, and Lorentz transformations. 

It has been a notoriously difficult problem to formulate a nontrivial symmetry that relates 
particles with different spins. This point can be most neatly summarized by the so-called “No-go 
Theorem” by Coleman and Mandula [17]. They assumed Lorentz invariance, analyticity of scattering 
amplitudes (corresponding to causality), a nontrivial S-matrix, and other technical assumptions. 
They found that the Poincaré group can only appear as a direct product with other symmetry groups. 
Namely no nontrivial symmetry is possible between particles of different spins. In this No-go 
theorem, they have actually assumed that all the symmetry relations are expressed in terms of 
commutation relations. 

Much later, it was found that a nontrivial symmetry is possible if one uses anticommutation 
relations among symmetry generators instead of the ordinary commutation relations. With the 
same assumptions as those of Coleman and Mandula, except for the introduction of anticommutation 
relations, Haag, Łopuszański, and Sohnius were able to obtain the most general symmetry 
[18]. They found that supersymmetry as we know it now is the only possible symmetry that 
involves space-time symmetry nontrivially. We will describe this supersymmetry in subsequent 
sections. 


2 Basic Concepts in Supersymmetric Field Theory 

2.1 Superfield and Supertransformation 

To formulate a symmetry such as rotation, it is most convenient to introduce a coordinate system to 
distinguish different directions in space. Similarly, to formulate supersymmetry, it is useful to 
introduce a coordinate $\theta$ to distinguish bosons and fermions. It has to be an anticommuting spinor, 
since it relates bosons and fermions. Our conventions for spinors are summarized in Appendix A. 
An anticommuting number is called a Grassmann number. Combined with the space-time coordinates $x^m$, 
we have $(x^m, \theta)$ as coordinates in superspace. 

A function $\Phi(x^m, \theta)$ of $x^m, \theta$ is called a superfield. Because of the anticommuting property, 
the superfield can be expanded in powers of the Grassmann number to obtain a finite number of ordinary 
fields. In the case of a four-component Majorana spinor $\theta$, the superfield contains 16 components of 
ordinary fields. Half of them are bosons and half of them are fermions. 

$$\Phi(x,\theta) = C(x) + \bar\theta\psi(x) - \cdots - \tfrac{1}{2}\bar\theta\gamma_5\theta\, M(x) + i\,\bar\theta\theta\,\bar\theta\gamma_5\lambda(x) + \tfrac{1}{4}(\bar\theta\theta)^2 D(x) \qquad (4)$$
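
The counting of 16 components, half bosonic and half fermionic, can be illustrated by enumerating the independent monomials in the four anticommuting components of $\theta$. The following is a minimal sketch; the function name is hypothetical, and it assumes a superfield that is itself bosonic, so that coefficients of even powers of $\theta$ are bosons and coefficients of odd powers are fermions.

```python
from itertools import combinations


def superfield_components(n_theta=4):
    """Count component fields of a bosonic superfield with n_theta Grassmann coordinates.

    Each monomial is a product of distinct Grassmann components (their squares vanish),
    so monomials correspond to subsets of the n_theta components.
    """
    counts = {"boson": 0, "fermion": 0}
    for degree in range(n_theta + 1):
        n_monomials = sum(1 for _ in combinations(range(n_theta), degree))
        kind = "boson" if degree % 2 == 0 else "fermion"
        counts[kind] += n_monomials
    return counts


if __name__ == "__main__":
    print(superfield_components(4))
```

For four Grassmann components the monomials of degree 0, 2 and 4 give 1 + 6 + 1 bosonic fields, and those of degree 1 and 3 give 4 + 4 fermionic fields.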

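Let us consider, as the simplest transformation in superspace, an (infinitesimal) translation 
by $\epsilon$ in the Grassmann number $\theta$. To make it a nontrivial space-time symmetry, we also shift the 
space-time coordinate as follows, 

$$\delta\theta = \epsilon, \qquad \delta x^m = -i\bar\epsilon\gamma^m\theta \qquad (5)$$

This form is the simplest possibility that is Lorentz covariant and linear in $\epsilon$. This transformation 
is called the supertransformation. With this transformation, the superfield is transformed as 

$$\delta\Phi(x,\theta) = \left(\bar\epsilon\frac{\partial}{\partial\bar\theta} - i\bar\epsilon\gamma^m\theta\,\partial_m\right)\Phi(x,\theta)$$
$$= [\Phi(x,\theta), \bar\epsilon Q] = [\Phi(x,\theta), \bar Q\epsilon] \qquad (6)$$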

The first line is represented by a differential operator in terms of the Grassmann number acting on the 
superfield, whereas the second line is expressed as a commutator between the quantized superfield 
and the supercharge Q, which is the generator of the supersymmetry transformation. It 
is useful to note that the basic definition of the supertransformation dictates that the Grassmann 
numbers $\theta$, $\bar\theta$ have the dimension of the square root of the coordinate $x^m$. Useful formulas for derivatives 
of Grassmann numbers are summarized in Appendix B. 

To find the algebra satisfied by the supercharges, we make two successive supertransformations 
of eq.(5) and take the difference between the results of the transformations performed in different orders 

$$[\Phi, [\bar\epsilon_1 Q, \bar Q\epsilon_2]] = [\Phi, [\bar\epsilon_1 Q, \bar\epsilon_2 Q]] = [[\Phi, \bar\epsilon_1 Q], \bar\epsilon_2 Q] - [[\Phi, \bar\epsilon_2 Q], \bar\epsilon_1 Q]$$
$$= \delta(\epsilon_2)\delta(\epsilon_1)\Phi - \delta(\epsilon_1)\delta(\epsilon_2)\Phi$$
$$= 2\bar\epsilon_1\gamma^m\epsilon_2\,(-i\partial_m\Phi(x,\theta)) = 2\bar\epsilon_1\gamma^m\epsilon_2\,[\Phi(x,\theta), P_m] \qquad (7)$$

Thus we find that the anticommutator of the supercharges is given by the space-time translation 
represented by the four-momentum operator $P_m$. This property is a direct consequence of the 
space-time coordinate shift bilinear in Grassmann numbers in eq.(5). 

Since the chirality projection is useful in formulating supersymmetry, we shall use the two 
component notation for spinors from now on. The two component notation is summarized in 
Appendix.A. Then the anticommutators between supercharges are given by 

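$$\{Q_\alpha, Q_\beta\} = \{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0, \qquad \{Q_\alpha, \bar Q_{\dot\beta}\} = 2\sigma^m_{\alpha\dot\beta}P_m \qquad (8)$$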

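The translation operator $P_m$ together with the Lorentz generators $J^{mn}$ forms the group of 
space-time transformations, the Poincaré group. The other commutation relations have an 
intuitive physical meaning. First, the supercharges are translation invariant and transform as 
spinors under Lorentz transformations. 

$$[Q_\alpha, P_m] = 0, \qquad [Q_\alpha, J^{mn}] \propto (\sigma^{mn})_\alpha{}^\beta Q_\beta, \qquad [\bar Q_{\dot\alpha}, J^{mn}] \propto \bar Q_{\dot\beta}(\bar\sigma^{mn})^{\dot\beta}{}_{\dot\alpha} \qquad (9)$$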

The rest of the algebra forms the ordinary algebra for the Poincare group. 

$$[P_m, P_n] = 0, \qquad [P^m, J^{nl}] = -i(\eta^{mn}P^l - \eta^{ml}P^n) \qquad (10)$$
$$[J^{mn}, J^{kl}] = i(\eta^{km}J^{nl} - \eta^{kn}J^{ml} + \eta^{lm}J^{kn} - \eta^{ln}J^{km}) \qquad (11)$$

Thus we find, as promised, that the supersymmetry has two characteristic features: 

1. It involves the anticommutators, and 

2. It is a part of spacetime symmetry. 


2.2 Unitary Representation 

Supersymmetry requires bosons and fermions to form a multiplet. To find the particle content 
dictated by the supersymmetry explicitly, we need to study the unitary representation of the 
supersymmetry algebra. 


2.2.1 N = 1 Massive case 

Since the supersymmetry is a part of the space-time symmetry, we should combine unitary 
representations of the Poincaré group to form the unitary representation of the supersymmetry. To obtain 
the unitary representation of the Poincaré group, we first diagonalize the four-momentum $P^m$. 
For the massive case, we can choose the rest frame as the standard frame $P^m = (M, 0, 0, 0)$. The 
stability group that leaves the standard frame unchanged is the SO(3) subgroup. 
The unitary representation of the SO(3) subgroup is labeled by the angular momentum j and its z 
component m. Now we should combine these representations $(P^m, j, m)$ of the Poincaré group to 
obtain the unitary representation of the supercharge Q, since Q commutes with the four-momentum, 
$[Q, P_m] = 0$. Since the supercharge has spin 1/2 as shown in eq.(9), Q changes j and m by $\pm\tfrac12$. 
The anticommutators (8) between supercharges Q are precisely the same algebra as that of fermion 
creation and annihilation operators, if we rescale by $\sqrt{2M}$, 

$$\{Q_\alpha, Q_\beta\} = \{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0, \qquad \{Q_\alpha, \bar Q_{\dot\beta}\} = 2M\delta_{\alpha\dot\beta} \qquad (12)$$

Since there are 2 components of spinor indices, there are 2 kinds of “fermions”. We can regard 
$Q_\alpha$, $\alpha = 1, 2$ as “annihilation operators”, and $\bar Q_{\dot\alpha}$, $\dot\alpha = 1, 2$ as “creation operators”. The unitary 
representations of these operators can be obtained by assuming a ground state that is 
annihilated by the “annihilation operators”, $Q_\alpha|j\rangle = 0$, $\alpha = 1, 2$. Here the ground state $|j\rangle$ 
is assumed to be an eigenstate of angular momentum j. Since the product of two supercharges of the 
same type vanishes, we obtain only four possible states by applying the “creation operators” 

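$$\begin{array}{ccc} & \bar Q_{\dot 1}|j\rangle & \\ |j\rangle & & \bar Q_{\dot 1}\bar Q_{\dot 2}|j\rangle \\ & \bar Q_{\dot 2}|j\rangle & \end{array} \qquad \begin{array}{ccc} & j - \tfrac12 & \\ j & & j \\ & j + \tfrac12 & \end{array} \qquad (13)$$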
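
The statement that two anticommuting “creation operators” generate exactly four states can be verified with explicit matrices. The sketch below uses a Jordan-Wigner-type construction on a four-dimensional space; this particular matrix representation and all names in the code are assumptions made for illustration. The operators correspond to the supercharges of eq.(12) divided by $\sqrt{2M}$.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def apply(op, vec):
    return [sum(op[i][k] * vec[k] for k in range(len(vec))) for i in range(len(op))]


IDENTITY2 = [[1, 0], [0, 1]]
SIGMA_Z = [[1, 0], [0, -1]]
LOWER = [[0, 1], [0, 0]]  # maps the occupied state (index 1) to the empty one (index 0)

# Rescaled supercharges Q_alpha / sqrt(2M); the sigma_z string enforces anticommutation.
ANNIHILATE = [kron(LOWER, IDENTITY2), kron(SIGMA_Z, LOWER)]
CREATE = [transpose(op) for op in ANNIHILATE]  # real matrices, so dagger = transpose


def multiplet_states():
    """Ground state plus all states reached by acting with creation operators."""
    vacuum = [1, 0, 0, 0]
    one = apply(CREATE[0], vacuum)
    two = apply(CREATE[1], vacuum)
    both = apply(CREATE[0], two)
    return [vacuum, one, two, both]
```

Acting with any further creation operator on the last state gives zero, which is the matrix version of the vanishing square of each supercharge.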

The number of states in the multiplet is given by $4(2j + 1)$, $j = 0, \tfrac12, 1, \ldots$. The two lowest 
massive supermultiplets are shown explicitly in the tables. 

1. j = 0 case: Chiral scalar multiplet 

    spin j    field                   degrees of freedom
    0         two real scalars        2
    1/2       a Majorana spinor       2

2. j = 1/2 case: Vector multiplet 

    spin j    field                   degrees of freedom
    0         a real scalar           1
    1/2       two Majorana spinors    4
    1         a real vector           3
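
The spin content of the two tables can be reproduced from the structure of eq.(13). The following minimal sketch (function names are hypothetical) assumes that the four states built on a ground state of spin j carry spins j, j, j + 1/2 and, when j > 0, j - 1/2, and it counts the spin states of each.

```python
from fractions import Fraction


def massive_n1_multiplet(j):
    """Spins appearing in the massive N = 1 multiplet built on a spin-j ground state."""
    j = Fraction(j)
    half = Fraction(1, 2)
    spins = [j, j, j + half]
    if j > 0:
        spins.append(j - half)  # absent for j = 0, where only spin 1/2 is produced
    return sorted(spins)


def count_states(spins):
    """Total number of spin states, sum of (2s + 1) over the listed spins."""
    return sum(int(2 * s + 1) for s in spins)


def boson_fermion_split(spins):
    """Return (bosonic, fermionic) state counts; integer spins are bosons."""
    bosons = sum(int(2 * s + 1) for s in spins if s.denominator == 1)
    fermions = sum(int(2 * s + 1) for s in spins if s.denominator == 2)
    return bosons, fermions


if __name__ == "__main__":
    for j in (0, Fraction(1, 2), 1):
        spins = massive_n1_multiplet(j)
        print(j, [str(s) for s in spins], count_states(spins), boson_fermion_split(spins))
```

In each case the total agrees with 4(2j + 1), and the bosonic and fermionic state counts are equal.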


2.2.2 N = 1 Massless case 

In the case of massless particles, we can choose the standard frame as $P^m = (P, 0, 0, P)$. The 
stability group that leaves the standard frame unchanged is the Euclidean group in two 
dimensions $E_2$, generated by the rotation about the direction of motion and two combinations of 
boosts and rotations. It is well known that the unitary representation 
of massless particles is labeled by the helicity [19]. In the standard frame, the nonvanishing 
anticommutator between supercharges is given by 

$$\{Q_\alpha, \bar Q_{\dot\beta}\} = 2(\sigma^0 + \sigma^3)_{\alpha\dot\beta}P = 4P\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix} \qquad (14)$$

$$\{Q_1, \bar Q_{\dot 1}\} = 4P, \qquad \bar Q_{\dot 1} = (Q_1)^* \qquad (15)$$

Therefore we have only a single pair of fermion “creation and annihilation operators”. If we take the state 
of helicity $\lambda$ as the ground state, $Q_1|\lambda\rangle = 0$, we obtain a multiplet consisting of only 2 states whose 
helicities differ by 1/2. 

$$(|\lambda\rangle, \bar Q_{\dot 1}|\lambda\rangle) \sim (|\lambda\rangle, |\lambda - \tfrac12\rangle) \qquad (16)$$







Although the number of states in a multiplet is two, CPT invariance usually requires us to add 
the states with opposite helicity if they are not already in the same multiplet. Then 
the number of states becomes four. Frequently used multiplets are shown in the table. 

$$(\lambda, \; \lambda - \tfrac12, \; -\lambda + \tfrac12, \; -\lambda) \qquad (17)$$

    highest helicity    helicities of fields     name of multiplet
    λ = 1/2             (1/2, 0, 0, -1/2)        chiral scalar multiplet
    λ = 1               (1, 1/2, -1/2, -1)       vector multiplet
    λ = 2               (2, 3/2, -3/2, -2)       graviton multiplet


2.2.3 Extended Supersymmetry 

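The most general supersymmetry algebra found by Haag et al. contains N species of supercharges 
[18]. It is called the N-extended supersymmetry. In two component notation, it reads 

$$\{Q^L_\alpha, \bar Q_{\dot\beta M}\} = 2\sigma^m_{\alpha\dot\beta}P_m\,\delta^L_M, \qquad \{Q^L_\alpha, Q^M_\beta\} = \epsilon_{\alpha\beta}X^{LM}, \qquad \{\bar Q_{\dot\alpha L}, \bar Q_{\dot\beta M}\} = \epsilon_{\dot\alpha\dot\beta}X^*_{LM} \qquad (18)$$

$$[X^{LM}, Q^K_\alpha] = [X^{LM}, \bar Q_{\dot\alpha K}] = [X^{LM}, X^{KN}] = 0 \qquad (19)$$

where X are called central charges. 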

1. Let us first consider the massless case without central charges X. Similarly to eq.(14), the 
N-extended supersymmetry gives $\bar Q_{\dot 1 L}$, $L = 1, \ldots, N$ as fermion “creation operators”, if 
there are no central charges. Starting from the ground state with helicity $\lambda$, we descend 
in helicity by half a unit in each step by operating with $\bar Q_{\dot 1 L}$, 

$$|\lambda\rangle \to |\lambda - \tfrac12\rangle \to |\lambda - 1\rangle \to \cdots \to |\lambda - \tfrac{N}{2}\rangle$$
$$1 \qquad\quad N \qquad\quad \tfrac{N(N-1)}{2} \qquad\quad \cdots \qquad\quad 1 \qquad (20)$$

The number of states is denoted below each helicity state and sums to $2^N$. If the multiplet 
is not CPT self-conjugate, CPT conjugate states should be added. Two points are worth 
mentioning: 

• There are a number of arguments suggesting that a consistent formulation of interacting 
massless fields is limited to spin up to two in four dimensions. This limits the highest 
helicity to be less than or equal to 2. 

$$|\lambda| \le 2, \qquad |\lambda - \tfrac{N}{2}| \le 2 \qquad (21)$$

Therefore the highest possible supersymmetry is N = 8, which gives 4 x 8 = 32 supercharges. 
The N = 8 supersymmetry in four dimensions is maximal, and it automatically 
contains the graviton ($\lambda = \pm 2$). Therefore the interacting N = 8 supersymmetric theory is 
nothing but the N = 8 extended supergravity. 

• If one wants a renormalizable theory, the highest helicity should be one or less. This limits 
N to be less than or equal to 4: $|\lambda| \le 1 \Rightarrow N \le 4$. The maximal case gives the 
maximally supersymmetric gauge theory: N = 4 supersymmetric Yang-Mills theory. 






2. The massive N-extended supersymmetry case without central charge X allows 2N supercharges 
$\bar Q_{\dot 1 L}$, $\bar Q_{\dot 2 L}$, $L = 1, \ldots, N$ as fermion “creation operators”. We thus obtain the number of 
states in a multiplet to be $2^{2N}(2j + 1)$, $j = 0, \tfrac12, 1, \ldots$ 

3. BPS states: 

If we have the massive N-extended supersymmetry case with central charge X, we can have an 
interesting situation called the BPS states, where only a part of the supersymmetry is maintained, 
giving a smaller number of states in a single multiplet. Such a multiplet is sometimes called 
a short representation. 

Let us illustrate this by an example in the N = 2 case, which has SU(2) as an internal symmetry. 
Since the central charge has to be proportional to the invariant tensor of the internal 
symmetry SU(2), we parametrize 

$$X^{LM} = 2Z\epsilon^{LM} \qquad (22)$$

Let us take the rest frame $P^m = (M, 0, 0, 0)$. The N = 2 supersymmetry algebra becomes 

$$\{Q^L_\alpha, Q^M_\beta\} = 2Z\epsilon_{\alpha\beta}\epsilon^{LM}, \qquad \{\bar Q_{\dot\alpha L}, \bar Q_{\dot\beta M}\} = 2Z^*\epsilon_{\dot\alpha\dot\beta}\epsilon_{LM}, \qquad \{Q^L_\alpha, \bar Q_{\dot\beta M}\} = 2M\delta_{\alpha\dot\beta}\delta^L_M \qquad (23)$$

Then we have to consider both chiralities of supercharges together. Since the anticommutator 
matrix must be a positive semidefinite matrix, we obtain an inequality 

$$M \ge |Z| \qquad (24)$$

This bound is called the BPS (Bogomolnyi-Prasad-Sommerfield) bound [20]. 

If $M = |Z|$, there are zero eigenvalues for the matrix. This implies that a linear combination 
of Q's annihilates all the states, and cannot be used to create physical states. Therefore 
we obtain a smaller number of particle states to represent the supersymmetry algebra. For 
example, if we have Z = M, we find convenient linear combinations of supercharges as 

Q(1) = Qi=i + q(2) = qL=i _ (25) 

These satisfy 

= (26) 

and all other anticommutators vanish. 

This is algebraically the same as the case of the massive N = 1 supersymmetry. Therefore 
the number of states is reduced to 1/4: $2^4(2j + 1) \to 2^2(2j + 1)$. 

This phenomenon occurs when the determinant of the anticommutators of supercharges vanishes. 
The resulting multiplet contains a smaller number of physical states and is called the 
BPS saturated states [21]. 

The physical origin of the central charge is often given by various nonperturbative objects 
such as monopoles, dyons, domain walls, in general some kind of solitons. 
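
The state counting of the three cases above can be summarized in a short sketch. The function names are hypothetical. The massless count uses the binomial multiplicities of eq.(20); the massive count uses $2^{2N}(2j+1)$ and, as an assumption made explicit in the code, halves the number of active creation operators when the bound M ≥ |Z| is saturated, which is the N = 2 situation described in the text.

```python
from fractions import Fraction
from math import comb


def massless_multiplet(n_susy, top_helicity):
    """Map helicity -> number of states, acting with k of the N lowering supercharges."""
    top = Fraction(top_helicity)
    return {top - Fraction(k, 2): comb(n_susy, k) for k in range(n_susy + 1)}


def massive_state_count(n_susy, j, mass, central_charge=0.0, tol=1e-12):
    """Number of states in a massive multiplet built on a spin-j ground state.

    Assumption of this sketch: when M = |Z| exactly half of the 2N creation
    operators decouple. The text establishes this only for N = 2.
    """
    if mass <= 0:
        raise ValueError("mass must be positive for a massive multiplet")
    gap = mass - abs(central_charge)
    if gap < -tol:
        raise ValueError("BPS bound M >= |Z| violated")
    n_creation = 2 * n_susy
    if abs(central_charge) > 0 and abs(gap) <= tol:
        n_creation //= 2  # short (BPS saturated) representation
    return 2 ** n_creation * int(2 * Fraction(j) + 1)


if __name__ == "__main__":
    print(massless_multiplet(8, 2))          # N = 8 supergravity multiplet
    print(massive_state_count(2, 0, 1.0, 0.5))  # long N = 2 multiplet
    print(massive_state_count(2, 0, 1.0, 1.0))  # short N = 2 multiplet
```

For N = 8 starting from helicity 2 the helicities run down to -2 and the multiplicities add up to $2^8$, so the multiplet is CPT self-conjugate, in line with the bound of eq.(21).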


2.3 Field Theory Realization 

2.3.1 Irreducible Representation 

The smallest unitary representation of the N = 1 supersymmetry in four space-time dimensions 
requires two real spin 0 particles and two spin 1/2 particles. On the other hand, the general 
superfield $\Phi(x,\theta,\bar\theta)$ has 8 boson fields and 8 fermion fields, as we have seen in eq.(4). 

To obtain a smaller number of components than the general superfield, we should find a constraint 
that is compatible with the supersymmetry transformation, so as to realize the supersymmetry in a 
smaller space. This is a key ingredient in constructing supersymmetric field theories. 

We note that a general spinor $\theta_a$ in four space-time dimensions has four components, whereas 
the chirally projected spinors $\theta_\alpha$, $\bar\theta_{\dot\alpha}$ have only two components. Therefore if we can construct a 
superfield that depends only on the chirally projected spinors, we should be able to reduce the 
number of component fields to half of those of the general superfield. Therefore we are tempted to 
use the constraint that the superfield be independent of the Grassmann number with one of the 
chiralities 

$$\frac{\partial}{\partial\bar\theta^{\dot\alpha}}\Phi(x,\theta,\bar\theta) = 0 \qquad (27)$$

Unfortunately, even if this constraint is imposed, it is not satisfied after the supersymmetry 
transformation. 

$$\frac{\partial}{\partial\bar\theta^{\dot\alpha}}\,\delta_\xi\Phi(x,\theta,\bar\theta) \neq 0 \qquad (28)$$

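Therefore this constraint is not consistent with supersymmetry. We can modify the derivative 
with respect to the Grassmann number by an additional term. We define the following covariant 
derivatives 

$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i\sigma^m_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_m \qquad (29)$$

$$\bar D_{\dot\alpha} = -\frac{\partial}{\partial\bar\theta^{\dot\alpha}} - i\theta^\alpha\sigma^m_{\alpha\dot\alpha}\partial_m \qquad (30)$$

These covariant derivatives anticommute with the supersymmetry transformation 

$$\{D_\alpha, Q_\beta\} = \{\bar D_{\dot\alpha}, Q_\beta\} = \{D_\alpha, \bar Q_{\dot\beta}\} = \{\bar D_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0 \qquad (31)$$

Therefore they can be used to constrain the superfield to reduce the number of component fields 
by half. 

$D_\alpha$, $\bar D_{\dot\alpha}$ satisfy the same algebra as $Q_\alpha$, $\bar Q_{\dot\alpha}$: 

$$\{D_\alpha, \bar D_{\dot\alpha}\} = -2i\sigma^m_{\alpha\dot\alpha}\partial_m, \qquad \{D_\alpha, D_\beta\} = \{\bar D_{\dot\alpha}, \bar D_{\dot\beta}\} = 0 \qquad (32)$$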


2.3.2 Chiral Scalar Field 

By using the covariant derivatives, we can now define a superfield which has half as many 
components as the general superfield in eq.(4). Since the supercharges anticommute with the 
covariant derivatives as shown in eq.(31), these chiral scalar fields can be used as a representation 
space of supersymmetry. 

The (negative) chiral scalar superfield is defined by 

$$\bar D_{\dot\alpha}\Phi(x,\theta,\bar\theta) = 0 \qquad (33)$$

We can easily see that the following combination of variables satisfies this constraint 

$$y^m = x^m + i\theta\sigma^m\bar\theta, \qquad \bar D_{\dot\alpha}y^m = 0 \qquad (34)$$

Therefore the general solution of the constraint is simply that the superfield depends on $\bar\theta$ only 
through the combination $y^m = x^m + i\theta\sigma^m\bar\theta$, 

$$\Phi(y, \theta) = A(y) + \sqrt{2}\,\theta\psi(y) + \theta\theta F(y) \qquad (35)$$


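The supertransformation of the chiral scalar superfield is given by means of the derivative 
operator defined in eq.(6). In the two component notation, we obtain 

$$\delta_\xi\Phi(x,\theta,\bar\theta) = \left[\xi^\alpha\left(\frac{\partial}{\partial\theta^\alpha} - i\sigma^m_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\frac{\partial}{\partial x^m}\right) + \bar\xi_{\dot\alpha}\left(\frac{\partial}{\partial\bar\theta_{\dot\alpha}} - i\bar\sigma^{m\dot\alpha\alpha}\theta_\alpha\frac{\partial}{\partial x^m}\right)\right]\Phi(x,\theta,\bar\theta) \qquad (36)$$

In terms of the component fields, we find 

$$\delta_\xi A = \sqrt{2}\,\xi\psi \qquad (37)$$

$$\delta_\xi\psi = i\sqrt{2}\,\sigma^m\bar\xi\,\partial_m A + \sqrt{2}\,\xi F \qquad (38)$$

$$\delta_\xi F = i\sqrt{2}\,\bar\xi\bar\sigma^m\partial_m\psi \qquad (39)$$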

It is important to note that the last component F is transformed into a derivative of the lower 
component. The supertransformation should increase the mass dimension by $M^{1/2}$. However, the 
last component has the highest mass dimension and there are no component fields available except 
the derivatives of the lower component fields. This is always true for the last 
component of a superfield. Hence the last component of the general superfield also transforms 
into a total derivative of lower component fields. 

It is important to realize that the chiral scalar field is complex. Therefore the scalar component 
A is a complex scalar field, and the fermionic component $\psi$ is a complex Weyl spinor. Let us count 
the number of degrees of freedom of the component fields. If we do not use the equations of motion, 
there are two real scalar components from A, two real scalar components from F, and four 
real fermionic components from $\psi$. We call this situation off-shell. Later we will construct the 
Lagrangian for this chiral scalar field. There we will find that $\psi$ obeys the Dirac equation, which 
reduces the on-shell degrees of freedom to half. Namely we have only a left-handed fermion and 
its anti-particle. As we noted previously, the mass dimension of the Grassmann numbers $\theta$, $\bar\theta$ is 
$M^{-1/2}$. Therefore the mass dimension of the field F is $M^2$ if we take the mass dimension of the 
scalar component A to be $M^1$, as ordinarily required for a scalar field. As we will find when 
constructing the Lagrangian, this implies that F cannot have an ordinary kinetic term with two 
derivatives and is an auxiliary field that can be expressed in terms of other fields. We summarize 
the counting of the number of degrees of freedom in the table. 


    fields    real or complex, spin       off-shell real d.o.f.    on-shell real d.o.f.
    A         complex scalar              2                        2
    ψ         complex 2-comp. spinor      4                        2
    F         complex aux. scalar         2                        0


Similarly, we can define the positive chiral scalar superfield by 

$$D_\alpha\bar\Phi(x,\theta,\bar\theta) = 0 \qquad (40)$$

The general solution of the constraint is given by 

$$\bar\Phi(y^*, \bar\theta) = \bar A(y^*) + \sqrt{2}\,\bar\theta\bar\psi(y^*) + \bar\theta\bar\theta\,\bar F(y^*) \qquad (41)$$

Clearly the product of chiral scalar superfields is still a chiral scalar superfield as long as the 
chirality is the same. On the other hand, the product of positive chiral and negative chiral scalar 
fields is a general superfield (without a definite chirality). 

The complex conjugation changes the chirality, since the complex variable $y^m$ is changed into 
$(y^m)^*$ and the chirality of the spinor is also changed by the complex conjugation, $(\theta)^* = \bar\theta$ 

$$(\Phi(y, \theta))^* = A^*(y^*) + \sqrt{2}\,\bar\theta\bar\psi(y^*) + \bar\theta\bar\theta F^*(y^*) \qquad (42)$$

2.3.3 Lagrangian Field Theory with Chiral Scalar Fields 

As we noted in sect.2.3.2, the last components of superfields always transform into a total derivative. 
There are two possibilities for the superfields: chiral scalar superfield and general superfield. 
Therefore we have two candidates for a Lagrangian invariant under the supersymmetry transformation 
up to a total divergence: 

1. D-term of the general superfield $\Phi$ in eq.(4) 

$$[\Phi(\theta,\bar\theta)]_D \propto D^2\bar D^2\,\Phi(\theta,\bar\theta) \qquad (43)$$

Since the product of chiral scalar superfields with opposite chirality is a general superfield, we 
can take the D term of the product. 

2. $F_\pm$-term of the chiral scalar superfields $\Phi(\theta)$, $\bar\Phi(\bar\theta)$ 

$$[\Phi]_F \propto D^2\Phi(\theta), \qquad [\bar\Phi]_{\bar F} \propto \bar D^2\bar\Phi(\bar\theta) \qquad (44)$$

Let us consider a Lagrangian field theory consisting of chiral scalar fields. Since the 
supertransformation does not leave any product of chiral scalar fields invariant, we have to be satisfied with 
the invariance up to a total divergence. 

It is quite useful to examine the dimensions of the various fields. To give the canonical dimension 
to the scalar component, $[A] = M^1$, we usually assume the dimension of the chiral scalar fields to 
be $M^1$ 

$$[\Phi(\theta)] = [\bar\Phi(\bar\theta)] = M^1 \qquad (45)$$

Since the dimension of the Grassmann number is half of that of the coordinates, 

$$[\theta] = [\bar\theta] = L^{1/2} \qquad (46)$$

we obtain that the covariant derivative has the mass dimension $M^{1/2}$ 

$$[D] = [\bar D] = M^{1/2} \qquad (47)$$

A renormalizable Lagrangian in four space-time dimensions requires that the Lagrangian should 
consist of operators with dimension $\le 4$. We can list the possible terms as follows. 

1. D-type: 

$$D^2\bar D^2(\bar\Phi\Phi) \qquad (48)$$

Since the mass dimension of the product of covariant derivatives is $[D^2\bar D^2] = M^2$, we see 
that there are no other terms of this class. 

2. F-type: 

$$D^2(a\Phi_1 + b\Phi_1\Phi_2 + c\Phi_1\Phi_2\Phi_3) = D^2P(\Phi) \qquad (49)$$

Since $D^2$ has dimension $M^1$, polynomials of up to third order in chiral scalar superfields of 
one chirality are renormalizable. To maintain the hermiticity of the Lagrangian, we need to 
add hermitian conjugate terms, which consist of the chiral scalar fields of opposite chirality 
with conjugate coefficients. The polynomial of chiral scalar superfields of the same chirality 
is called the superpotential P. 
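
The power counting behind this list can be written out explicitly. The following lines are a sketch based only on the dimensions quoted in eqs.(45) and (47); the requirement that a coupling constant have non-negative mass dimension is the usual renormalizability criterion and is stated here as an assumption.

```latex
% F-type terms: a monomial of n chiral superfields under D^2
[D^2] = M^{1}, \qquad [\Phi^n] = M^{n}
\;\Longrightarrow\; [D^2\,\Phi^n] = M^{n+1}
% the coupling multiplying it has dimension M^{4-(n+1)} = M^{3-n}
3 - n \ge 0 \;\Longleftrightarrow\; n \le 3

% D-type terms: a product of k superfields under D^2 \bar D^2
[D^2\bar D^2] = M^{2}
\;\Longrightarrow\; [D^2\bar D^2\,(\bar\Phi\Phi)] = M^{4}
% a product of three or more superfields would exceed dimension 4
```

The bilinear $\bar\Phi\Phi$ therefore exhausts the renormalizable D-type terms, while the F-type terms stop at cubic order.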

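Now let us illustrate the above consideration with a simple example: the general Lagrangian with 
a single chiral scalar field 

$$L = L_{kin} + L_{int} \qquad (50)$$

$$L_{kin} = [\bar\Phi\Phi]_D$$
$$= \tfrac14\partial^2A^*A - \tfrac12\partial_mA^*\partial^mA + \tfrac14A^*\partial^2A + F^*F + \tfrac{i}{2}\partial_m\bar\psi\bar\sigma^m\psi - \tfrac{i}{2}\bar\psi\bar\sigma^m\partial_m\psi$$
$$= A^*\partial^2A - i\bar\psi\bar\sigma^m\partial_m\psi + F^*F + \text{total derivatives} \qquad (51)$$

$$L_{int} = \left[\tfrac{f}{3}\Phi^3 + \tfrac{m}{2}\Phi^2 + s\Phi\right]_F + \text{h.c.}$$
$$= f(FA^2 - \psi\psi A) + m(FA - \tfrac12\psi\psi) + sF + \text{h.c.} \qquad (52)$$

The Euler-Lagrange equation for F is given by 

$$F^* + fA^2 + mA + s = 0 \qquad (53)$$

By solving this equation, we can eliminate the auxiliary field F from the Lagrangian L 

$$L = A^*\partial^2A - i\bar\psi\bar\sigma^m\partial_m\psi - \tfrac{m}{2}(\psi\psi + \bar\psi\bar\psi) - (f\psi\psi A + \text{h.c.}) - |fA^2 + mA + s|^2 \qquad (54)$$

Let us suppose temporarily that the vacuum expectation value of the scalar field A vanishes. Then 
the parameter m gives the mass of a Majorana spinor $\psi$ and a complex scalar A. The parameter 
f gives the Yukawa coupling and the scalar four-point coupling in the potential. 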
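
The elimination of the auxiliary field can be checked numerically. The sketch below (function names are hypothetical) evaluates the on-shell value of F from eq.(53) and the resulting scalar potential $V = |fA^2 + mA + s|^2$, which is minus the last term of eq.(54). It also finds the field values where F vanishes; treating those as the supersymmetric vacua is a standard interpretation added here, not a statement from the text.

```python
import cmath


def auxiliary_f(a, f, m, s):
    """On-shell F from eq.(53): F* = -(f A^2 + m A + s)."""
    w_prime = f * a ** 2 + m * a + s
    return -complex(w_prime).conjugate()


def scalar_potential(a, f, m, s):
    """Scalar potential V = |F|^2 = |f A^2 + m A + s|^2 after eliminating F."""
    return abs(auxiliary_f(a, f, m, s)) ** 2


def zero_f_points(f, m, s):
    """Field values A where F = 0, i.e. roots of f A^2 + m A + s (requires f != 0)."""
    if f == 0:
        raise ValueError("f must be nonzero for a quadratic condition")
    root = cmath.sqrt(m * m - 4 * f * s)
    return ((-m + root) / (2 * f), (-m - root) / (2 * f))


if __name__ == "__main__":
    for a in zero_f_points(1.0, 2.0, 5.0):
        print(a, scalar_potential(a, 1.0, 2.0, 5.0))
```

The potential is non-negative everywhere and vanishes exactly at the points returned by `zero_f_points`.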


2.3.4 Supersymmetric gauge theory 

The ordinary local gauge transformation for a matter field $\phi$ in the representation corresponding 
to a matrix $T^a$ is given by 

$$\phi(x) \to \exp(-i\Lambda^a(x)T^a)\,\phi(x) \qquad (55)$$

The matter field should be extended to a chiral scalar superfield $\Phi(x,\theta)$ in the supersymmetric 
theory. In order to maintain the chirality of the superfield, we need to extend the gauge parameter 
function $\Lambda(x)$ to be a chiral scalar superfield $\Lambda(x,\theta)$. 

$$\Phi(x,\theta) \to \exp(-i\Lambda^a(x,\theta)T^a)\,\Phi(x,\theta) \qquad (56)$$

Since the chiral scalar superfield contains a complex scalar field, the supersymmetrized local gauge 
transformation actually contains scale transformations. 

The kinetic term of the matter fields should be made gauge invariant by introducing the gauge 
field. In supersymmetric field theory, the kinetic term of the chiral scalar fields consists of a product 
of chiral scalar fields with opposite chirality, as in eq.(51). Therefore we need to introduce 
a general superfield as in eq.(4) instead of a chiral scalar superfield. We see immediately that the 
general superfield contains a vector field as a component. For this reason, the general superfield is 
sometimes called the vector superfield. The vector superfield V can be expanded in terms of $\theta$, $\bar\theta$ 
to obtain component fields 

$$V(x,\theta,\bar\theta) = C(x) + i\theta\chi(x) - i\bar\theta\bar\chi(x)$$
$$+ \tfrac{i}{2}\theta\theta(M + iN) - \tfrac{i}{2}\bar\theta\bar\theta(M - iN) - \theta\sigma^m\bar\theta\,v_m(x) + i\theta\theta\bar\theta\left(\bar\lambda(x) + \tfrac{i}{2}\bar\sigma^m\partial_m\chi(x)\right)$$
$$- i\bar\theta\bar\theta\theta\left(\lambda(x) + \tfrac{i}{2}\sigma^m\partial_m\bar\chi(x)\right) + \tfrac12\theta\theta\bar\theta\bar\theta\left(D(x) + \tfrac12\partial^2C(x)\right) \qquad (57)$$

With this vector superfield, the supersymmetric version of the gauge transformation is given by 

$$e^{2gV} \to e^{-i\Lambda^\dagger}e^{2gV}e^{i\Lambda} \qquad (58)$$

Here the general superfield $V = V^aT^a$ belongs to the adjoint representation of the gauge group 
and g is the gauge coupling constant. V is dimensionless and real. 

$$V^{a*} = V^a \qquad (59)$$

With this gauge transformation, the kinetic term for the chiral scalar superfield becomes gauge 
invariant. 

$$\bar\Phi e^{2gV}\Phi \to \bar\Phi e^{2gV}\Phi \qquad (60)$$

In order to examine the gauge transformation of the vector supermultiplet, it is simplest to 
consider the U(1) case 

$$V \to V + \frac{i}{2g}(\Lambda - \Lambda^*) \qquad (61)$$

$$\Lambda = A + \sqrt{2}\,\theta\psi + \theta\theta F \qquad (62)$$

$v_m$ is an ordinary gauge field with Re A as the ordinary (real) gauge transformation parameter 

$$v_m \to v_m + \frac{1}{2g}\partial_m(A + A^*) \qquad (63)$$

$\lambda$, D are gauge invariant. 

$$\lambda \to \lambda, \qquad D \to D \qquad (64)$$

C, $\chi$, M, N can be gauged away by Im A, $\psi$, F in the supersymmetric gauge parameter superfield $\Lambda$ 

$$C \to C + \frac{i}{2g}(A - A^*), \qquad \chi \to \chi + \frac{\sqrt{2}}{2g}\psi, \qquad M + iN \to M + iN + \frac{1}{g}F \qquad (65)$$

By exploiting the supersymmetric version of the gauge transformation, we can go to the Wess-Zumino 
gauge, which is the most popular choice for unraveling the physical particle content of the model. 

$$V_{WZ} = -\theta\sigma^m\bar\theta\,v_m(x) + i\theta\theta\bar\theta\bar\lambda(x) - i\bar\theta\bar\theta\theta\lambda(x) + \tfrac12\theta\theta\bar\theta\bar\theta D(x) \qquad (66)$$

Since we have used the supersymmetric gauge transformation to reach it, the Wess-Zumino 
gauge is not manifestly supersymmetric, but 
the invariance under the ordinary gauge transformation remains. The particle content can be most 
easily seen in the Wess-Zumino gauge. 

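To form a Lagrangian, we need to build the gauge field strength as a gauge covariant building 
block. Among the component fields of the vector superfield V, the gaugino field $\lambda^a(x)$ is the gauge 
covariant field with the lowest dimension. We can obtain this component by applying the covariant 
derivative D once and $\bar D$ twice. 

$$W_\alpha = -\frac{1}{8g}(\bar D\bar D)\left(e^{-2gV}D_\alpha e^{2gV}\right) = -i\lambda_\alpha + \cdots \qquad (67)$$

Since we have differentiated twice in $\bar\theta$, $W_\alpha$ is a negative chiral superfield and gauge covariant 

$$\bar D_{\dot\beta}W_\alpha = 0 \qquad (68)$$

$$W_\alpha \to e^{-i\Lambda}W_\alpha e^{i\Lambda} \qquad (69)$$

Similarly a positive chiral field strength is given by 

$$\bar W_{\dot\alpha} = \frac{1}{8g}(DD)\left(e^{2gV}\bar D_{\dot\alpha}e^{-2gV}\right) \qquad (70)$$

Since the supersymmetric gauge field strength is a chiral superfield, the kinetic term for the vector 
superfield is given by the F term of the square of the supersymmetric field strength 

$$L_{gauge} = \tfrac18 D^2(W^{a\alpha}W^a_\alpha) + \text{h.c.} \qquad (71)$$

In the Wess-Zumino gauge, the Lagrangian is given in terms of the component fields as 

$$L_{gauge} = -\tfrac14 v^a_{\mu\nu}v^{a\mu\nu} - i\lambda^a\sigma^\mu\nabla_\mu\bar\lambda^a + \tfrac12D^aD^a \qquad (72)$$

$$v_{\mu\nu} = \partial_\mu v_\nu - \partial_\nu v_\mu + ig[v_\mu, v_\nu] \qquad (73)$$

$$\nabla_\mu\lambda = \partial_\mu\lambda + ig[v_\mu, \lambda] \qquad (74)$$

Similarly to the F fields, the last component $D^a$ is an auxiliary field. 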



2.4 The General N = 1 Supersymmetry Lagrangian up to two Derivatives 


Since we are interested in the effective action, we should not require the action to be renormalizable. 
Here we will write down the most general N = 1 supersymmetric Lagrangian in flat space (without 
gravity) which has up to two derivatives of fields. We have the following building blocks 

1. Field content 
Chiral superfield $\Phi$ 
Vector superfield $V$ 

2. Superpotential $P(\Phi)$ 

Interactions between chiral scalar fields are given by a function called the superpotential, which 
depends only on the chiral scalar fields of the same chirality. 

3. Kähler potential $K(\bar\Phi, \Phi)$ 

The kinetic term of the chiral scalar superfields is given by the D term of a general superfield 
that is given by a function of chiral scalar superfields of both chiralities. Since the D term is 
taken, the kinetic term of the action is unchanged by a transformation with a function $f(\Phi)$ 
and $\bar f(\bar\Phi)$ 

$$K(\bar\Phi, \Phi) \to K(\bar\Phi, \Phi) + f(\Phi) + \bar f(\bar\Phi) \qquad (75)$$

This invariance is called Kähler invariance. This function can be regarded as giving a 
geometry of the field space of the chiral scalar superfields. The metric of this geometry is called the 
Kähler metric and the function is called the Kähler potential. The additional term due to the gauge 
interaction is denoted as $\Gamma$. 

4. Gauge kinetic function $H_{ab}(\Phi)$ 

Since the gauge kinetic term is given by the F term of the supersymmetric gauge field strength, 
it can be multiplied by a function of chiral scalar fields, which is called the gauge kinetic 
function. 

5. Fayet-Iliopoulos D-term for U(1) 

Since the U(1) vector superfield is neutral, the D term of the vector superfield is neutral and 
transforms into a total derivative under the supertransformation. Therefore one can add a D 
term of the U(1) vector superfield itself to the Lagrangian. 

We shall denote the F-type term as $|_{\theta\theta}$ or $|_{\bar\theta\bar\theta}$ and the D-type term as $|_{\theta\theta\bar\theta\bar\theta}$ 



$+ \left(P(\Phi)\big|_{\theta\theta} + \text{h.c.}\right)$ 



The minimal forms of the Kähler potential and gauge kinetic function are given by 


(76) 


K + r = 


Hab ~ tr ^ah 


(77) 

(78) 


On the other hand, an interesting example of the nonminimal gauge kinetic function is given by 

HabiS) = ^Sab + SSab 


( 79 ) 



558 


N Sakai 


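where S is a chiral scalar superfield which is a singlet of the gauge group. The mass dimension of 
the chiral scalar superfield and the superpotential P is 

$$[\Phi] = M^1, \qquad [P(\Phi)] = M^3 \qquad (80)$$

If renormalizability is required, the superpotential $P(\Phi)$ should be cubic or less in $\Phi$. 
The equation of motion for the auxiliary field F is given by 

$$g_{ij^*}F^{*j} + \cdots + \frac{\partial P}{\partial A^i} = 0 \qquad (81)$$

$$g_{ij^*} = \frac{\partial^2 K}{\partial A^i\,\partial A^{*j}} \qquad (82)$$

$$\Gamma^k_{lm} = g^{kn^*}\frac{\partial^3 K}{\partial A^l\,\partial A^m\,\partial A^{*n}} \qquad (83)$$

The equation of motion for the auxiliary field D for the minimal kinetic term is given by 

$$\frac{1}{g}D^a + \sum_k A^{*k}T^aA^k = 0 \qquad (84)$$

$$\frac{1}{e}D + \cdots = 0 \qquad (85)$$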

2.5 Perturbative Nonrenormalization Theorem 

It has been very useful to use superfield perturbation theory to organize the perturbative correc- 
tions. The most interesting prediction of the superfield perturbation was the nonrenormalization 
theorems [22] [23]. Since the interaction among chiral scalar superfield consists of superfields with 
the same chirality, there is a selection rule based on purely algebraic identities on the chirality 
structure of possible loop corrections. By performing the algebra of Grassmann numbers, it has 
been shown that the loop corrections to all orders of perturbation do not give any P-type terms. 
This implies that not only the divergent terms but also finite terms do not appear in the P-type 
terms. The loop corrections in quantum effects appear only in the P-type terms. Therefore the 
following local terms can be generated in quantum effects. 

1. Kähler potential $K(\Phi^\dagger, \Phi)$

This gives the kinetic term of the chiral scalar multiplet.

2. Gauge kinetic function $H_{ab}(\Phi)$

This can give the nonminimal kinetic term for the vector multiplet. Although the gauge kinetic
term is written as an F-type term, the gauge field strength actually involves the covariant
derivative of opposite chirality. Therefore it can be generated in loop corrections.

3. Fayet-Iliopoulos D-term for $U(1)$

As a consequence, we obtain the following 

1. No quadratic divergences 

Typically a mass parameter can get quadratic divergences, but there are no loop corrections
at all for parameters appearing in the superpotential, such as the mass parameters.

2. No quantum corrections to masses and Yukawa couplings 

For such parameters in the superpotential, even a finite correction is absent. 

3. Only wave function renormalization and gauge coupling renormalization are needed. 

They are typically logarithmically divergent. 





Let us emphasize that the necessity of the wave function renormalization means that
parameters such as masses and Yukawa coupling constants still run as one changes the scale. Therefore it
is still meaningful to consider these parameters as effective coupling constants that depend on the
energy scale. It should also be stressed that the above nonrenormalization theorem is obtained in
perturbation theory and is valid to all orders of perturbation theory. Nonperturbative
effects can therefore violate the nonrenormalization theorem.

Another interesting perturbative result is that the beta function is exactly given by 1-loop in
the N = 2 supersymmetric gauge theories [24].

2.6 R-symmetry 

In supersymmetric theories, one can define a new type of symmetry called the R-symmetry. This 
is a continuous global symmetry that rotates phases of all the fermions relative to all the bosons. 
This is most easily achieved by a rotation of Grassmann numbers. 


$$\theta \to e^{i\alpha}\theta \qquad (86)$$


At the same time, one can assign an R-charge $R_\Phi$ to a chiral scalar superfield $\Phi$:


$$\Phi(\theta) \to e^{iR_\Phi\alpha}\,\Phi(e^{-i\alpha}\theta) \qquad (87)$$

$$\psi: \quad R(\psi) = R(\Phi) - 1 \qquad (88)$$

On the other hand, there is no room to rotate the vector superfield, since a nontrivial charge
assignment for the vector superfield contradicts the nonlinear coupling of the vector multiplet in gauge
interactions as given in eq.(60). The vector multiplet gives a relative phase rotation between boson
and fermion as

$$V(\theta) \to V(e^{-i\alpha}\theta) \qquad (89)$$

$$\lambda_\alpha: \quad R(\lambda) = +1 \qquad (90)$$

We observe the following characteristic features in the R-symmetry.

1. R-symmetry is chiral. Therefore R-symmetry is generally anomalous.

If there is another anomalous chiral symmetry, usually a linear combination is anomaly free.

2. The mass term for the gaugino $\lambda$ breaks the R-symmetry

$$\mathcal{L} = m_\lambda\,\lambda\lambda + \text{h.c.} \qquad (91)$$

3. The superpotential $P$ must have the R-charge $R(P) = 2$

$$\mathcal{L} = \int d^2\theta\, P \qquad (92)$$

Therefore possible terms in the superpotential are restricted if one wishes
the supersymmetric theory to be invariant under the R-symmetry transformation.

4. Phenomenologically it is desirable to break the R-symmetry explicitly. Since a massless
gaugino is not observed in nature, R-symmetry should be broken, as is seen from eq.(91). The
explicit breaking of the R-symmetry will allow massive gauginos without encountering the (light)
R-axion resulting from the spontaneous breaking of the R-symmetry. To avoid rapid proton
decay, it is desirable to conserve the R-parity $(-1)^R$ in place of the continuous R-symmetry.
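
The selection rule of item 3 can be illustrated with a minimal sketch: since R-charges add under multiplication of superfields, a monomial may appear in the superpotential only if its total R-charge equals 2. The field name `Phi`, the assignment R(Phi) = 2/3 and the helper names below are hypothetical choices made for this illustration only; they are not taken from the text.

```python
from fractions import Fraction

def r_charge(monomial, charges):
    """Total R-charge of a product of chiral superfields (charges add)."""
    return sum(charges[name] for name in monomial)

def allowed_terms(monomials, charges):
    """Keep only the monomials whose total R-charge equals 2."""
    return [m for m in monomials if r_charge(m, charges) == 2]

# Hypothetical single-field example: assume R(Phi) = 2/3.
charges = {"Phi": Fraction(2, 3)}
candidates = [("Phi",), ("Phi", "Phi"), ("Phi", "Phi", "Phi")]

allowed = allowed_terms(candidates, charges)
print(allowed)
```

With this assumed charge only the cubic term survives, so a mass term for Phi would be forbidden by the R-symmetry.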





3 Supersymmetric SU(3) × SU(2) × U(1) Model

3.1 Yukawa Coupling 

3.1.1 Nonsupersymmetric Standard Model

Let us summarize the nonsupersymmetric SU(2) × U(1) model, emphasizing the structure of the
Yukawa couplings.

We have (three) generations of the left-handed quark doublets $q_{Lj}$, the right-handed u-type
quark singlets $u_{Ri}$, and the right-handed d-type quark singlets $d_{Ri}$. We also have (three)
generations of the left-handed lepton doublets $l_{Lj}$ and the right-handed electrons $e_{Ri}$. Here $i, j, \ldots$
denote the generation indices.

We have complex Higgs doublets. Let us denote


$\varphi_u$ ($\varphi_d$): Higgs doublet that gives masses to the $u$ ($d$) type quarks


In terms of these fields, the Yukawa couplings $f$ can be given by


Lvukawa = + fd^Rifd^Qj + 


where 



(94) 

(95) 

(96) 


In the nonsupersymmetric model, nothing prevents choosing the Higgs doublets $\varphi_u$ and $\varphi_d$ to be
the complex conjugate of each other

( 97 ) 


This is the choice in the nonsupersymmetric minimal standard model. 


3.1.2 Supersymmetric Standard Model 

It is important to note that the supersymmetric model requires the Yukawa interaction to be a
term in the superpotential. This is an F-type term. The superfields in the Yukawa interaction
should all have the same chirality.

Therefore we need two Higgs doublet superfields Hu and Hd as separate negative chiral scalar 
superfields. 

Hui^e-H2 (98) 

The supersymmetric Yukawa interaction is given by

$$\mathcal{L}_{\rm Yukawa} = \int d^2\theta\, P + \text{h.c.} \qquad (99)$$


$$P = f_u^{ij}\, U^c_i H_u^T \epsilon\, Q_j + f_d^{ij}\, D^c_i H_d^T \epsilon\, Q_j + f_e^{ij}\, E^c_i H_d^T \epsilon\, L_j + \mu\, H_u^T \epsilon\, H_d \qquad (100)$$

where we denoted the negative chiral scalar superfields by capital letters and the charge conjugate
of the positive chiral scalar superfield by the upper suffix c.

The Higgsino (the chiral fermion associated with the Higgs scalar) introduces an anomaly in gauge currents.
This anomaly has to be cancelled. Introducing $H_u$ and $H_d$ as separate negative chiral scalar
superfields serves to achieve the anomaly cancellation at the same time.





3.2 Particle Content 

Now that we have found that at least a pair of Higgs doublet superfields is needed, we list the minimal
particle content of the supersymmetric standard model. Our convention for the usual standard
model U(1) charge $Y$ is

$$Q = I_3 + Y \qquad (101)$$

The mixing occurs among the following fields

1. Chargino: $\psi_{u+}$ and $\tilde W^{+}$

2. Neutralino: $\psi_{u0}$, $\psi_{d0}$, $\tilde W^0$, $\tilde B$

3. Scalar left-right mixing: $\tilde q_L$ and $\tilde q_R$, etc.

The R-parity $(-1)^R$ is conserved, and there is no continuous R-symmetry.

• Ordinary particles have $(-1)^R = +1$

• Supersymmetric particles, which are denoted with a tilde, have $(-1)^R = -1$



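Superfield    J = 1    J = 1/2            J = 0                  I      Y       SU(3)

Gauge fields
$G$           $g_m$    $\tilde g$
$W$           $W_m$    $\tilde W$
$B$           $B_m$    $\tilde B$

Higgs fields
$H_u$                  $\psi_u$           $\varphi_u$            1/2    1/2     1
$H_d$                  $\psi_d$           $\varphi_d$            1/2    -1/2    1

Quark fields
$Q_i$                  $q_i$              $\tilde q_i$           1/2    1/6     3
$U^c_i$                $u^c_i$            $\tilde u^c_i$         0      -2/3    3*
$D^c_i$                $d^c_i$            $\tilde d^c_i$         0      1/3     3*

Lepton fields
$L_i$                  $l_i$              $\tilde l_i$           1/2    -1/2    1
$E^c_i$                $e^c_i$            $\tilde e^c_i$         0      1       1
($N^c_i$               $\nu^c_i$          $\tilde\nu^c_i$        0      0       1)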


We have denoted the possible right-handed neutrino superfield $N^c_i$ in parentheses.
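
As a cross-check of the charge convention $Q = I_3 + Y$ and of the statement that a pair of Higgs doublets is needed for anomaly cancellation, the following sketch computes the electric charges of each multiplet and the sum of $Y^3$ over the fermion components. The hypercharge signs are the standard-model values and are an assumption here, since the signs are not legible in the table; the function and variable names are illustrative only.

```python
from fractions import Fraction as F

# (name, isospin I, hypercharge Y, colour multiplicity) of left-handed chiral
# superfields; the signs of Y are the usual standard-model assignments (assumed).
MATTER = [
    ("Q",  F(1, 2), F(1, 6),  3),
    ("Uc", F(0),    F(-2, 3), 3),
    ("Dc", F(0),    F(1, 3),  3),
    ("L",  F(1, 2), F(-1, 2), 1),
    ("Ec", F(0),    F(1),     1),
]
HIGGS = [
    ("Hu", F(1, 2), F(1, 2),  1),
    ("Hd", F(1, 2), F(-1, 2), 1),
]

def electric_charges(isospin, hypercharge):
    """Charges Q = I3 + Y for I3 = I, I-1, ..., -I."""
    n_components = int(2 * isospin) + 1
    return [isospin - k + hypercharge for k in range(n_components)]

def u1_cubed(fields):
    """Sum of Y^3 over all fermion components (U(1)^3 anomaly coefficient)."""
    return sum(c * (int(2 * i) + 1) * y**3 for _, i, y, c in fields)

for name, i, y, _ in MATTER + HIGGS:
    print(name, [str(q) for q in electric_charges(i, y)])

print("one generation:", u1_cubed(MATTER))
print("Hu alone:      ", u1_cubed(HIGGS[:1]))
print("Hu and Hd:     ", u1_cubed(HIGGS))
```

In this sketch one generation of quarks and leptons is anomaly free by itself, a single Higgsino doublet is not, and the second doublet with opposite hypercharge restores the cancellation.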


4 N = 1 Supersymmetry Nonperturbative Dynamics 

4.1 Holomorphy 

4.1.1 N = 1 Supersymmetry

The chiral scalar superfield contains the complex scalar field $A$ as the first component, as shown in
eq.(35).

$$\Phi = A + \sqrt{2}\,\theta\psi + \theta^2 F \qquad (102)$$

The distinction between negative chiral and positive chiral scalar superfields can be formulated as
a distinction between holomorphic and anti-holomorphic fields. The former is associated with the
complex variable $z$, whereas the latter is associated with the complex conjugate variable $\bar z$.





Since there are terms restricted to the function of the chiral scalar superfield with only one of the 
chiralities, we obtain a restriction related to the distinction of holomorphic and anti-holomorphic 
quantities. The principle to distinguish the chirality is called the holomorphy and gives the follow- 
ing restrictions 

1. The superpotential is restricted to be a holomorphic function. 

2. The Kahler potential and the Fayet-Iliopoulos D-term are not controlled by holomorphy. 

4.1.2 Complexified Symmetry Group 

The principle of holomorphy gives the following consequences. 

1. If a Lagrangian is invariant under a symmetry group $G$, it is automatically invariant under
the complexification $G^c$ of the symmetry group in the case of supersymmetric gauge theories,
because of the holomorphy principle.

2. To maintain the supersymmetry, the auxiliary fields have to vanish.

$$F = 0 \qquad (103)$$

This is a supersymmetric vacuum condition. One often finds parameters to specify the
supersymmetric vacua. These parameters are called moduli.

It has been shown that the moduli in supersymmetric gauge theories are given by gauge
invariant holomorphic functions constrained by $F = 0$ [25].

Because of holomorphy the manifold of vacuum states (moduli space) is invariant under the
complexified symmetry group $G^c$.

3. It is usually most convenient to use the Wess-Zumino gauge to make the physical particle
content manifest. The supersymmetric vacuum configuration in the Wess-Zumino gauge is
given by the condition that both auxiliary fields should vanish: $F = 0$ and $D = 0$. Since
the superpotential is invariant under the complexified symmetry group $G^c$, the $F = 0$ condition
is invariant under $G^c$. On the other hand, the kinetic term in the Wess-Zumino gauge
is invariant under $G$, but not invariant under $G^c$. Therefore the condition $D = 0$ is not
invariant under $G^c$.

4. For a non-Abelian gauge group, or an Abelian gauge group without the Fayet-Iliopoulos D-term,
it is sufficient to impose the condition $F = 0$. Even if the condition $D = 0$ is not met by
the field configuration, one can make a complexified gauge transformation in $G^c$ to deform
$D$ to the vanishing value $D = 0$. In this process, the condition $F = 0$ is unchanged because of the
invariance of the superpotential under the complexified gauge transformations.

4.1.3 Wilsonian action 

In discussing the effective action for low energy field theories, we run across two different kinds of
effective actions.

1. Wilsonian effective action 



(104) 

$$e^{-S_{\rm eff}} = \int d\phi_>\; e^{-S_{\rm bare}(\phi,\,\Lambda)} \qquad (105)$$


We have denoted the modes with momenta larger than the scale $\mu$ as $\phi_>$, and the modes
with momenta smaller than the scale $\mu$ as $\phi_<$.





In this definition, one integrates over the modes $\phi_>$ with momenta $p$ larger than the scale $\mu$ that one
is interested in: $\mu < p < \Lambda$. One usually supposes that there is a
cut-off in the momentum integration, denoted by $\Lambda$, to make the integral meaningful.
Therefore this can be defined for nonrenormalizable theories as well. This definition has the
advantage of receiving no infrared divergences. This feature avoids anomalies in holomorphy.
Therefore the Wilsonian effective action $S_{\rm eff}$ is a holomorphic function of parameters and
background fields. It is also noted that the beta function in the Wilsonian action is 1-loop
exact in the N = 1 supersymmetric theories [27]. This can most easily be seen from the fact that the
trace anomaly is in the same supermultiplet as the axial anomaly, since the energy-momentum
tensor, the supercurrent, and the axial current are in the same supermultiplet.

On the other hand, the axial anomaly is 1-loop exact according to the Adler-Bardeen theorem
[26], whereas the trace anomaly gives the beta function. Therefore the trace anomaly is also
one-loop exact provided one does not have an anomaly in holomorphy.

2. One-Particle-Irreducible (1PI) effective action.

This is the usual effective action in the sense of the generating function for the one particle 
irreducible amputated amplitudes. 


$$Z[J] = \int d\phi\; e^{-S[\phi] + J\phi} = e^{W[J]} \qquad (107)$$

$$\Phi = \frac{dW}{dJ} \qquad (108)$$

$$\Gamma[\Phi] = W[J] - J\Phi \qquad (109)$$

If there are massless particles, this effective action usually has infrared divergences, which
produce an anomaly in holomorphy. Therefore the beta function in the one particle irreducible
effective action receives contributions from all orders of perturbation theory. More specifically,
it can be computed from the knowledge of the one-loop beta function together with
the anomalous dimension coming from the wave function renormalization.


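$$\beta(\alpha) = -\frac{\alpha^2}{2\pi}\,\frac{3T(G) - \sum_i T(R_i)(1-\gamma_i)}{1 - \frac{\alpha}{2\pi}T(G)} \qquad (110)$$

$$\beta(\alpha) = \frac{d\alpha}{d\log\mu}, \qquad \alpha = \frac{g^2}{4\pi} \qquad (111)$$

$$\sum_a T^a T^a = C_2(R) \qquad (112)$$

$$\mathrm{tr}(T^a T^b) = T(R)\,\delta^{ab} \qquad (113)$$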
4.2 Nonperturbative Superpotential 

The holomorphy and symmetry requirements restrict the superpotential severely in the case of N =
1 supersymmetric field theories. Quite often these requirements are enough to fix the superpotential
$P$ completely.

On the other hand, the Kähler potential is not holomorphic and is not constrained in the
case of N = 1 supersymmetry. Therefore the kinetic term cannot be determined in the N = 1
supersymmetric theories. If we use the N = 2 supersymmetry, however, the kinetic term of the
chiral scalar field associated with the vector multiplet is related to the kinetic term of the vector
multiplet. Therefore there is a possibility to determine the Kähler potential nonperturbatively.

To find out the results on the nonperturbative effects, let us take the $SU(N_c)$ gauge group as
an example. As for the matter multiplets, we take $N_f$ flavors of "quark" and "antiquark" chiral
scalar superfields $Q$ and $\tilde Q$ in the fundamental representation of the $SU(N_c)$ gauge group.


( 114 ) 





4.2.1 Nf < Nc 

Let us consider the massless supersymmetric QCD (SQCD) without superpotential. 


£o = j d^etx{QU^^^Q + Qe-^^^Q^} (115) 

tiW°‘Wc, + 


U‘ 


d^e tiWaW° 


(116) 


The global symmetry in this theory at the classical level is given by

$$G_f = SU(N_f)_Q \times SU(N_f)_{\tilde Q} \times U(1)_B \times U(1)_A \times U(1)_X \qquad (117)$$

Among them there are a number of Abelian global symmetries

$$Q(\theta) \to e^{i\alpha_B + i\alpha_A}\, Q(e^{-i\alpha_X}\theta) \qquad (118)$$

$$\tilde Q(\theta) \to e^{-i\alpha_B + i\alpha_A}\, \tilde Q(e^{-i\alpha_X}\theta) \qquad (119)$$

$$V(\theta) \to V(e^{-i\alpha_X}\theta) \qquad (120)$$

The symmetry $U(1)_X$ is an R-type symmetry which makes a relative rotation between bosons
and fermions.

Let us illustrate how to determine the superpotential.

1. There is an anomaly in $U(1)_A$ and $U(1)_X$.


= 


1 

327r2 






(121)

$$\mathrm{tr}(t^a t^b) = \frac{1}{2}\,T(R)\,\delta^{ab} \qquad (122)$$


We can define an anomaly free R-type symmetry $U(1)_R$ as a linear combination of $U(1)_A$
and $U(1)_X$. Then the anomaly free U(1) quantum numbers are listed in the table.


Chiral Field    $U(1)_B$    $U(1)_R$
$Q$             $1$         $1 - N_c/N_f$
$\tilde Q$      $-1$        $1 - N_c/N_f$


2. Let us next find out the transformation property of the parameter $\Lambda$ which describes the
strength of the gauge interaction.

In order to see this, let us note that there is an instanton solution $A_{\rm inst}$

$$F_{\mu\nu}(A_{\rm inst}) = \tilde F_{\mu\nu}(A_{\rm inst}) \qquad (123)$$

In this background, one finds that there are zero modes $\psi_0$ associated with the fermion field
$\psi(x)$, whose number is determined by the index theorem.


$$\gamma^\mu D_\mu(A_{\rm inst})\,\psi_0 = 0 \qquad (124)$$

The number of zero modes for a chiral scalar multiplet in the representation $R$ is $T(R)$, which is
the second-order (Dynkin) index of the representation defined in eq.(122). Similarly, the gaugino $\lambda$ has $T(\mathrm{adj})$ zero
modes. The effective interaction among fermions can be found by considering the expectation
value of an operator $O$

(O) = J 
^ 0 

p“‘S’[Ainst] / ^fj.i-^inat)'4^)nonzeTO~~{^'y^ 

^ ^ I ^/nonzero'- Nonzero 

^n/ Dij^oD^oO 

(125)

where the value of the action at the instanton configuration is given by

$$S[A_{\rm inst}] = \frac{8\pi^2}{g^2} \qquad (126)$$


Therefore we need to insert appropriate number of fermions in order to have nonvanishing 
contributions. 

/TiR) Tiadj) 


JJ Yl A ) oc exp 


Stt^ 


+ ie] 


(127) 


where the coefficient of the one-loop beta function is given by $b = 3N_c - N_f$.

3. The $U(1)_A$ charges of the fermions are given for quarks and antiquarks by

$$q = 1 \qquad (128)$$

and for gauginos by

$$q_\lambda = 0 \qquad (129)$$

Therefore the theory can be made invariant provided we assign the transformation property
for the parameter $\Lambda$ as


(n ^ n (n ^ n -^) 

The above result shows that the theory itself is not invariant under this $U(1)_A$ transformation;
the symmetry is anomalous. The amount of the anomaly is such that it can be compensated
by assigning the above transformation property to the parameter $\Lambda$ of the
theory. By this transformation, we are relating different theories. This property becomes
useful when we determine the nonperturbative superpotential.

Therefore if we transform the parameter $\Lambda$ of the theory as if it were a background field, we
arrive at another theory related by the symmetry transformation. Hence there is a family
of theories that are related by the transformations, and the predictions of the theories are
related by the transformation.

$$\Lambda^{3N_c-N_f} \to e^{i\alpha\left(2N_f q T(R) + q_\lambda T(\mathrm{adj})\right)}\,\Lambda^{3N_c-N_f} \qquad (131)$$


Namely, $\Lambda^{3N_c-N_f}$ can be regarded as having $U(1)_A$ charge $2N_f q T(R) + q_\lambda T(\mathrm{adj}) = 2N_f$.

One should regard the parameter as a kind of background field when one considers
the transformation of the parameter of the theory. This method has been used extensively
by Seiberg and collaborators [28].

4. Let us constrain the superpotential of the low energy effective action by demanding several
requirements successively. The principle of holomorphy requires that the superpotential has
to be a function of negative chiral scalar superfields only. Gauge invariance requires that





the superpotential should be a function of gauge invariant combinations of superfields. Since
$M_i{}^j = Q^\alpha{}_i\, \tilde Q_\alpha{}^j$ is the only color singlet negative chiral scalar superfield for the case $N_c > N_f$,
we find that the superpotential should be a function of $M_i{}^j$. Let us note that
holomorphy forbids the use of gauge invariant combinations of negative and positive chiral
scalar superfields such as $(Q^\alpha{}_i)^\dagger Q^\alpha{}_j$. The global symmetry $SU(N_f) \times SU(N_f)$ dictates that
the effective superpotential $P$ should be a function of $\det(\tilde Q Q)$ only.

$$P(Q, \tilde Q) = f\left(\det(\tilde Q Q)\right) \qquad (132)$$

Next we can use the transformation property under the (anomalous) global $U(1)_A$. As we
have seen, the effective superpotential should be invariant under the transformation provided
we assign the $U(1)_A$ charge $2N_f q T(R) + q_\lambda T(\mathrm{adj}) = 2N_f$ to the parameter $\Lambda^{3N_c-N_f}$.

Therefore the superpotential should contain the parameter $\Lambda$ as a function of the ratio
$\Lambda^{3N_c-N_f}/\det(\tilde Q Q)$ only. Dimensional analysis shows that the superpotential has to have
the dimension of $M^3$. Thus the superpotential is determined up to overall numerical constants
$C_{N_c N_f}$ that depend on $N_c$ and $N_f$,

$$P = C_{N_c N_f}\left[\frac{\Lambda^{3N_c-N_f}}{\det(\tilde Q Q)}\right]^{\frac{1}{N_c-N_f}} \qquad (133)$$


This set of numerical constants can be determined by two consistency conditions regarding 
the decoupling: 

(a) If we give a large mass to a quark $Q_{N_f}$, it should decouple. This relates the $N_f$ case
with the $N_f - 1$ case with $N_c$ unchanged.

(b) If we give a large vacuum expectation value to a squark $Q_i$, the color gauge symmetry
is partially broken and part of the flavor is decoupled. This relates the $N_c$, $N_f$ case with
the $N_c - 1$, $N_f - 1$ case.

These two consistency conditions reduce the numerical coefficients to a single number $C$.

$$C_{N_c, N_f} = (N_c - N_f)\, C^{\frac{1}{N_c-N_f}} \qquad (134)$$


We can see that the $\Lambda$ dependence of the $N_f = N_c - 1$ case agrees exactly with the one
instanton contribution. Since the gauge symmetry is broken completely in this case, we
can consider large vacuum expectation values, which correspond to the weak coupling
situation. Therefore we can trust the one-instanton calculation in this case and find

C = 1 (135) 


The resulting nonperturbative exact superpotential can be summarized as

$$P_{np} = (N_c - N_f)\left[\frac{\Lambda^{3N_c-N_f}}{\det(\tilde Q Q)}\right]^{\frac{1}{N_c-N_f}} \qquad (136)$$


=1 (137) 
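
The charge counting used in steps 1 to 4 can be checked with a small sketch. It assumes the normalization $T(\text{fundamental}) = 1$ and $T(\mathrm{adj}) = 2N_c$, the scalar R-charge $1 - N_c/N_f$ from the table, mass dimension 1 for $Q$ and $\tilde Q$, and the form $P \propto \left[\Lambda^{3N_c-N_f}/\det(\tilde Q Q)\right]^{1/(N_c-N_f)}$. The function name `sqcd_charge_check` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def sqcd_charge_check(nc, nf):
    """Charges of P ~ (Lambda^(3Nc-Nf) / det(Q~ Q))^(1/(Nc-Nf)) for 0 < Nf < Nc."""
    if not 0 < nf < nc:
        raise ValueError("this sketch assumes 0 < Nf < Nc")
    r_q = 1 - Fraction(nc, nf)           # R-charge of the scalars in Q and Q~
    t_fund, t_adj = 1, 2 * nc            # assumed index normalization
    # Anomaly coefficient of U(1)_R: gaugino (R = +1) plus 2 Nf quark fermions
    # (R = r_q - 1), each weighted by the index of its representation.
    r_anomaly = 1 * t_adj + 2 * nf * (r_q - 1) * t_fund
    b = 3 * nc - nf                      # one-loop beta function coefficient
    power = Fraction(1, nc - nf)
    # Charges of Lambda^b and of det(Q~ Q), a product of 2 Nf superfields.
    lam = {"R": 0, "A": 2 * nf, "dim": b}
    det = {"R": 2 * nf * r_q, "A": 2 * nf, "dim": 2 * nf}
    charges = {key: power * (lam[key] - det[key]) for key in lam}
    charges["R_anomaly"] = r_anomaly
    return charges

result = sqcd_charge_check(3, 2)
print(result)
```

Under these assumptions the superpotential carries R-charge 2, vanishing $U(1)_A$ charge and mass dimension 3 for every $0 < N_f < N_c$, and the $U(1)_R$ anomaly coefficient vanishes.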

If we consider large vacuum expectation values for all the quark flavors, the gauge symmetry
is broken from $SU(N_c)$ to $SU(N_c - N_f)$. The effective couplings of these two gauge theories
should match at the scale of the vacuum expectation values. This matching condition reads

$$\left(\frac{\Lambda_{N_c,N_f}}{(\det \tilde Q Q)^{\frac{1}{2N_f}}}\right)^{3N_c-N_f} = \left(\frac{\Lambda_{N_c-N_f,0}}{(\det \tilde Q Q)^{\frac{1}{2N_f}}}\right)^{3(N_c-N_f)} \qquad (138)$$
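
A short numerical sketch may help to read the matching condition. It assumes that the condition takes the form $\left(\Lambda_{N_c,N_f}/v\right)^{3N_c-N_f} = \left(\Lambda_{N_c-N_f,0}/v\right)^{3(N_c-N_f)}$ with $v = (\det\tilde Q Q)^{1/2N_f}$, which is equivalent to $\Lambda_{N_c,N_f}^{3N_c-N_f}/\det(\tilde Q Q) = \Lambda_{N_c-N_f,0}^{3(N_c-N_f)}$. The numerical values and function names are arbitrary illustrations, not taken from the text.

```python
import math

def low_energy_scale(lam, det_m, nc, nf):
    """Scale of the unbroken SU(Nc-Nf) theory from Lambda^(3Nc-Nf)/det = Lambda_low^(3(Nc-Nf))."""
    return (lam ** (3 * nc - nf) / det_m) ** (1.0 / (3 * (nc - nf)))

def matching_sides(lam, det_m, nc, nf):
    """Both sides of the assumed matching condition, with v = det^(1/(2 Nf))."""
    lam_low = low_energy_scale(lam, det_m, nc, nf)
    v = det_m ** (1.0 / (2 * nf))
    lhs = (lam / v) ** (3 * nc - nf)
    rhs = (lam_low / v) ** (3 * (nc - nf))
    return lhs, rhs

# Arbitrary illustrative numbers (units of mass to the appropriate power).
lhs, rhs = matching_sides(lam=1.3, det_m=7.0, nc=5, nf=2)
print(lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-9))
```

The two sides agree because the power of the determinant on the two sides differs by exactly one, which is how the simpler relation between the two scales is recovered.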





For Nf<Nc- 2, 

c = -^ld‘ew‘w, + -- 

The first component of the superpotential corresponds to the gaugino bilinear. Therefore the
nonperturbative superpotential can be understood as gaugino condensation in the unbroken gauge
group $SU(N_c - N_f)$

22^2 ^ 0|A^Aq|0 >= (140) 

So far we have discussed the nonperturbative effects in the N = 1 supersymmetric gauge
theories. There has been much progress in recent years on the nonperturbative effects not only for
the N = 1 supersymmetric theories but also for higher N supersymmetric theories, which we do
not have enough space to cover. Among them it is worth mentioning that the exact solution for the
low energy effective action of N = 2 supersymmetric gauge theories has been obtained up to two
derivatives, including the full nonperturbative effects [29].

5 Summary 

1. Supersymmetry is the most promising solution to the gauge hierarchy problem.

2. Supersymmetry is the only nontrivial relativistic symmetry that relates particles with
different spin.

3. Good progress has been made in understanding the nonperturbative dynamics of supersymmetric
gauge theories in both N = 1 and N = 2 supersymmetric theories.






In this basis, the chiral $\gamma$ matrix becomes diagonal



Since supersymmetry is conveniently formulated in terms of spinors of definite chirality, it is useful 
to decompose the usual four component spinor into upper and lower two component spinors with 
the definite chirality. 



The negative and positive chirality spinors have undotted and dotted indices which are raised 
and/or lowered by antisymmetric e tensor 


= -ei2 = 1, = SI 

(7) 

The conjugate spinor is given by 


= ( itr (^a)* ) = (’?“ = 4 ) 

(8) 


(9) 

The charge conjugation matrix $C$ is defined by


$$C^{-1}\gamma^\mu C = -\gamma^{\mu T} \qquad (10)$$


One can show that $C$ is antisymmetric and can be chosen to be unitary: $C^T = -C$, $C^\dagger C = 1$. In
the two-component notation using the Weyl basis, we have


^ = -*7270 = ( Q .°2 ) = ( ) ( 11 ) 

The charge conjugate spinor corresponds to antiparticle and is defined by 

= (7^^, W = (12) 

The charge conjugation reverses the chirality



A spinor which is its own charge conjugate is called a Majorana spinor

^ ^ ^ ^ = ( 7 ?“ fja ) ( 14 ) 


Appendix B. Grassmann number and its derivatives 


A Grassmann number is defined as an anticommuting c-number. The derivative with respect to a
Grassmann number is defined by




( 1 ) 


dip, 


-h = ic-%, 


3 


( 2 ) 





Two-component notation 


di>a 9 ^/ 94 “ 




.d 

8 




^09 “ 

89^ 


d 


- 


- 


- 

y 

■ 

8t)^^ 


8 

dVa 

T]^ = 

^0oc 

^ 5 

a 


8 

8ri°‘ 

m = 


^ ffi 

TF'T 

dpa 


8 

d 

Qry 

8 

8 

dpa ~ 



dffr 


8 


8 

8 

ag 9 

dr]^ 

~^Oc0 

drig 

dficc ~ 



dfig 

89^ 

8 

89 

_ 8 
^^8h “ 

89^ 


References 


(3) 

(4) 

(5) 

( 6 ) 

(7) 

( 8 ) 
(9) 

( 10 ) 


[1] H. Georgi and S.L. Glashow, Phys. Rev. Lett. 32 (1974) 438.

[2] H. Georgi, H. Quinn and S. Weinberg, Phys. Rev. Lett. 33 (1974) 451.

[3] M. Veltman, Acta Phys. Pol. B12 (1981) 437.

[4] G. 't Hooft, in Recent Developments in Gauge Theories, Cargèse summer school 1979, p.135.

[5] L. Susskind, Phys. Rev. D20 (1979) 2619; S. Weinberg, Phys. Rev. D19 (1979) 1277; D13
(1976) 974; S. Dimopoulos and L. Susskind, Nucl. Phys. B155 (1979) 237; E. Eichten and
K. Lane, Phys. Lett. B90 (1980) 125.

[6] J. Wess and J. Bagger, Supersymmetry and Supergravity, Princeton University Press, (1992).

[7] N. Sakai, Z. f. Phys. C11 (1981) 153.

[8] S. Dimopoulos and H. Georgi, Nucl. Phys. B193 (1981) 150.

[9] E. Witten, Nucl. Phys. B188 (1981) 513.

[10] R. Kaul, Phys. Lett. B109 (1982) 19.

[11] P. Fayet, Phys. Lett. B69 (1977) 489.

[12] For a review on supersymmetric models, see for instance, H.P. Nilles, Phys. Rep. C110 (1984)
1; P. Nath, R. Arnowitt, and A. Chamseddine, Applied N = 1 Supergravity, the ICTP Series
in Theoretical Physics, Vol. I (World Scientific) 1984; H. Haber and G. Kane, Phys. Rep. C117
(1985) 75.

[13] For a selection of papers on supersymmetric theories including models, see for instance, S.
Ferrara, Supersymmetry, Vol. I and II (World Scientific) 1987.

[14] U. Amaldi, W. de Boer, and H. Fürstenau, Phys. Lett. B260 (1991) 447.

[15] N. Arkani-Hamed, S. Dimopoulos, and G. Dvali, Phys. Lett. B429 (1998) 263.

[16] G. Parisi and N. Sourlas, Phys. Rev. Lett. 43 (1979) 744.

[17] S. Coleman and J. Mandula, Phys. Rev. 159 (1967) 1251.

[18] R. Haag, J. Lopuszanski, and M. Sohnius, Nucl. Phys. B88 (1975) 257.

[19] E. Wigner, Ann. Math. 40 (1939) 149.

[20] M.K. Prasad and C.M. Sommerfield, Phys. Rev. Lett. 35 (1975) 760; E.B. Bogomolny,
Sov. J. Nucl. Phys. 24 (1976) 449.

[21] E. Witten and D. Olive, Phys. Lett. 78B (1978) 97.

[22] M.T. Grisaru, W. Siegel, and M. Rocek, Nucl. Phys. B159 (1979) 420.

[23] K. Fujikawa and W. Lang, Nucl. Phys. B88 (1975) 61.

[24] P.S. Howe, K.S. Stelle, and P.C. West, Phys. Lett. 124B (1983) 55.

[25] M.A. Luty and Washington Taylor IV, Phys. Rev. D53 (1996) 3399.

[26] S.L. Adler and W.A. Bardeen, Phys. Rev. 182 (1969) 1517.

[27] M.A. Shifman and A.I. Vainshtein, Nucl. Phys. B277 (1986) 456; B359 (1991) 571.

[28] A summary of the N = 1 supersymmetric theories has been given in K. Intriligator
and N. Seiberg, Lectures on supersymmetric gauge theories and electric-magnetic duality,
hep-th/9509066.

[29] N. Seiberg and E. Witten, Nucl. Phys. B426 (1994) 19; B431 (1994) 484.



22. Conformal Field Theory: A Bridge Over 

Troubled Waters 


W. Nahm*

Physikalisches Institut, Bonn University


Abstract 

A perspective overview of QFT from its Dirac beginning to modern string theory is 
sketched, with special reference to the role of Conformal Field Theory in bridging the un- 
easy relationship between mathematics and physics through this entire century. 


1 Introduction 

In his 1972 address to the American Mathematical Society, Dyson deplored the ’divorce’ between 
mathematics and physics over the issue of quantum field theory. The present book on the impact 
of field theory on modern physics, timed in accordance with the International Mathematics Year 
of 2000 AD, gives hope that the rift will soon be bridged. In conformal quantum field theory in 
two dimensional spacetime, conditions are particularly favorable for gaining common ground. This 
area can attract both mathematicians and physicists by its beauty, its transparent mathematical 
structure and its many applications. Thus its investigation has moved to a rather central position 
in quantum field theory as a whole. Perhaps one can say that a bridge has existed for many years,
but has been used much below its capacity.

Analogues in more than two dimensions will be mentioned in the present article, but they
have not been developed very far nor found mathematical applications yet. For convenience, the
expression 'conformal field theory' will refer to conformally invariant quantum field theories in two
dimensions, if nothing else is said.

Its applications are surprisingly diverse. In mathematics, one particular theory has become
well known due to its automorphism group, the Fischer-Griess monster, and by the Fields medal
for Borcherds. Currently research on the many mirror symmetric cases makes rapid progress,
whereas the explicit construction of Kähler-Einstein metrics is just at a planning stage. In physics,
conformal field theory became essential for the study of continuous phase transitions in condensed 
matter physics, but the most important applications concern string theory. Even independently 
of its status as the best candidate for a theory of quantum gravity, string theory has become 
an important tool in quantum field theory. In particular, it yields a map from two-dimensional 
conformal theories to quantum field theories in higher dimensions, even to rather realistic examples 
in four-dimensional spacetime. 

It will be argued that conformal field theory can satisfy all mathematicians who want to un- 
derstand quantum field theory without giving up their standards of clarity and rigour. In return, 
many of its important aspects need advanced mathematical techniques. Some still remain out of 
the reach of physicists, others are handled in a manner which produces lots of undiscovered errors 
in the literature. A serious involvement of mathematicians will yield much firmer foundations for 
the things to come. 

Apart from conformal field theory, this article will discuss some string theory. The perturbative 
aspects of the latter are well understood by now and should be easily accessible. Indeed, their bulk 

* Email: werner<§av2w0 1 .physik.uiii-bonn.de 





just constitutes a direct application of conformal field theory. By themselves, these aspects do 
not yet transcend the old principles of 20th century physics, but they are incomplete and seem to 
point in one unique direction of progress. There, non-perturbative string theory (or M-theory, or 
however one chooses to call it) has started to emerge and should lead to a new insight. Its birth
might be a lot easier if mathematicians again get into the habit of acting as good midwives.

There is no lack of good will. Many conferences have been attended by mixed groups of 
mathematicians and physicists. Most importantly, Princeton has brought together many of the 
best people of both communities in a dedicated program. Nevertheless, one sometimes gets the 
feeling that a new generation will be needed to overcome the difficulties. 

It may help to recognize what the main obstacles have been, since a discussion of past successes 
and mistakes can prepare the way for future achievements. In any case, the present article is 
supposed to cover the early stages of the discoveries. Of course, history is complex, full of turbulence 
and countercurrents, such that precise statements would need more hedging or much more research. 
The historical content of the present article should therefore be taken as a signpost, not as a map. 
Its indications will be approximate, but may be helpful. Moreover, a low density of formulas should 
make the article accessible to a wider readership. Methodological qualms of the historians will be 
brazenly ignored. If they deny that it makes sense to ask what might have happened, what about
an advanced quantum computer which reruns history, starting from a reasonable subspace of initial
conditions?

An invitation for mathematicians to cross the bridge also needs some technical parts, however. 
Mathematicians still think of quantum field theory as a useful source of ideas (cf. the Seiberg- 
Witten equations), but otherwise as impenetrable, though some related structures are regarded 
as good mathematics (topological quantum field theory, probably conformal field theory). From
the point of view of a physicist, this is a strange attitude. By nature, conformal field theory
was presented in the context of quantum field theory as a whole, so this is the way it should 
be discussed. Nature has a habit of posing such problems, recall infinitesimals and differential 
equations. Again, we should have confidence in her guidance. 

The article starts with a short introduction to the history of quantum field theory. For readers 
who want to get a fuller picture and the necessary references, there are good reviews and reprint 
volumes at different technical levels [Schwinger 1958, Pais 1986, Crease and Mann 1987]. In those 
reviews the aim is to show how nature was explained. Here, mathematically well-built structures
will be regarded as equally important for further progress. We shall see that the tools to build the
bridge were available at the end of the 60's, maybe even ten years earlier.

Because of the focus on conformally invariant theories, other issues of mathematically rigorous 
quantum field theory, like the work of Glimm and Jaffe on superrenormalizable theories will not 
be discussed, however. Even for the main themes of the article, the selection is partial. Current 
algebras are one of the major themes of conformal field theory, but here the collaboration between 
mathematicians and physicists has been very fruitful for twenty years, and there are already
nice reviews, e.g. [Goddard and Olive 1988], so the topic will receive less emphasis than would be 
necessary in a complete survey. Renormalization will be discussed in some detail, since most math- 
ematicians regard it as the major stumbling block which prevents an understanding of quantum 
field theory. Thus it may help to see that this procedure is rather easy from a mathematical point 
of view and follows a well known idea of the 19th century. Path integrals will not be mentioned. 
In quantum field theory as a whole the corresponding ideas have not yet found a satisfactory form, 
and even in conformal field theory one needs a discussion in terms of categories and operads, which 
only seems simple to the very good or the very young. The discussion of string theory is limited to 
the perspective of conformal field theory. A serious consideration of its non-perturbative aspects 
would have to start with a description of instantons and solitons. Many connections with confor- 
mal invariance could be explained, but not within the scope of the present article. Altogether, 
it is unavoidable that many readers will have reasons to complain, but at least they should feel 
encouraged to do better. 





2 After a golden age 

A theoretical physicist who looks back to the beginning of the past century has reasons to feel 
rather humble. A few decades witnessed three revolutionary insights in structures of nature, namely 
special relativity, quantum mechanics and Einstein’s theory of gravity. We still have to embed 
these in a unified theory. Meanwhile, we covered much territory in the study of the complexities 
of nature, but further understanding of her basic features proceeds at a snail’s pace. 

Evidently, the initial rapid progress relied on a close interaction between physicists and math- 
ematicians. We cannot even talk about those three structures without alluding to this fact. Just 
think of the Poincaré group, Hilbert space and Riemannian geometry. 

In the first years of the century, many physicists felt doubtful or uneasy about the importance 
of contemporary mathematical methods. For example, Mittag-Leffler struggled in vain to get a 
Nobel prize for Poincaré out of the old establishment. Around 1920, the movement had become 
irresistible, however. Einstein’s Nobel prize document avoids mentioning special or general relativity, 
but this was hardly more than a funny detail. Everyone went to Göttingen, Weyl solved 
the Schrödinger equation for the hydrogen atom, and Hilbert was an eager competitor when Einstein 
approached the final form of his equation for gravity. Of longer lasting importance was the 
clarification of the mathematical structure of quantum mechanics by v. Neumann and Weyl. Von 
Neumann had to immerse himself in physics because of the bomb and the computer, but Weyl’s 
publication list in physics journals is impressive, too, and his exchange of ideas with Einstein and 
Pauli was particularly fruitful. 

Mathematicians made contributions of three kinds. Sometimes they solved concrete problems 
which the physicists found too difficult, but this was rare. More importantly, the internal logic 
of mathematics had led to the discovery of deep structures which found unexpected applications. 
Finally, the analysis of discoveries made in physics uncovered new mathematical worlds and allowed 
the physicists to think more clearly and efficiently about their own results. 

It is no surprise that the latter task attracts some of the best mathematical minds. Already 
in 1900, Hilbert saw that a renewed interest in physics would be productive, and he put the 
development of good axioms for mechanics as the sixth problem on his famous list. Twenty-five 
years later, classical mechanics had the necessary clear conceptual basis to allow the wonderful 
emergence of quantum mechanics. In contrast, the infinite dimensional spaces of classical field 
theory remained less well understood. This contributed a bit to the confusion about quantum field 
theory, as we shall see. 

When a science is restructured during and after a major advance, it is particularly important to 
put what has been known before in a new context and to provide it with deeper foundations. Con- 
formal invariance in physics emerged in this way, though at first its place seemed to be marginal. 
Its discovery was triggered by special relativity. This theory had underlined the importance of 
symmetry groups and stimulated a new mathematical look at Maxwell’s equations. Cunningham 
[1910] and Bateman [1910] determined the maximal group of symmetries of the latter and discovered 
their conformal invariance. In other words, Maxwell’s system of equations is invariant 
under the maximal group of spacetime transformations which preserve angles, but not necessar- 
ily distances. Addressing mathematics and physics audiences, F. Klein repeatedly asked for an 
explanation, but in vain. 

Some progress became possible with the work of E. Noether [1918]. She was a creator of modern 
abstract algebra and only had a marginal interest in physics, but was motivated by Einstein’s grav- 
ity theory to explain the general connection between symmetries of Lagrangians and conservation 
laws. Prompted by Klein to analyse the problem with Noether’s method, Bessel-Hagen determined 
the conserved quantities corresponding to the conformal invariance of Maxwell’s equations [1921]. 
In the following years, several other differential equations were investigated. Most importantly, 
Pauli proved that the Dirac equation is conformally invariant when the mass vanishes [1940]. 





By that time, Hodge had found the tools for a deeper analysis [1941], which needed much 
longer to make headway into the physics community. He had investigated topological questions like 
Poincaré duality from the point of view of differential forms. In this language, the electromagnetic 
field can be described by a 2-form $F$. In the absence of charges, Maxwell’s equations take the form 

$$dF = 0, \qquad d{*}F = 0 .$$

This simple form even applies to the curved spacetime of Einstein’s theory. Indeed, the differential 
operator $d$ does not depend on any kind of metric structure. The Hodge duality operator $*$, which 
acts linearly on differential forms, depends on the metric. When applied to $k$-forms in a spacetime 
of $2k$ dimensions, however, the dependence on the distance scale cancels out. In particular, in 
the most important case of four-dimensional spacetime, $*$ only depends on the angles, in other 
words on the conformal structure. Moreover, the Lagrangian density of the electromagnetic field in 
empty space is given by the integral over $F \wedge *F$. Since the metric again appears only through the 
Hodge star operation, the Poisson brackets derived from this Lagrangian have the same conformal 
invariance. This remains true for the corresponding quantum system, but in general conformal 
invariance is broken when one introduces charges. 
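
The cancellation of the distance scale described above can be checked in a few lines. The following is only a sketch under conventions that the text does not spell out: $g$ denotes the metric of an $n$-dimensional spacetime, $\Omega(x) > 0$ an arbitrary local rescaling, $\omega$ a $p$-form, and $\epsilon$ the Levi-Civita symbol with entries $0, \pm 1$. All of these symbols are introduced here for illustration, and overall sign conventions (which vary in Lorentzian signature) play no role in the scaling argument.

```latex
\begin{align*}
(*_g\,\omega)_{\mu_{p+1}\dots\mu_n}
  &= \frac{\sqrt{|\det g|}}{p!}\;
     \epsilon_{\nu_1\dots\nu_p\,\mu_{p+1}\dots\mu_n}\,
     g^{\nu_1\rho_1}\cdots g^{\nu_p\rho_p}\,
     \omega_{\rho_1\dots\rho_p} , \\
g \to \Omega^2 g : \qquad
\sqrt{|\det g|} &\to \Omega^{n}\,\sqrt{|\det g|} , \qquad
g^{\nu\rho} \to \Omega^{-2}\, g^{\nu\rho} , \\
*_{\Omega^2 g}\,\omega &= \Omega^{\,n-2p}\; *_g\,\omega , \\
n = 2k,\ p = k : \qquad
*_{\Omega^2 g}\,\omega &= *_g\,\omega .
\end{align*}
```

For $n = 4$ and the 2-form $F$ the exponent $n - 2p$ vanishes, so neither the equation $d{*}F = 0$ nor the expression $F \wedge *F$ notices the rescaling. In any other dimension a power of $\Omega$ survives, which is why the statement is special to middle-degree forms.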

Eventually, this approach turned out to be very productive for physics, see e.g. [Atiyah, Hitchin, 
Singer 1978], but mainstream physicists learned about it only in the 70’s, largely through the efforts 
of Atiyah, who had heard Hodge’s lectures as a student. Indeed, in the 30’s a fault-line between the 
communities of the physicists and the mathematicians had started to develop. It must have been 
hard to spot at the time, since there were greater and more immediate concerns. Within physics, 
the split between theoreticians and experimentalists now became complete. Einstein still had done 
moderately respectable experimental work, but Heisenberg’s PhD exam was a near disaster, since 
he had concentrated all his efforts on theory. Some misgivings of the experimentalists were quite 
natural. Though there was no reason to expect that mathematical physics would regain the prestigious 
position it had had in the times of Newton, Euler and Lagrange, the incredible popularity 
of Einstein and the difficulties of his theories must have suggested to many that something had 
gone wrong. 

Finally, from the 30’s to the 50’s physics was hopelessly entangled in far more important events 
(fascism, the war, the bomb, Stalin, McCarthy, ...). This was a worldwide phenomenon, which left 
little refuge. It had one important positive aspect, however. Due to the international contacts, 
progress in physics was no longer the prerogative of Europe or North America. Most visibly, Japan 
and India had started to take part. 

All of this left little room for concern about the spreading rift between theoretical physics and 
mathematics. Around 1970, however, it had become too big to overlook. In his Gibbs Lecture, 
Dyson put it bluntly: “the marriage between mathematics and physics ... has recently ended in 
divorce” [Dyson 1972]. 

The main culprit seemed to be quantum field theory. Here is a fairly typical quotation from a 
highly regarded textbook: “The mathematically inclined reader undoubtedly by now will have had 
serious misgivings about the validity and meaningfulness of the renormalization program, since 
this program has at its point of departure a set of meaningless equations which it then proceeds to 
manipulate according to rules which are outside the bounds of conventional mathematics to obtain 
(presumably) finite results (not to mention the fact these prescriptions, as outlined in the present 
chapter, are applicable only to the power series expansion of the ’meaningless equations,’ which 
power series expansion in all probability does not converge!)” [Schweber 1964, p. 645]. 

It is clear that something had gone wrong. In a sense, one may put the blame on nature, 
since she gave ambiguous directions. We considered the discoveries of special relativity, quantum 
mechanics and Einstein’s theory of gravity, but it is somewhat misleading to talk about them on 
a par, since the three theories do not occupy the same logical level. Einstein’s name for his theory 
of gravity was general relativity, because compatibility with the principles of special relativity was 
incorporated from its inception. Thus the task of unification would be finished, if one could join 
gravity and quantum mechanics in one move. From today’s point of view, this problem was too 
difficult and led into a thick fog. 

Instead one could follow the geometrical path indicated by Einstein’s gravity theory. This was 
natural for mathematicians, but not immediately productive for physics. Nevertheless, the search 
in these directions provided a favorable environment for the development of gauge theories, as we 
shall see later. 

For physicists, a different path was indicated by nature. After quantum mechanics had matured 
around 1926 (the year of the Schrödinger equation), the next fundamental problem was to put 
together quantum theory and special relativity. The essential guidance came from the experiments, 
whereas the mathematical structures remained rather obscure. In favorable circumstances, it 
still might have been possible to advance together, but many of the links between mathematics 
and physics were broken by the war. When a deeper study of the weak and strong interactions 
led to gauge theories, a convergence of the two paths was indicated, but this came too late for 
a reestablishment of the old contacts. In this sense the lack of mathematical accessibility of 
relativistic quantum field theory is rather a consequence of the separation of mathematics and 
physics than its cause. 

Let us come back to the perspective of 1926. Classical physics deals with rigid bodies and with 
fields. Now the former were to be regarded as a low velocity approximation, since extended rigid 
bodies are incompatible with special relativity. When an object is touched, it cannot be affected 
all at once, since this would surpass the speed of light. 

Rigid bodies often had been approximated by point particles. Now they had to be considered 
in terms of pointlike constituents. In one space dimension, the latter can interact by collisions, but 
in more dimensions this makes little sense. Thus the only available candidates for the description 
of interactions in the real world were field theories. Conversely, the discovery of special relativity 
depended on an analysis of Maxwell’s equations for the electromagnetic field. In quantum physics, 
matter in the form of point particles was easily incorporated in this frame, since the Schrödinger 
wave function could be regarded as the avatar of a relativistic field. Thus the unification of special 
relativity and quantum theory demanded the formulation of quantum field theory. 

These ideas were well understood in the 1920’s. They came very naturally, since already the 
first steps of quantum mechanics were guided by quantum field theory: the first formula for 
a quantal phenomenon was Planck’s radiation law for the electromagnetic fields emitted by a 
heated black body. Thus quantum electrodynamics took shape immediately after the birth of 
modern quantum mechanics, in a 1927 paper by Dirac and, in more appropriate guise, in a paper 
by Heisenberg and Pauli in 1929. In its initial form, it was sufficient for a calculation of the 
semiclassical electromagnetic processes observable at that time. 

Soon, however, quantum field theory was put in doubt by experimental results and problems of 
consistency. For experimentalists, further work on the unification of special relativity and quantum 
theory posed a single basic challenge - study particle interactions at velocities close to the speed 
of light. Here early researchers were confronted with a bewildering wealth of data, from nuclear 
physics to cosmic rays. It took a long time until things were sorted out to reveal the underlying 
structures. In particular, it was far from obvious that the experimental results could be described 
by any kind of quantum field theory. For a long time, electromagnetism was the only interaction 
for which it made real sense. 

For mathematicians and physicists alike, this greatly diminished the attractiveness of quantum 
field theory. Most importantly, it contributed to the persistent but unproductive expectation of 
another revolution in the foundations of physics. In view of the previous decades, this expectation 
was quite understandable. Quantum mechanics had been developed for atomic physics; particle 
physics might need something equally revolutionary and exciting. Oppenheimer even gave a number: 
don’t believe the old ideas beyond 100 MeV. For a while, there was a concrete reason for 
this attitude: Yukawa had predicted a particle of 100 MeV to explain the strong interaction, but 
when it seemed to be discovered, most of its other properties were wrong. Eventually, muons and 
pions were distinguished and the paradoxes dissolved away, but the basic attitude surfaced again 
on many occasions. 

Quantum electrodynamics itself set other obstacles against the joint development of quantum 
field theory in a common effort of physicists and mathematicians. In particular, one immediately 
had to face one of the old problems of classical physics, namely the infinite energy in the electric 
field of point charges. In classical physics, spreading out the charge yielded a temporary excuse, 
but special relativity and quantum mechanics demanded the consideration of point charges, such 
that a clash was inevitable. There soon came a reason for hope, however. Against his expectations, 
Weisskopf (with a little help from Furry) showed in 1934 that in quantum electrodynamics the pole 
divergence of the classical theory is replaced by a mild logarithmic one. In the following years, 
Kramers explained the basic principles of regularization and renormalization. Weisskopf apparently 
was slowed down by discontent about his small mistake, but in 1939 he published a clear argument 
which indicated that any intrinsic inconsistencies of quantum electrodynamics were many orders 
of magnitude away from the experimentally observable domain. 

Altogether, in the 1930’s the stage was set for the further development of the theory, and a 
concerted effort of physicists and mathematicians was not completely out of the question. There 
was no single overwhelming obstacle. Still, the effort would have demanded an unlikely amount 
of patience and persistence against many stumbling blocks. Quantum electrodynamics has the 
typical difficulties of a gauge theory, and mathematics was not quite ready to provide elegant tools 
for their resolution. For the physicists, experiments had not yet provided a compelling reason to 
put much effort in the study of the small quantum electrodynamical effects of higher order, and 
to some extent the bewildering features of the other interactions undermined the faith in quantum 
field theory as a whole. The mathematicians had no reason to invest much work in something 
which might not last. They still were busy to consolidate the advances of quantum mechanics 
and gravity theory. Above all, they saw no compelling internal mathematical reason to develop 
quantum field theory. In hindsight, v. Neumann’s operator algebras came close, but they hardly 
became part of the mathematical mainstream, and v. Neumann soon had more important things 
to do. 

Since the time for quantum electrodynamics was not yet ripe, joint mathematical and physical 
progress would have needed another stroke of genius. One possibility would have been the creation 
of a rigorously solvable but non-free toy model, and from today’s point of view conformally invariant 
theories in two dimensions were by far the best thing to be tried. Indeed, there was a little chance. 
Einstein had thought about conformal invariance, and Dirac got interested in 1936. At that time, 
Heisenberg just had started to work on quantum field theories with four-fermion interactions. 
These can be made conformally invariant in two dimensions, such that a joint effort might have led 
directly to the Thirring model and perhaps to its solution. After the war, Gürsey searched for a 
way to make Heisenberg’s four-fermion theory conformally invariant, in the line of thought which 
went back to Cunningham and Bateman. He didn’t think about two dimensions, however, and 
wrote down a four-dimensional version with a cube root, which is impossible to quantize [Gürsey 
1956]. For Dirac and Heisenberg, it is unlikely, too, that they considered playing around in two 
dimensions. Moreover, Dirac became increasingly discontent with quantum field theory as a whole. 

Many of Heisenberg’s efforts were still creative and successful, but his flirtation with mathematics 
was over. Still, his 1932 concept of a nucleon with two states, put four years later in the language 
of SU(2) invariance by Cassen and Condon, had initiated the group theoretic studies which from 
the 50’s onward became one of the major themes of particle physics. Here was perhaps a better 
chance for joint work with mathematicians, for whom such considerations soon became very natural. 
Some physicists also looked at related structures, even before the experimentalists found convincing 
reasons to study internal symmetry groups or gauge symmetries. 

Einstein had long decided to concentrate on classical gravity and electromagnetism and kept 
away from quantum theory and the nuclear interactions. He continued to work in the context of 
classical differential geometry and played around with five dimensions and connections with torsion. 
These efforts are not highly regarded nowadays, but they familiarized the physics community 
with the work of E. Cartan and gave much support to the Kaluza-Klein ideas of five-dimensional 
spacetime. Yang argued that Einstein somehow was looking for the gauge theory found in 1954 by 
him and Mills [Yang 1982]. Indeed, Einstein repeatedly contemplated parallel transport without 
the metric constraint of the Levi-Civita connection. 

In the five-dimensional line of research, O. Klein himself performed an amazing miracle by 
writing down the Lagrangian of SU(2) gauge theory during a 1938 conference in Warsaw. He only 
recognized the U(1) part of the symmetry and saw no clear physical applications, since charged 
vector mesons had not been found yet [Klein 1939, p. 93]. 

In 1953, Pauli rediscovered the same SU(2) gauge theory in a conceptually clearer way, when 
he pushed the Kaluza-Klein ideas one dimension higher and compactified two dimensions on $S^2$, 
such that the SO(3) symmetry became manifest. Pauli liked the result and described it in a letter 
to Pais. He did not publish, however, because he saw no mechanism to give mass to the gauge 
bosons. Together with Heisenberg he started to work on a fermionic Lagrangian with a four-fermion 
interaction, but he quickly saw that it made not much sense. Cut down from four to two dimensions 
it would have been transformed from a wrong unified theory to a fascinating mathematical toy. 
Altogether, Pauli and Weyl were probably the only ones of the pioneering giants who were both 
close to the mainstream and imaginative enough to push quantum field theory by inventing, e.g., a 
mathematically nice conformal field theory. In more fortunate times, Zurich might have witnessed 
such a step ahead, but it is hard to play in the shadow of war and persecution. 

With roots in a dominating wave of mood, the Nazi aversion against Jewish mathematics and 
physics had pervaded the German universities and the Göttingen environment was destroyed. The 
focus of research shifted from Europe to the USA. 


3 Progress in the face of mathematics 


After the war, the seminal event in the further development of quantum field theory was the Shelter 
Island conference. The decisive input came from the experimentalists, who made good use of the 
technology created in the years before. Their results implied that quantum electrodynamics had 
to be taken very seriously. Mathematicians were absent. Apparently, it had occurred to nobody 
that they might be of help. 

It seems that progress needed a new generation: Very attentive to the experiments, much less 
to new mathematical structures, conservative in its attachment to the old principles found in the 
golden age, careful and innovative in calculations. In Princeton, the lonely walks of Einstein and 
Gödel were parts of a different world, faint reverberations of revolutions in a distant past. The 
young mathematicians had happy times. They developed fibre bundles, connections, characteristic 
classes, the deformation theory of complex structures and many other nice things. They laid 
the groundwork for modern physics, and couldn’t care less. 

In 1947, work on interacting quantum fields started in earnest. Bethe explained the Lamb 
shift, and soon after Schwinger calculated the anomalous magnetic moment of the electron. The 
calculations still were done with the theoretical tools of the prewar period. They started from 
the quantum theory of free fields and introduced perturbations according to the standard rules 
of quantum mechanics. Since quantum mechanical perturbation theory makes no use of Lorentz 
invariance, this procedure compounded the intrinsic difficulties. Soon after, Schwinger and Feynman 
developed relativistically invariant formalisms, and comparison of quantum electrodynamics 
with the experiments became very successful. Stueckelberg and Tomonaga had done earlier work 
in this direction, unfortunately with less impact. 

These calculational procedures were correct, but were derived from wrong standard assumptions 
by dubious mathematical methods. Soon, the standard assumptions were proven to be wrong by 
a small group of mathematical physicists, whose work was based on an uncontested set of axioms. 
This caused some uneasiness among the calculators, and kept the mathematical community at a 
distance. To explain what happened, we have to consider some details of the Heisenberg and Pauli 
paper of 1929. They were the first to derive the equal time commutators of free fields. 

With respect to differentiation by space and time coordinates, quantum fields can satisfy differential 
equations. In the simplest case, the latter are linear, as for Maxwell’s equations. Such fields 
are called free, since a linear combination of two solutions describes two waves which pass through 
each other without mutual influence. In the language of today, one starts with the space of classical 
solutions of the linear differential equation. On this space one needs a symplectic structure, given 
by a Poisson bracket, or equivalently a Heisenberg Lie algebra. In most cases one has an invari- 
ance with respect to a time translation group, the generator of which is the energy. Polarization 
with respect to the sign of the energy yields the appropriate Hilbert space representation of the 
Heisenberg Lie algebra (Fock space). 

Apparently, in the 20’s the symplectic structure of the space of classical solutions had not yet 
been grasped in depth. Thus the historical procedure was slightly more complicated and involved a 
special and somewhat formal choice of the classical observables. For free fields, it yielded canonical 
commutation relations in perfect analogy with Heisenberg’s commutation relations 

$$[x_i, p_j] = i\,\delta_{ij}$$

and the vanishing equal time commutators $[x_i, x_j]$ and $[p_i, p_j]$. Analogously, for a free real scalar 
field $\phi$ satisfying the Laplace differential equation, the equal time commutators $[\phi(x), \phi(y)]$ and 
$[\partial_t\phi(x), \partial_t\phi(y)]$ both are zero. The remaining equal time commutator takes the form 

$$[\phi(x), \partial_t\phi(y)] = i\,\delta(x-y) .$$


Note that distribution theory was not yet developed. This caused no physical problems at all, 
but meant that the use of Dirac’s $\delta$ had no firm mathematical base yet. One may wonder whether this 
encouraged the physics community to ignore mathematical niceties. Of course, the mathematical 
justification was provided in the late 40’s by L. Schwartz, in a splendid case of interaction between 
the two communities. When this was done, one had a good description for free quantum fields as 
distributions over three-dimensional space. Once a fixed field $\phi$ is paired with a test function $f$ 
(physicists write $\int \phi(x) f(x)\, d^3x$), the result is an element of the Heisenberg Lie algebra and acts on 
the Hilbert space of the system. For real $f$ the operator is hermitean and describes an observable, 
as usual in quantum mechanics. 
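
As a small worked illustration of this pairing, one can smear the equal time commutator with two test functions. The names $f$, $g$ and the shorthand $\phi(f)$, $\dot\phi(g)$ are introduced here for illustration only; units with $\hbar = 1$ are assumed, in line with the commutator written above.

```latex
\begin{align*}
\phi(f) &= \int \phi(x)\, f(x)\, d^3x , \qquad
\dot\phi(g) = \int \partial_t\phi(y)\, g(y)\, d^3y , \\
[\phi(f), \dot\phi(g)]
 &= \int\!\!\int f(x)\, g(y)\, [\phi(x), \partial_t\phi(y)]\, d^3x\, d^3y \\
 &= i \int\!\!\int f(x)\, g(y)\, \delta(x-y)\, d^3x\, d^3y
  = i \int f(x)\, g(x)\, d^3x .
\end{align*}
```

The right-hand side is an ordinary number multiplying the identity, and no $\delta$ is left over. In this sketch the smeared fields therefore satisfy Heisenberg-type relations, with the pairing $\int f g\, d^3x$ playing the role of the Kronecker symbol.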

The analogy between particles and free fields given by passing from the Kronecker function $\delta_{ij}$ 
to Dirac’s $\delta(x - y)$ was compelling, but proved to be very misleading. Heisenberg’s commutation 
relations for $x_i, p_i$ remain valid when interactions are present. In contrast, Haag showed that an 
interacting field theory cannot have canonical commutation relations. Indeed, interacting fields 
cannot even be understood as distributions over three-dimensional space at fixed time. Time 
averaging is necessary, too [Haag 1955]. 

In a special relativistic context, this might not have come as a big surprise. There even was 
a paper by Bohr and Rosenfeld which argued that a careful analysis of measurements implies a 
spacetime average [1933]. The arguments were clear enough and the paper was never forgotten, 
but its somewhat obscure style missed its mark on most of the new generation. 

Instead, adherence to the canonical commutation relations for quantum fields remained pervasive 
in the physics literature till recent times, in spite of the fact that everyone knew it was wrong. 
Most probably, it was much more this attitude than the difficulties of renormalization which made 
it impossible for mathematicians to digest the intricate and important structures of quantum field 
theory. 





Historians will have to weigh this issue when the dust has settled. Despite what has been 
said, they hardly can find a better starting point than the following classic quotation: “In the thirties, 
under the demoralizing influence of quantum-theoretic perturbation theory, the mathematics 
required of a theoretical physicist was reduced to a rudimentary knowledge of the Latin and Greek 
alphabets.” (Jost) [Streater and Wightman 1964, p. 31]. 

The insistence of the physics community on using a wrong basis for successful calculations 
would be easy to understand, if no alternative formalism had been available. Due to Schwinger 
and Dyson, this was not the case. Dyson had read much mathematics and brought clarity of 
thinking to the muddled field. By 1949, Schwinger and Dyson had started to analyse quantum 
fields in terms of the n-point functions (or rather distributions) $\langle T\,\phi(x_1,t_1)\cdots\phi(x_n,t_n)\rangle$. Here for 
an operator $A$ the real or complex number $\langle A\rangle$ is its expectation value in the vacuum state of 
the Hilbert space, and the analogous notation applies to distributions. The time ordering imposes 
the condition $t_1 > \ldots > t_n$ on the support of the test functions. Moreover, in 1951 Schwinger 
published his action principle, which describes how an n-point function varies when one changes 
the parameters of the interaction. 

Thus most of the theoretical tools were ready. On reading the tributes to Schwinger published 
after his death [Ng 1996], it seems that some obstacles to progress were personal. Schwinger had 
been a prodigy and the centre of attention. Apparently, he didn’t mind that his calculations 
remained almost incomprehensible. All that changed after 1948. In Schwinger’s own words: “Like 
the silicon chip of more recent years, the Feynman diagram was bringing computation to the 
masses” [Schwinger 1983, p. 343]. Dyson had a particularly clear understanding of the issues: 
“The advantages of the Feynman theory are simplicity and ease of application, while those of 
Tomonaga-Schwinger are generality and theoretical completeness” [Dyson 1949, p. 486]. Schwinger 
forbade his students to mention Feynman or Dyson, or to use Feynman graphs. From a European 
perspective it seems that Einstein and Weyl would have had more reasons for grudges against 
Hilbert and Schrödinger, but one has to respect a difference of culture. 

In 1953, the Wightman axioms [Streater, Wightman 1964] were presented in lectures at Prince- 
ton. They were something of a mixed blessing. On one hand, they allowed clear proofs of structural 
statements, in particular of Haag’s insight that the canonical commutation relations are wrong for 
interacting theories [Haag 1955]. On the other hand, the axioms sacrificed the connection to the 
concrete quantum field theories which were under development. 

One technical detail needs comment. The Wightman axioms concern n-point distributions 
$\langle\phi(x_1,t_1)\cdots\phi(x_n,t_n)\rangle$, but without time ordering. This seems mathematically convenient, for example 
when one wants to take Fourier transforms. Nevertheless, for contact with the experiments, 
the time ordering is natural. This became particularly clear with the LSZ formalism of Lehmann, 
Symanzik and Zimmermann, which provided a direct calculation of the results of scattering experiments 
in terms of the time ordered distributions. Different time orderings correspond to different 
experiments. 

The three authors were members of Heisenberg’s group, which attracted most of the young 
people who wanted to work on elementary particles in postwar Germany. Unfortunately, Heisenberg 
was hardly interested in mathematics and too occupied by his world formula to have much regard 
for the LSZ achievements. When Lehmann returned from the States, Heisenberg greeted him: “Na, 
Herr Lehmann, wie geht’s der Mathematik?” (how is mathematics?), an episode which Lehmann 
never forgot. So much for the superiority of European culture. 

As an aside, any Third World country which wants to strengthen her scientific basis would be 
well advised to do a few case studies. The decline of physics in Germany is particularly interesting. 
One cannot put all of the blame on fascism, since mathematics did not suffer the same fate after 
the war, largely due to the achievements of Hirzebruch. 

The n-point distributions made mathematical sense, but were difficult to deal with. The next 
big advance was the introduction of the euclidean formalism, as discussed in [Osterwalder 1973]. 
Early on, Dyson had recognized that some calculations become much easier when one performs 
an analytic continuation to imaginary values of time (Wick rotation). The gestation of the idea 
took most of the 1950’s, with contributions from Wick, Nakano and, in condensed matter physics, 
Matsubara [1955]. It first appears in complete form in papers of Schwinger. 

In his 1993 lecture in Nottingham [Ng 1996], Schwinger states that it could have been published
any time after 1951, but in fact “The Euclidean Structure of Relativistic Field Theory” appeared
in 1958. Schwinger made an analytic continuation of the time-ordered n-point distributions to
purely imaginary values of time. As Wightman had seen already, the analytic continuation allows
one to consider the distributions as boundary values of ordinary analytic functions. Thus Schwinger’s
idea allows one to describe physics by functions on some D-dimensional euclidean space instead of
distributions with test functions over D-dimensional spacetime. By that time, mathematical physicists
had mastered the difficulties of distribution theory, so that the due expression of relief was rather
muted. Often, the euclidean n-point functions are regarded as distributions, too, but the present
article will not follow this habit.

As usual nowadays, Schwinger’s euclidean n-point functions will just be written in the form
$\langle\phi(x_1)\cdots\phi(x_n)\rangle$, where the $x_i$ now denote points in D-dimensional euclidean space. These
functions are real analytic and defined everywhere except on the partial diagonals $x_i = x_j$. Since there
is no causal structure in euclidean space, the necessity of time ordering disappears. Accordingly,
the functions are symmetric under permutation of the $x_i$. If one considers several fields $\phi_1, \phi_2, \ldots$,
one has instead

$\langle A_1\,\phi_i(x_i)\,\phi_{i+1}(x_{i+1})\,A_2\rangle = \langle A_1\,\phi_{i+1}(x_{i+1})\,\phi_i(x_i)\,A_2\rangle$ ,

where the $A_k$ stand for products of fields at points different from $x_i$, $x_{i+1}$. In spacetime, all possible
time orderings can be reached by analytic continuations starting from the same euclidean n-point
function, a fact called crossing symmetry.

Since the choice of quantum field theories is quite limited, their n-point functions should be 
special functions with very interesting properties. Not much is known about them, however. For 
free theories, they vanish unless n is even, in which case they reduce to sums of products over 2-point 
functions. The latter are variants of Bessel functions. For conformal field theories, one obtains 
functions of hypergeometric type. In some other cases in two dimensions, at least the 2-point 
functions are under good numerical control, but little is known about their analytic properties. It 
is quite possible that some examples will yield functions of Painlevé type. Unfortunately, interest
in special functions was at a low ebb in the past century, but this certainly will change again.
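
The statement about free theories (the n-point function vanishes for odd n and is a sum over products of 2-point functions for even n) can be made concrete with a short sketch. It is an illustration only: the function names, the restriction to a bosonic field on a line, and the power-law 2-point function $|x - y|^{-2h}$ are assumptions made for simplicity and are not taken from the text.

```python
def pairings(points):
    """Yield every way of splitting the list `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def two_point(x, y, h=0.5):
    # Assumed toy 2-point function |x - y|^(-2h); purely illustrative.
    return abs(x - y) ** (-2.0 * h)


def free_n_point(points, h=0.5):
    """n-point function of a free bosonic field as a sum over pairings."""
    if len(points) % 2 == 1:
        return 0.0  # odd n-point functions of a free field vanish
    total = 0.0
    for pairing in pairings(list(points)):
        term = 1.0
        for a, b in pairing:
            term *= two_point(a, b, h)
        total += term
    return total
```

For n = 4 there are three pairings and for n = 6 there are fifteen, so the number of terms grows as (n − 1)!!.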

Most quantum field theories have free parameters. The latter take values in some differentiable 
manifold which is called moduli space. Accordingly, the n-point functions can be differentiated with 
respect to these parameters. Let $\partial_\lambda$ be a tangent vector in moduli space. According to Schwinger’s
action principle, each tangent vector corresponds to some field t(x), such that formally

$\partial_\lambda \langle\phi(x_1)\cdots\phi(x_n)\rangle = \int \langle t(x)\,\phi(x_1)\cdots\phi(x_n)\rangle\, d^Dx$ .

The expression is formal, since the integral diverges when x approaches one of the $x_i$ and needs to
be regularized.

In general, there is no easy way to normalize the field $\phi$. Of course, the canonical commutation
relations would have provided a natural normalization, but they are wrong. When one changes
the normalization by some factor $f(\lambda)$, the derivative of the n-point function changes by a term
proportional to $\langle\phi(x_1)\cdots\phi(x_n)\rangle$. If the divergence of Schwinger’s integral is of exactly this type,
the freedom of normalization can be used to cancel it. This is the renormalization procedure, which
will be discussed in more generality below.

In principle, vector fields can be integrated, such that Schwinger’s action principle should
allow one to recover the moduli space from any of its regular points by higher order derivatives and
the summation of the Taylor expansion. In many practical cases, however, the only explicitly 
known points of the moduli space lie at the boundary, where the space is no longer regular. As a
consequence, the Taylor expansion is only asymptotic. This problem can be avoided for conformal 
field theories, but it will be mentioned again in the context of string theory. 

Many moduli spaces do not have a natural metric, such that the integration of a vector field 
has to follow an arbitrary smooth curve. Equivalently, one can choose local coordinates, also
known as a renormalization scheme. Indeed, without a metric on moduli space, the perturbing field
t(x) does not have a natural normalization. Typically it lives in some finite dimensional vector 
bundle over moduli space which includes mass perturbations and coupling constant perturbations. 
When one takes higher order derivatives of the n-point functions, all of these parameters have to 
be considered together, which requires mass and coupling constant renormalizations of t(x). The 
finite ambiguities of the latter are fixed by the choice of a renormalization scheme. Changing them
leads to a different curve for the integration.

If one wants, one can include the constant field 1 in the vector bundle, but since one wants
$\langle 1\rangle = 1$ it is usually more convenient to require $\langle t(x)\rangle = 0$. This is called the renormalization of
the vacuum energy density.

In the 50’s, renormalization was well understood on a computational level, but before Wilson’s 
work in the late 60’s the concepts were not particularly clear. Nevertheless, the time was ripe for 
the first quantum field theory which was not free and made complete mathematical sense. 


4 Thirring model: Conformally Invariant QFT Is Born 


In 1958, W. Thirring published a paper with the title “A Soluble Relativistic Field Theory” (in
Mathematical Reviews, it was described by Raychaudhury, Calcutta). The paper kept the promise
of its title. Let me quote a few sentences: “In spite of the great efforts of many people the
mathematical structure of relativistic quantum fields is still in the dark. ... In order to study
those (features) we propose in the present paper a model of a relativistic field theory... Since the
reduction of the number of fields does not simplify the problem sufficiently ... one has to take
recourse to a reduction of the dimensionality of the problem... Thus the simplest nontrivial case
seems to be a one-dimensional Fermi-field with an interaction $\lambda\bar\psi\psi\bar\psi\psi$. Although the problem is
of considerable complexity it turns out to be soluble. ... (The model) shows explicitly what a
relativistic theory can look like. Furthermore it can serve as a testing ground for field theorists.”

All of this is true. Perhaps the most remarkable part is the courage to do something simple in 
two dimensions. Here Thirring was inspired by the investigation of many-body systems in terms 
of the Bethe ansatz. In two dimensions, one can get interactions by collisions only, without fields. 
This knowledge led to the correct conjecture that the model would be solvable. Thirring also 
made some entirely correct remarks about Heisenberg’s unified four-fermion interaction theory in 
four-dimensional spacetime, which may have contributed to some tension between Munich and 
Vienna. Indeed, despite the fact that part of Thirring’s work had been done at MIT and at the
IAS, Princeton, one almost gets the impression that the creation of the model was a provincial 
non-event. The leading soluble model of the time was due to Lee (1954) and not relativistic. 
Thirring’s remark about the Lee model in his 1958 paper is not particularly deferential, but in his 
textbook with Henley [1962] he gives it two chapters, whereas his own model does not even seem 
to be hinted at. Schweber’s 1964 textbook doesn’t cite it either. 

Nevertheless, some of Schwinger’s former students had paid attention, and Johnson from MIT
devoted a paper to the model [1961]. I quote from the introduction: “Thirring has proposed a two
dimensional ... model which is of some interest because its exact solubility enables one to study
some of the general conjectures which have been proposed in regard to the behaviour of local
relativistic fields. In spite of the model, no general solutions have been proposed which are free
from possible criticism because of the rather formal manner in which they have been obtained.”
In the conclusion, Johnson states: “We have shown how it is possible to solve the two dimensional
model of Thirring by making use of the existence of the two vector density conservation laws. ...
It was shown how it is possible to define the products of the singular operators in order
to determine other covariant operators but that these singular field products do not satisfy the
equal time commutation relations with the field operators $\psi(x)$, that one would obtain by means
of the canonical commutation relations.” Again, all of this is correct. Still, some mathematical
problems were left, but they were settled in the subsequent years.

Let us describe the model in more detail. It is obtained by perturbing the theory of a massless 
complex fermion in two dimensions. In the euclidean formulation, the Dirac equation reduces to 
the Cauchy-Riemann equation and its complex conjugate. Real and imaginary parts of the fermion
yield two holomorphic fields $\psi_i(z)$ and two anti-holomorphic fields $\bar\psi_i(\bar z)$, $i = 1, 2$. At this point
in moduli space, the two conserved vector densities mentioned by Johnson are $j(z) = \psi_1(z)\psi_2(z)$
and $\bar j(\bar z) = \bar\psi_1(\bar z)\bar\psi_2(\bar z)$. The conservation equations are the Cauchy-Riemann equation for $j$ and
its conjugate for $\bar j$. The 2-point functions have the form

$\langle j(z_1)\,j(z_2)\rangle = (z_1 - z_2)^{-2}$

and analogously for $\langle\bar j\,\bar j\rangle$, whereas $\langle j\,\bar j\rangle = 0$.

In terms of Schwinger’s action principle, the perturbation corresponds to the field $t = j\bar j$. It
turns out that the n-point functions of $j$ and $\bar j$ are unaffected by the perturbation. In particular,
the two fields and their product t have a natural continuation over Thirring’s moduli space and 
need no renormalization. Moreover, the conservation equations do not change, which accounts for 
the solvability of the model. 

The special properties of $j$, $\bar j$ arise because they are currents, i.e. quantum analogues of the
conserved densities which arise by Noether’s theorem from continuous symmetries. Because of
their close relation to observable quantities, they behave similarly to free fields. This led to
the concept of current algebra. In two dimensional theories, the currents of simple Lie groups
generate the corresponding affine Kac-Moody algebra, at least when space is compactified to a
circle. Unfortunately, the mathematical potential of current algebras was not realized for many
years. The work of Kac and Moody in 1967 was independent of physics. In the context of string
theory, it was introduced in the physics literature by the mathematicians Lepowsky and Wilson
[1978] and again by G. Segal [1981], and became a rare success story of physics and mathematics
in cooperation.

The Thirring model fields $\psi_i$ do not remain holomorphic under the perturbation by $j\bar j$. Instead,
one obtains

$\langle\psi_i(z_1)\,\psi_j(z_2)\rangle = (z_1 - z_2)^{-1}\,|z_1 - z_2|^{-s}\,\delta_{ij}$ ,

where the real number s changes under perturbation. Under the conformal transformation
$z \mapsto z' = (az + b)/(cz + d)$ with $ad - bc = 1$, $\psi(z) \mapsto (cz + d)^{-1}\,|cz + d|^{-s}\,\psi(z')$, the two-point functions
remain invariant. This remains true for all the n-point functions, such that the Thirring model
is a conformally invariant theory. Initially, this seems to have been overlooked, and only the
special case of invariance under scale transformations $z \mapsto \lambda z$ was commented upon. This is a bit
surprising, since in these years Thirring was very much concerned with conformal invariance. In 
the important 1962 paper where Gell-Mann introduced current algebra to the theory of the strong 
interactions, he acknowledges that Thirring introduced him to the conformal group. Moreover, 
conformal invariance had become an issue between Munich and Vienna. There was little internal 
logic in this local turbulence, but it turned out to be important and may be of interest to historically 
inclined people. 
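
The invariance of the Thirring two-point function under fractional linear transformations can be checked numerically. The sketch below is a hedged illustration: it assumes the two-point function has the form $(z_1 - z_2)^{-1}|z_1 - z_2|^{-s}$ and that the field picks up the matching factor $(cz + d)^{-1}|cz + d|^{-s}$; the normalization of the exponent s and all function names are assumptions, not the author's notation.

```python
def two_point(z1, z2, s):
    """Assumed two-point function (z1 - z2)^(-1) * |z1 - z2|^(-s)."""
    diff = z1 - z2
    return diff ** -1 * abs(diff) ** (-s)


def mobius(z, a, b, c, d):
    """Fractional linear map z -> (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)


def weight(z, c, d, s):
    """Factor picked up by the field at z under the transformation."""
    return (c * z + d) ** -1 * abs(c * z + d) ** (-s)


def transformed_two_point(z1, z2, a, b, c, d, s):
    """Two-point function of the transformed fields."""
    w1 = mobius(z1, a, b, c, d)
    w2 = mobius(z2, a, b, c, d)
    return weight(z1, c, d, s) * weight(z2, c, d, s) * two_point(w1, w2, s)


# Example coefficients with a*d - b*c = 1 (chosen arbitrarily).
A, B, C, D = 2.0, 1.0 + 1.0j, 1.0j, 0.5j
```

The check rests on the identity $z_1' - z_2' = (z_1 - z_2)/((cz_1 + d)(cz_2 + d))$, which holds whenever $ad - bc = 1$.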

Heisenberg had developed an interacting spinor theory for all of particle physics and pushed
it for many years, though it made no sense. At the time, the new quantum number of strangeness
demanded an explanation. Due to Noether, an invariance of the theory had to be found. Heisenberg
tried scale invariance, though the theory has a length scale and a non-compact group has a hard
time yielding discrete quantum numbers. The contemporary fashion for negative norm states, also
present in the Lee model, gave some hope for a cure [Dürr 1959].

In Vienna, Cunningham and Bateman were remembered and Wess used Heisenberg’s attempts 
as justification for the resurrection of the conformal group. In a brief remark, he hinted at a possible 
use of the conformal group at high energies. Otherwise, he showed in a few pages that Heisenberg 
had missed the mark [Wess 1960]. Since several of the few good young German theoreticians had
flocked around Heisenberg, the paper triggered new interest in the conformal group, and Kastrup 
started to work on it, though Heisenberg did not pay much attention. Kastrup published papers 
on the possible importance of conformal invariance at high energies. During a visit to Russia, he 
explained it to Polyakov, as acknowledged in the first paper of the latter on conformal symmetry 
[1970]. This paper showed that scale invariance implies full conformal invariance. On the other 
side of the Atlantic, in his historic paper on the short distance expansion, Wilson ascribes the idea 
of scale invariance at short distance to Kastrup and his student Mack [Wilson 1969]. The fact that 
scale invariance implies full conformal invariance was recognized by Callan, Coleman and Jackiw, 
slightly before Polyakov’s work and in a different context [1969]. On the physical relevance of scale 
and conformal invariance, they cite 1969 papers by Mack and Salam and by Gross and Wess. 

Wilson’s short distance expansion was the main concept which still was lacking for a rigorous 
and calculationally efficient description of quantum field theory. It concerns the behaviour of the 
n-point functions along their singularities. Wilson considered them in Minkowskian spacetime,
but the euclidean case is much easier. 

It has been mentioned that the euclidean n-point functions $\langle\phi(x_1)\cdots\phi(x_n)\rangle$ are not well defined
on the partial diagonals $x_i = x_j$. In general, the functions diverge on these diagonals. For a free
field $\phi$ of dimension h, the leading term at $x_1 = x_2$ is proportional to $|x_1 - x_2|^{-2h}\langle\phi(x_3)\cdots\phi(x_n)\rangle$.
The case of several different fields needs a bit more discussion, but is not complicated either.

There had been some speculation on the corresponding behaviour for interacting fields. One 
idea was that the singularity might be the same as for free fields. In 1964 Wilson conjectured that 
perturbations just introduce some logarithmic corrections. This was wrong, but one of Wilson’s 
talents was to talk to the right people for correcting mistakes. In particular, he had crucial discus- 
sions with Johnson, who familiarized him with the Thirring model. Wilson learned that the latter 
indeed is scale invariant, but that the dimension h changes with the strength of the interaction. 
Independently, the same modification to Wilson’s original ideas was made by Lowenstein. 

Wilson was a mainstream theorist on the way to a Nobel prize, but he did not fear to go against 
the tide: “The assumption that integrating an operator over space only gives an observable is a
basic tenet of canonical field theory... The assumption has been rejected by axiomatic field theory
from the beginning” [Wilson 1970, p. 1484]. In the same paper, he discusses a related issue and
concludes: “The axiomatic view must in the end replace the popular view” [p. 1483]. It seems
that the time was ripe to discuss all of quantum field theory in terms of statements which are at 
least potentially true. 

Before we discuss other aspects of Wilson’s work, let us continue the history of the Thirring 
model. At the end of the 60’s, string theory was invented and soon it was recognized that conformal 
field theory is an essential ingredient [Galli 1970]. Halpern recognized the importance of the
Thirring model in this context and informed Virasoro, who gave it publicity [1971]. A comparative 
investigation of the Thirring model and string physics in the context of conformal field theory was 
made by Ferrara, Grillo and Gatto [1972]. 

By 1974, it had become popular to elucidate the properties of quantum field theory by a study 
of two-dimensional examples. A particularly interesting one was the sine-Gordon model, which de- 
scribes a bosonic scalar field with trigonometric interaction term. Coleman wrote an elegant and 
deep paper where he showed that the perturbation by a fermion mass term makes the Thirring 
model isomorphic to the sine-Gordon model [1975]. This took everyone by surprise, since super- 
ficially the two models look entirely different and equally impenetrable in a strict mathematical 
sense. 

In hindsight, people remembered that the equivalence between fermions and bosons in two
dimensions had been prefigured by Skyrme [1958,1961], but Skyrme had been too far ahead to
have an immediate impact. 

After Coleman’s paper, at last, one leading mathematician was shocked enough to take things 
seriously. G. Segal regarded the mass term as an unessential complication and concentrated on 
the boson-fermion equivalence. This was Coleman’s starting point and concerns an isomorphism 
between two conformally invariant theories. Initially, Segal felt quite sure that boson-fermion
equivalence made no sense. When it turned out in the late 70’s that the equivalence leads to
a combinatorial identity known already to Euler, a dam had been broken. Segal developed a
beautiful system of axioms for conformally invariant quantum field theories in two dimensions and
transformed the latter into a legitimate field of study for mathematicians [Segal 1988]. But even
in their book on loop groups [Pressley, Segal 1986, p. 215] the authors state that a mathematically
clear formulation of the isomorphism between the massive Thirring model and the sine-Gordon
model still seems not to have been found. 


5 Nature’s helping hand 


The long delay in the gestation of a correct theory of quantum fields would have been even longer 
without some direct help from nature. One reason is that the investigation of two-dimensional toy 
models was not taken very seriously by the particle physicists. Here is a quotation from a paper 
which reports the discovery of a fundamental property of the Thirring model: “The results are of
interest ... because they allow one to see very readily (a) why the Thirring model is solvable and
(b) why it has trivial physical consequences. As will be clear from the following, the solvability of 
this model depends critically on the fact that it is a 2-dimensional model. It is not likely that any 
of the specific features of this model can be generalized to more realistic cases, or that they will 
provide a useful guide to the state of affairs in the real world” [Callan, Dashen and Sharp 1968, p. 
1883]. 

Indeed, the highly non-trivial physical consequences of such conformal field theories in the 
context of string theory could not have been guessed in 1967. No wonder that the authors
permitted themselves some sloppiness in the analysis: “At this point, one could introduce the Fock
representation for the scalar field, annihilation and creation operators, etc., and verify in detail 
that the energy and momentum operators have the expected properties, but there is little to be 
gained by going over these well-known details” [p. 1885]. This was a missed opportunity. For 
example, the commutation relations for the energy-momentum tensor given in the paper miss the 
central extension of what is now called the Virasoro algebra. What would have happened, if some 
interested mathematics student had tried to digest the paper? 

Since it seems that no mathematicians were interested, it was very kind of nature to provide her 
own motivation for the study of such models. In the 50’s, physicists were confronted unexpectedly 
with a rich class of quantum field theories in condensed matter laboratories, which turned out to be
conformal field theories in the real world of two-dimensional surface coatings or three dimensional 
liquids. 

After Feynman’s breakthrough in 1948, his graph methods soon were transferred to other fields 
of physics. Their application in condensed matter physics was pioneered by Salam [1953] and 
Matsubara [1955]. In particular, Matsubara recognized the perfect analogy of imaginary time and 
temperature, due to the relation between the time translation exp(iHt) in quantum mechanics and
the Boltzmann factor exp(−H/T) in statistical mechanics.
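
Matsubara's analogy amounts to the substitution t = i/T, under which exp(iHt) becomes exp(−H/T). A minimal numerical sketch follows, assuming a Hamiltonian that is already diagonal with a handful of made-up energy levels; the function names and the levels are illustrative and not from the text.

```python
import cmath
import math


def time_evolution_trace(energies, t):
    """Trace of exp(i H t) for a diagonal H with the given energy levels."""
    return sum(cmath.exp(1j * energy * t) for energy in energies)


def partition_function(energies, temperature):
    """Sum of Boltzmann factors exp(-E / T) over the energy levels."""
    return sum(math.exp(-energy / temperature) for energy in energies)


# Hypothetical spectrum and temperature, chosen only for the comparison.
LEVELS = [0.0, 1.0, 2.5]
TEMPERATURE = 0.7

# Evaluating the time evolution at imaginary time t = i / T
# reproduces the partition function.
imaginary_time_value = time_evolution_trace(LEVELS, 1j / TEMPERATURE)
```

In units where the Boltzmann constant and ħ are 1, the imaginary part of `imaginary_time_value` vanishes and its real part agrees with `partition_function(LEVELS, TEMPERATURE)`.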

When continuous phase transitions were studied, it turned out that the analogies went much 
deeper. At the critical temperatures, the behaviour of the materials is dominated by long range 
fluctuations of arbitrary scales, and the details of the molecular structure become unimportant. 
The theory approaches a continuum limit. The correlation functions of the limiting theory behave 
exactly like the euclidean n-point functions of quantum field theory. In this way, many statistical
systems at continuous phase transitions are related to quantum field theories in spacetime by 
analytic continuation. 

Thus nature herself had declared that the Wick rotation introduced by Schwinger makes good 
sense. Of course, the dimensions of the observed examples are different, since the phase transitions 
happen in two or three dimensional systems, whereas spacetime has four dimensions. Moreover, 
the natural constraints on the field theories are not the same. Quantum field theories need a 
probability interpretation, which is realized by positive scalar products. Under Wick rotation, this 
becomes Osterwalder-Schrader positivity, which is not a necessary property of phase transitions. 
On the other hand, statistical observables are given by real numbers. This real structure yields a 
time reversal invariance of the corresponding quantum field theory, a property not shared by all 
examples and only approximately true in nature. On a purely mathematical level, these difficulties 
are not particularly serious, however. 

Lab experiments on phase transitions were much cheaper than particle physics with high energy
accelerators. Moreover, there were no worries that a breakthrough in the domain of the funda- 
mental laws was necessary. Thus progress was rather steady, both on the experimental and the 
theoretical side. Soon it became clear that the physics at the critical phase transition point is scale 
invariant [Kadanoff 1966]. Much of the relevant work on these euclidean quantum field theories 
was done in the Soviet Union, and Polyakov was one of the most important contributors. He found 
convincing arguments that scale invariance implies full conformal invariance at the critical point 
and recognized that this invariance allowed a calculation of the 3-point functions up to a constant 
factor [Polyakov 1970]. 

Further developments depended on the analysis of a soluble example in the context of statistical 
mechanics. This might have been provided by the Thirring model, which had occurred in its bosonic 
description, and was called the gaussian model. Because of the somewhat misleading simplicity 
of the bosonic formulation, the subtle features of its fermionic fields were not recognized in this 
context. Instead, the Ising model played a role for the study of continuous phase transitions which 
was parallel to the one of the Thirring model for particle physicists. 

The states of the Ising model assign a number 1 or −1 to each site of a rectangular lattice. These
numbers are called values of the Ising spin. Pairs of nearest neighbours have an interaction energy
which depends on the product of their Ising spins. The total energy E is given by a sum over
the interaction energies of such pairs. The thermodynamic partition function at temperature T
is given by the average of exp(−E/T) over all states.

Rectangular lattices can be considered in various dimensions. The thermodynamic functions 
for the linear or one dimensional model are very easy to calculate. The problem was given by Lenz 
as part of a PhD thesis to a rather weak student, who did not do any later scientific work. One 
hardly can imagine an easier way to lasting fame. The two-dimensional model, where the Ising 
spins sit on a square lattice, was solvable but very hard. The breakthrough calculation was due 
to Onsager [1944]. There is a critical temperature where the model turns into a rather simple 
euclidean quantum field theory. In particular, at this point the spin waves of the model satisfy 
the two-dimensional Dirac equation for free massless fermions, as first noted by Kadanoff [1969]. 
This equation is conformally invariant, as in the more complicated four-dimensional situation. In 
contrast to the complex fermion of the Thirring model, the fermion field of the Ising model is real. 
In this sense, the Ising model at its critical temperature has half as many degrees of freedom as 
the Thirring model. 
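
The remark that the one dimensional model is very easy can be made concrete. The sketch below compares a brute-force sum over all spin configurations with the closed form obtained from the 2×2 transfer matrix. It rests on assumptions not stated in the text: a periodic chain, the energy convention $E = -J \sum_i s_i s_{i+1}$, and a plain sum of Boltzmann factors rather than an average (the two differ by the factor $2^N$). All names are illustrative.

```python
import itertools
import math


def energy_ring(spins, coupling=1.0):
    """Energy -J * sum_i s_i s_{i+1} of a periodic chain of Ising spins."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def partition_brute_force(n, temperature, coupling=1.0):
    """Sum of exp(-E/T) over all 2^n spin configurations."""
    return sum(
        math.exp(-energy_ring(spins, coupling) / temperature)
        for spins in itertools.product((1, -1), repeat=n)
    )


def partition_transfer_matrix(n, temperature, coupling=1.0):
    """Closed form from the transfer matrix eigenvalues 2cosh(K), 2sinh(K)."""
    k = coupling / temperature
    return (2.0 * math.cosh(k)) ** n + (2.0 * math.sinh(k)) ** n
```

The brute-force sum costs $2^N$ operations while the transfer matrix result is immediate; in two dimensions the transfer matrix is no longer 2×2, which gives a sense of why Onsager's calculation was so much harder.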

The two-dimensional Ising model is not just a theory of free fermions, however. The average 
values of the Ising spins turn into a field with scaling dimension 1/8. This result proved to be a 
highly non-trivial check which uncovered the failures of many calculational methods. 

Now two different conformally invariant quantum field theories were available, the Ising model 
in statistical mechanics and the Thirring model in conventional relativistic quantum field theory. 
They were studied with very much the same theoretical tools, in particular the short distance expansion.
Wilson discovered it in 1964 in the Minkowskian context, Polyakov and Kadanoff in 1969 in
the euclidean. Polyakov called it correlation coalescence, Kadanoff reduction hypothesis. Wilson 
called it operator product expansion, and this terminology has survived, because it clearly has the 
priority. In the context of statistical mechanics it is not appropriate, however, since there are no 
operators around. Since it has advantages to have a unique name in both contexts, we use the 
common synonym short distance expansion. 

A scale invariant n-point function of type $\langle\phi(x)\phi(y)A\rangle$ has a leading singularity at x = y
proportional to $|x - y|^{-2h}\langle A\rangle$, where h is the scaling dimension of $\phi$. When this leading singularity is
subtracted, the next term behaves like $|x - y|^{h_1 - 2h}\langle\chi_1(y)A\rangle$. Here $\chi_1$ is some other field of scaling
dimension $h_1 > 0$, which can be measured in the way just described. Subtracting this subleading
term one finds $|x - y|^{h_2 - 2h}\langle\chi_2(y)A\rangle$, where $\chi_2$ now has a larger scaling dimension $h_2 > h_1$. The
procedure can be repeated as far as one wants to go. One will find an infinity of fields of ever
higher scaling dimension. Note that the $\chi_i$ are independent of the fields included in A.

One can apply the same procedure to other n-point functions like $\langle\phi(x_1)\chi_1(x_2)\cdots\rangle$ and so on
and produce as many new fields as possible. The short distance expansion now states that for any
real number $h_0$ there is only a finite number of linearly independent fields of scaling dimensions
$< h_0$. This property can be verified in many concrete examples and may very well be taken as
part of the mathematical definition of a quantum field theory. 

Lattice systems are scale invariant at the exact temperature of a continuous phase transition. 
When the temperature is changed a bit, the correlations will show an exponential decay at large 
distances. When one is sufficiently close to the critical temperature, the corresponding correlation 
length is still very large compared to the distance between neighbours. With a suitable limiting 
procedure, one obtains the n-point functions of a quantum field theory which is no longer
conformally invariant. In this case, more complicated expressions than pure powers of $|x - y|$ will occur in the
n-point functions. At the very least, one expects logarithmic correction factors. Nevertheless, the 
basic idea of the short distance expansion applies as before. 

Let us consider a euclidean n-point function $\langle\phi(x)\chi(y)A\rangle$, where A is a product of local fields
at positions different from x, y. An experimentalist may study the behaviour of this function when
x approaches y. Each such measurement can be interpreted as the measurement of some field
at y. This is the physical content of the short distance expansion. We can axiomatize it in the
following way. Let $\Gamma(y)$ be the vector space of germs of functions which are defined near y, but
not at the point y itself. We give a topology to this space by using $o(|x - y|^s)$, $s \in \mathbb{R}$, as a basis of
neighborhoods of 0 in $\Gamma(y)$. Let $\gamma$ be an element of the dual of $\Gamma(y)$. Then for each pair of fields
$\phi$, $\chi$ and each $\gamma$ there must be a field $\psi$ such that $\gamma\langle\phi(x)\chi(y)A\rangle = \langle\psi(y)A\rangle$ for arbitrary A. One
just can write $\gamma(\phi(x), \chi(y)) = \psi(y)$.

Consider the vector space F of all fields of a quantum field theory. This vector space is filtered 
by the scaling dimension. Let F(h) be the subspace of all fields of scaling dimension less than or
equal to h. We assume that these subspaces are finite dimensional. We also assume that the
theory has some degree of asymptotic scale invariance. More precisely, $\gamma(\phi(x), \chi(y)) \in F(h_1 + h_2 + h_0)$ when
$\phi \in F(h_1)$, $\chi \in F(h_2)$ and $\gamma$ vanishes on $o(|x - y|^h)$ for $h > h_0$. This condition will be important
for renormalizability. Finally, dim F(h) should not increase faster than for free theories. In two
dimensions, this yields $\log(\dim F(h)) = O(\sqrt{h})$.
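
The growth rate for free theories in two dimensions can be illustrated by counting. The sketch below assumes the simplest case, a single chiral free boson, for which the number of fields of integer dimension n is taken to be the number of partitions p(n); this choice of model, and the function names, are assumptions made for illustration and are not spelled out in the text.

```python
import math


def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)], the numbers of integer partitions."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p


def dim_filtered(h, p):
    """Assumed dim F(h): number of fields of dimension at most h."""
    return sum(p[: h + 1])


def growth_ratio(h, p):
    """log(dim F(h)) / sqrt(h), which should stay bounded as h grows."""
    return math.log(dim_filtered(h, p)) / math.sqrt(h)
```

By the Hardy-Ramanujan asymptotics, log p(n) grows like $\pi\sqrt{2n/3}$, so the ratio computed by `growth_ratio` stays bounded, in line with the stated $O(\sqrt{h})$ behaviour.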

In this way one obtains a nice algebraic structure which is well adapted to calculational pur- 
poses. It does not contradict the Wightman axioms, but emphasizes quite different aspects. 
Whereas those axioms concentrate on one field, or maybe a few, the short distance expansion 
considers all possible fields at once. For mathematicians, this is certainly the more natural pro- 
cedure. To some extent, it eliminates the surprise one first feels about the equivalence of the 
sine-Gordon and the massive Thirring model, since in the latter one immediately has to include 
its bosonic fields, too.


6 Regularization and renormalization 

With the help of the short distance expansion, it is rather easy to put renormalization in a standard 
mathematical frame. First we have to generalize the change of normalization of the fields which 
we considered above. Instead, we will use all the linear transformations of F which conserve the 
subspaces F{h). The group of these linear transformations will be called L{F). 

We want to consider a perturbation of some theory. In accordance with Schwinger’s action
principle, the deformation is described by a field t{x). We shall see that in a spacetime of D 
dimensions, the scaling dimension of t must be D or less. 

The corresponding derivative of an n-point function $\langle\phi_1(x_1)\cdots\phi_n(x_n)\rangle$ is given by
$\int d^Dx\,\langle t(x)\phi_1(x_1)\cdots\phi_n(x_n)\rangle$. The integral behaves well at infinity, but diverges when x approaches
one of the $x_i$. Thus we regularize it by excluding a small neighborhood of size $\epsilon$ around each $x_i$
from the integration domain. Let us denote the resulting integral by $\int_\epsilon$.

The idea of renormalization means that the divergence can be absorbed by a redefinition of
the fields. Such a redefinition is given by a linear transformation in L(F) of the fields which maps
every subspace F(h) into itself. Using Wilson’s short distance expansion, one sees easily that there
are transformations $f(\epsilon) \in L(F)$ such that

$\int_\epsilon d^Dx\,\langle t(x)\phi_1(x_1)\cdots\phi_n(x_n)\rangle - \sum_{i=1}^{n} \langle\phi_1(x_1)\cdots(f(\epsilon)\phi_i)(x_i)\cdots\rangle$

has a well defined limit when $\epsilon$ goes to zero. Indeed, any divergent contribution $\gamma$ to the integral
near $x_i$ vanishes when the n-point function behaves as $o(|x - x_i|^{-D})$, such that $\gamma(t(x)\phi_i) \in F(h_i)$,
when $h_i$ is the scaling dimension of $\phi_i$.

The transformation $f(\epsilon)$ is only defined up to addition of a finite linear transformation in L(F).
Any choice defines a connection on the filtered vector bundle F over the moduli space. Altogether, 
we now have well defined first derivatives in the moduli space of a quantum field theory. The 
calculation gets harder when one looks at higher derivatives, since the perturbing field t(x) will
have to be renormalized, too, but this is just a technical difficulty. 

As one sees, renormalization is nothing particularly problematic. On the contrary, regularization
of divergences has a long history in mathematics. For example, the Weierstrass product
formula for entire functions needs the regularization of an infinite product. Let us consider it in
more detail. One wants a product formula for an entire holomorphic function P(z) with zeros
exactly at given positions $z_i$, i = 1, 2, ..., more precisely a function with divisor $\sum_i z_i$.

The sequence $z_i$ must have no accumulation point in the Gauss plane. When the number of zero
positions is finite, the product $\prod_i (z - z_i)$ will do. The most general function with this divisor is
$\exp(f(z)) \prod_i (z - z_i)$, where f(z) is an arbitrary entire function.

Now let us consider the case of an infinite number of positions. Factoring out a power of z if
necessary, we may assume that none of the $z_i$ is zero. Let us formulate Weierstrass’ solution in
terms of the language of quantum field theory. We regularize the problem by restricting the set
of zeros to $z_i$, i = 1, ..., N. Then we order the $z_i$ in accordance with their absolute value and
renormalize the function to the form

$$P_N(z) = \exp(f_N(z)) \prod_{i=1}^{N} (1 - z/z_i),$$

such that the limit $\lim_{N\to\infty} P_N$ is finite.

The situation in quantum field theory is quite analogous. The cut-off by $\epsilon$ is analogous to
the cut-off by N, the achievement of convergence by the renormalization transformation $f(\epsilon)$ is
analogous to the multiplication by $\exp(f_N)$. In renormalizable quantum field theories, fixing a
finite number of parameters is sufficient to determine the n-point functions of a given finite set
of fields. In the case of the Weierstrass products, this is analogous to the situation where it is
sufficient to take for the $f_N$ polynomials of fixed order r. In this case, one can normalize P by
demanding that P(0) and the first r derivatives of P at z = 0 have prescribed values. This means
that the solution P has r + 1 free parameters. For r = 0, the solution is

$$P(z) = P(0) \lim_{N\to\infty} \prod_{i=1}^{N} (1 - z/z_i).$$


For r = 1 one obtains 


$$P(z) = P(0) \exp(zP'(0)/P(0)) \lim_{N\to\infty} \prod_{i=1}^{N} (1 - z/z_i) \exp(z/z_i),$$


and so on. 
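
The cases r = 0 and r = 1 can be checked numerically. The sketch below is an added illustration, not part of the original argument. It relies on two textbook examples chosen here as assumptions: $\sin(\pi z)/(\pi z)$, with zeros at ±1, ±2, ..., for r = 0, and $1/\Gamma(1+z)$, with zeros at -1, -2, ..., for r = 1, where P(0) = 1 and P'(0)/P(0) equals Euler's constant. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # equals P'(0)/P(0) for P(z) = 1/Gamma(1+z)


def weierstrass_product(z, zeros, r, p0=1.0, dlog_p0=0.0):
    """Truncated Weierstrass product for r = 0 or r = 1.

    r = 0: P(0) * prod_i (1 - z/z_i)
    r = 1: P(0) * exp(z P'(0)/P(0)) * prod_i (1 - z/z_i) exp(z/z_i)
    """
    if r not in (0, 1):
        raise ValueError("only r = 0 and r = 1 are sketched here")
    value = p0 * (math.exp(z * dlog_p0) if r == 1 else 1.0)
    for zi in zeros:
        factor = 1.0 - z / zi
        if r == 1:
            factor *= math.exp(z / zi)  # convergence factor
        value *= factor
    return value


def zeros_of_sine(n_pairs):
    """Zeros +1, -1, +2, -2, ... ordered by absolute value."""
    zeros = []
    for n in range(1, n_pairs + 1):
        zeros.extend((float(n), float(-n)))
    return zeros


def zeros_of_inverse_gamma(n_zeros):
    """Zeros -1, -2, ... of 1/Gamma(1+z)."""
    return [-float(k) for k in range(1, n_zeros + 1)]


z, N = 0.5, 20000  # N plays the role of the cut-off

sine_approx = weierstrass_product(z, zeros_of_sine(N), r=0)
sine_exact = math.sin(math.pi * z) / (math.pi * z)

gamma_approx = weierstrass_product(
    z, zeros_of_inverse_gamma(N), r=1, dlog_p0=EULER_GAMMA
)
gamma_exact = 1.0 / math.gamma(1.0 + z)

# Without the convergence factors the same zeros give a product growing like N**z.
unregularized = weierstrass_product(z, zeros_of_inverse_gamma(N), r=0)

print(sine_approx, sine_exact)
print(gamma_approx, gamma_exact)
print(unregularized / N**z, gamma_exact)
```

In the second example the unregularized product grows like $N^z$ with the cut-off; the factors $\exp(z/z_i)$ together with the prefactor $\exp(zP'(0)/P(0))$ absorb this growth, which is the role the text assigns to the renormalization transformation.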

When polynomials do not suffice, the number of free parameters becomes infinite. Quantum 
field theory is simpler, since the latter case does not seem to have an analogue. Moreover, quantum 
field theories are far more constrained than entire functions, since they only have a finite number 
of parameters, in contrast to the infinite set of the zi. 

For conformal field theories, the Weierstrass product formula is more than a far-fetched analogue,
since many correlation functions involve Jacobi’s theta-functions or Dedekind’s $\eta$-function.
Examples will be given below. Many important properties of these functions are best understood 
by their product formulas. 

As one sees, regularization and renormalization are perfectly standard mathematical proce- 
dures. Their unfamiliar context was bound to cause some delay in understanding, but it is hard 
to comprehend how a delay of many decades could come about. 

7 Structure of Conformally Invariant Theories


One important way to deform a quantum field theory has not been introduced so far. One can 
change all n-point functions by a simple rescaling of the distances. When this change can be 
compensated by a transformation in L(F), the theory is called scale invariant. More generally,
the change is equivalent to such a transformation in addition to a change of the parameters of the 
theory. Infinitesimally, this equivalence is expressed by the Callan-Symanzik equation. 

When a deformation should respect some symmetry, the corresponding field t(x) must be invariant
under the symmetry group. In particular, this is true for Lorentz invariance. Indeed,
our formalism does not require Lorentz invariance and can easily be adapted to quantum field
theories on general spacetimes. One just has to replace the vector space F of fields by a bundle
over spacetime. Let us conserve translational invariance, however, such that fields can be transported
in canonical ways between arbitrary points of spacetime. When some component $g_{\mu\nu}$ of
the Riemannian metric is changed in a translationally invariant way, the corresponding field t is
the component $T^{\mu\nu}$ of the energy momentum tensor. For a rescaling of the distances, this yields
$t = T^\mu_\mu$. For a scale invariant theory this means that the trace of the energy momentum tensor
vanishes. Moreover, the integral $\int d^Dx\, T^{\mu\nu}(x)$ must not depend on the distance scale, which means
that the scaling dimension of the energy momentum tensor is equal to D.

Scale invariant quantum field theories are conformally invariant, too. This implies that the 
three point functions are known explicitly. The four-point functions reduce to functions of a single 
variable. Such theories have a good chance to be solvable in a rather explicit form, but for theories 
in more than two dimensions, the situation is still rather unclear. Nevertheless, recent developments 
indicate that these theories are important, too [Maldacena 1998, Witten 1998]. Suppose that you 
have a quantum field theory in k-dimensional Minkowski space which admits a deformation to the
corresponding Anti-de-Sitter space. Recall that this is a homogeneous space of negative spatial
curvature, with symmetry group SO(k - 1,2). Anti-de-Sitter space has a (k - 1)-dimensional
boundary at infinity with a conformal structure, on which SO(k — 1,2) acts as the group of 
conformal transformations. When one takes suitable limits of the n-point functions, the theory in 
Anti-de-Sitter space reduces to a conformally invariant theory in a space of one lower dimension. In 
principle, the higher dimensional theory can be recovered from the boundary theory by techniques 
of algebraic quantum field theory [Rehren 1999]. 

Perhaps this procedure can be iterated. In this way, the properties of theories in higher di- 
mension would be encoded in conformal field theories in two dimensions. This possibility is due 
to the typically quantum field theoretical fact that there is more freedom to construct conformal 
theories than higher dimensional quantum field theories in homogeneous spaces. In other words, 
the moduli spaces in higher spacetime dimensions have lower dimensions as manifolds, and can be
embedded in the moduli spaces of quantum field theories in lower spacetime dimensions. As we 
shall see, string theory also performs such an encoding. It would be interesting to see if the two 
encodings are related. 

In the following, we will only consider conformal field theories in two dimensions. The amount
of technical details will just about suffice to put string theory in context. For a history of the
crucial years 1984-88 and the relations to statistical mechanics, see [Itzykson, Saleur, Zuber 1988], 
which contains many references. A recent textbook is [di Francesco, Mathieu, Senechal 1997]. 

When one starts with a Minkowskian conformal field theory in flat spacetime, Wick rotation 
yields a euclidean theory on the Gauss plane. By conformal invariance, it is possible to compactify
it to a theory on the Riemann sphere. As symmetry group, one obtains the group of linear rational
transformations $z \mapsto (az + b)/(cz + d)$ of the Riemann sphere. This will be the symmetry group
of the n-point functions. 

In two dimensions, the energy momentum tensor is a symmetric 2 × 2 matrix. Because of scale
invariance, its trace vanishes, such that it has only two independent components. By the Noether 
theorem, they are conserved quantities. More precisely, one linear combination is holomorphic, 
another anti-holomorphic. These are the famous Virasoro fields, which were first discovered in 
string theory [Virasoro 1970]. Their short-distance expansions are fixed by conformal invariance. 

The symmetry transformations $z \mapsto \lambda z$ introduce a change of the n-point functions which can
be compensated by a linear transformation in L(F). In most cases of interest, this transformation
can be diagonalized. When a field transforms as $\phi \mapsto \lambda^{h} \bar\lambda^{h'} \phi$, we say that $\phi$ has conformal
dimensions (h, h'). When $\lambda$ is real, we have a rescaling transformation. Thus h + h' is the scaling
dimension of $\phi$. When $|\lambda| = 1$, we obtain a rotation, with an action described by the conformal
spin h - h'. Since a rotation by $2\pi$ is trivial, the conformal spin must be integral for bosonic fields.
For holomorphic fields, h' = 0. Since the scaling dimension of the energy momentum tensor is 2,
its holomorphic component has conformal dimensions (2,0) and its anti-holomorphic component 
has conformal dimensions (0,2). 

One could proceed in a purely algebraic way, completely within the framework for quantum 
field theories which was described above. Instead, let us shorten the path by some geometric 
intuition. Let us look at some holomorphic transformation $z \mapsto f(z)$ of a neighborhood of z = 0.
Locally, this is a symmetry, since it does not change the angles. When f(0) = 0, it induces a
transformation in L(F), since F can be considered as the space of fields at the point 0. The action
of the transformations $z \mapsto \lambda z$ on a field $\phi$ of conformal dimensions (h, h') can be described by
stating that the form $\phi(z)(dz)^{h}(d\bar z)^{h'}$ is invariant. If this remains true for all f, the field $\phi$ is
called primary. The primary fields span a subspace of F. If this subspace is finite dimensional,
the corresponding conformal field theory is called minimal. The Ising model is minimal and has a 
three dimensional subspace of primary fields, but the Thirring model is not minimal. 

The short distance expansion of a holomorphic field $\phi$ of conformal dimensions (h, 0) on an
arbitrary field $\chi$ is a Laurent expansion, since it depends holomorphically on z. We write it in the
form

$$\phi(z)\chi(w) = \sum_n (z - w)^{n-h} (\phi_n \chi)(w).$$

For all integers n, this defines linear operators $\phi_n$ on F. They are called the Fourier components
of $\phi$. When $\chi$ has conformal dimensions (h, h'), then $\phi_n \chi$ has conformal dimensions (n + h, h').
We regard F as graded by the conformal dimensions and see that $\phi_n$ is an operator of degree
(n, 0). The action of local conformal transformations on F is given by the linear operators $L_n$, $\bar L_n$
obtained from the holomorphic and anti-holomorphic Virasoro fields.

For holomorphic fields $\phi$, $\chi$ the fields $\phi_n \chi$ are holomorphic, too, such that one obtains a new
algebraic structure [Zamolodchikov 1985, Borcherds 1986, Goddard 1989]. A standard name in the
physics literature is W-algebra, but mathematicians prefer to talk about vertex operator algebras.
The latter name has the advantage of a clear history in string theory, whereas the W seems to
be due to the accidental naming of some field as W(z) by Fateev and Zamolodchikov. Proposed
allusions to Weyl, Wigner or Wilson are apocryphal, but may justify the name, which has the 
advantage of being short. 

The field $\phi_h \chi$ is called the normal ordered product of $\phi$ and $\chi$. It is the first field which occurs
in the regular part of the short distance expansion. In the Thirring model, the currents $j$, $\bar j$ have
conformal dimensions (1,0) and (0,1). The Virasoro fields are given by the normal ordered products
$j_1 j/2$ and $\bar j_1 \bar j/2$ [Callan, Dashen, Sharp 1967]. When n < h, the field $\phi_n \chi$ occurs in the singular
part. It turns out that it can be described in terms of commutators $[\phi_n, \chi_m]$. Thus one part of
the operations of the W-algebra just describes a Lie algebra. For the components $L_n$ of the energy
momentum tensor this is the Virasoro algebra. It was discovered by Gelfand and Fuks [1968] and is
a central extension of the Lie algebra of vector fields on a circle. The value of the central extension
is universally called $c$ for the holomorphic Virasoro field and $\bar c$ for the anti-holomorphic one. In
many models, they are equal. The values of c for the minimal models lie in a countable set. All
of them have c < 1, whereas the Thirring model has c = 1. When $\phi$ is holomorphic and $\chi$ is
anti-holomorphic, then $[\phi_n, \chi_m] = 0$.
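
To illustrate what the central extension amounts to, the following sketch assumes the standard normalization $[L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12}(m^3 - m)\,\delta_{m+n,0}$, which is not written out in the text, and checks over a range of integers that this bracket satisfies the Jacobi identity. Representing algebra elements as dictionaries is an ad hoc choice for the example, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

CENTRAL = "c"  # key for the central element, which commutes with everything


def bracket(x, y):
    """Lie bracket of two elements given as {mode index or CENTRAL: coefficient}."""
    out = {}
    for m, a in x.items():
        for n, b in y.items():
            if m == CENTRAL or n == CENTRAL:
                continue  # the central element has vanishing brackets
            coeff = a * b
            if m != n:
                out[m + n] = out.get(m + n, 0) + coeff * (m - n)
            if m + n == 0:
                # assumed normalization of the central term: (m^3 - m)/12
                out[CENTRAL] = out.get(CENTRAL, 0) + coeff * Fraction(m**3 - m, 12)
    return {k: v for k, v in out.items() if v != 0}


def add(*elements):
    """Sum of algebra elements, dropping vanishing coefficients."""
    out = {}
    for element in elements:
        for k, v in element.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}


def L(n):
    """Basis element L_n."""
    return {n: Fraction(1)}


def jacobi_defect(a, b, c):
    """[L_a,[L_b,L_c]] + cyclic permutations; empty dict if the identity holds."""
    return add(
        bracket(L(a), bracket(L(b), L(c))),
        bracket(L(b), bracket(L(c), L(a))),
        bracket(L(c), bracket(L(a), L(b))),
    )


violations = [
    (a, b, c)
    for a, b, c in product(range(-6, 7), repeat=3)
    if jacobi_defect(a, b, c)
]
print("Jacobi violations:", violations)
print("[L_2, L_-2] =", bracket(L(2), L(-2)))
```

The coefficient stored under the key `"c"` multiplies the central element, so the last line shows the bracket of $L_2$ with $L_{-2}$ as $4 L_0$ plus one half of the central element.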

The action of the $\phi_n$ on the space F of all fields yields a representation of the W-algebra.
With some effort, the representations of a fixed W-algebra can be given the structure of a tensor
category, like the representations of a Lie algebra. The corresponding tensor product is called fusion
product. The representation on the holomorphic fields themselves is called the basic representation
and behaves as the neutral element under fusion.

Some W-algebras only have finitely many irreducible representations. These are called rational. 
In conformally invariant theories with rational W-algebras, all scaling dimensions are rational 
numbers. Such theories themselves are called rational, too. The minimal theories are characterized 
by the property that already the Virasoro part of the W-algebra has only finitely many irreducible
representations. It is sufficient to consider the holomorphic Virasoro field, since for the anti-holomorphic
one the situation is analogous. The properties of the Virasoro algebra only depend
on the central extension c. The most interesting values occur for those minimal models where all
the representations are unitary. This happens for $c = 1 - 6/(p(p+1))$, p an integer greater than 2. For
p = 3 one finds c = 1/2 and the Ising model.
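
For orientation, the first few values of this series can be tabulated. The sketch below evaluates $c = 1 - 6/(p(p+1))$ exactly. The conformal weights are computed from the Kac formula $h_{r,s} = ((r(p+1) - sp)^2 - 1)/(4p(p+1))$ with $1 \le r \le p-1$, $1 \le s \le p$, a standard result that is quoted here as an assumption and does not appear in the text. The function names are illustrative.

```python
from fractions import Fraction


def central_charge(p):
    """Central extension c = 1 - 6/(p(p+1)) of the unitary minimal model, p >= 3."""
    if p < 3:
        raise ValueError("p must be an integer greater than 2")
    return 1 - Fraction(6, p * (p + 1))


def kac_weights(p):
    """Distinct holomorphic conformal weights h_{r,s} (assumed Kac formula)."""
    weights = {
        Fraction((r * (p + 1) - s * p) ** 2 - 1, 4 * p * (p + 1))
        for r in range(1, p)
        for s in range(1, p + 1)
    }
    return sorted(weights)


for p in range(3, 7):
    weights = ", ".join(str(h) for h in kac_weights(p))
    print(f"p = {p}: c = {central_charge(p)}, h in {{{weights}}}")
```

For p = 3 the three weights 0, 1/16 and 1/2 correspond to the three primary fields of the Ising model, and the spin field with h = h' = 1/16 has scaling dimension 1/8.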

The first investigation of these questions was due to Mack and Lüscher. They found that c = 1/2
is the lowest possible value and that there is a gap above 1/2. Here is one of the rare cases where 
progress depended on difficult calculations performed by a mathematician. V. Kac determined the 
structure of the representations [1979], which later allowed Belavin, Polyakov and Zamolodchikov 
to determine the values of c for all minimal models [1984]. Soon afterwards, Friedan, Qiu and 
Shenker determined the unitary cases [1984]. 

The discovery of the minimal models and their explicit solution by Belavin, Polyakov, Zamolod- 
chikov was the breakthrough event in the history of conformal field theory. It quickly became clear 
that these models are beautiful and fundamental mathematical structures. For reasons which are
hard to understand in depth, very different kinds of such structures, from Platonic solids to singularities,
can be classified in terms of the ADE Dynkin diagrams. The same is true for the minimal
models [Cappelli, Itzykson, Zuber 1987].

Part of the excitement about these early publications came from the relationship to continuous
phase transitions in statistical mechanics. Besides the Ising model, many other well known continuous
phase transitions were recognized as minimal models, like the ones for the 3-state Potts
model, the tricritical Ising model, and the Lee-Yang edge singularity. Some of them have been
realized in the lab, and measurements agree very well with the theoretical calculations. 

Many properties of continuous phase transitions now fell into place. For example, some phase 
transitions are characterized by universal rational numbers, others have free continuous parameters. 
The former now are described by conformal field theories which have no conformally invariant 
deformations. In particular, this is true for the minimal models, like the continuum limit of the 
Ising model. When conformally invariant deformations exist, then they do not change c. The first 
example was Baxter’s eight vertex model, which at the critical point becomes isomorphic to the 
older but more difficult Ashkin-Teller model. They yield c = 1, as for the closely related Thirring
model. For a study of all unitary c = 1 models, see [Ginsparg 1988]. 

In some respects even simpler than the minimal models are those for which the Virasoro fields 
can be described in terms of normal ordered products of fields of conformal dimensions (1,0) and 
(0,1). Such fields are called currents, and the corresponding conserved integral quantities are called 
charges. The Thirring model is of this type, with currents $j$, $\bar j$ and single holomorphic and anti-holomorphic
charges $j_0$, $\bar j_0$. In more complex models where the two types of charges form simple Lie
algebras, the short distance expansion of the currents yields the corresponding affine Kac-Moody 
algebras [Goddard, Olive 1988]. 

The Thirring model yields the simplest continuous family of conformal theories and has c = 1, 
too. In its bosonic description, it is given by the statistical mechanics of maps to a circle. The
model is rational when the area of this circle is a rational number. This means that the set of 
points for which the model is rational is dense in the whole family. At one particular rational point, 
another continuous deformation is possible, which generates the moduli space of the Ashkin-Teller 
phase transitions. This is a first example of the rather intricate geometry of such moduli spaces, 
with many number theoretic aspects. As a first step, it would be important to know the rational
points of more complex moduli spaces, since rational theories have very explicit descriptions. So 
far, there are very few results. 

Within a moduli space of conformal theories, consider a perturbation by a field t(x). The
integral $\int t(x)\, dz\, d\bar z$ must be invariant under conformal transformations, such that t should be a
primary field of conformal dimensions (1,1). In the Thirring model, the field $j \bar j$ has these properties.
The dimension of the vector space of such fields counts the number of possible infinitesimal
deformations. Thus it is an upper bound on the dimension of the moduli space. For generic points 
of this space, one expects that the two dimensions are equal. For the Thirring model, they are 
both equal to 1. 

The short distance expansion is a local property of the theory. When one wants to calculate 
the n-point functions, one also has to specify a Riemann surface on which the fields live (in the 
language of algebraic geometry, an algebraic curve). The simplest case is the Riemann sphere. 
Here the n-point functions of holomorphic fields are just rational functions. For more general 
fields, the results are much more complicated. For example, the four-point functions of minimal 
models already yield hypergeometric functions. 

The Riemann sphere is unique, but more complicated Riemann surfaces (or equivalently algebraic
curves) have their own continuous parameters. For example, a torus is described by the ratio
$\tau$ of two independent periods. When these are correctly ordered and varied continuously, $\tau$ varies
over the upper complex half-plane. The latter is called the Teichmüller space of the curves with
torus topology. Points of Teichmüller space describe the same torus when they are related by a
different choice of periods. Changes of the periods are described by the modular group. This is
the group of linear rational transformations $\tau \mapsto (a\tau + b)/(c\tau + d)$ with integral coefficients. More
complicated curves behave in an analogous, but of course more complex way.

The 0-point function on the torus is essentially the partition function of the theory. Since 
energy and momentum are given by linear combinations of the Virasoro field components $L_0$ and
$\bar L_0$, the latter can be defined by

$$\tilde Z = \mathrm{tr} \exp(2\pi i (L_0 \tau - \bar L_0 \bar\tau)),$$

where the trace goes over the vector space F of all fields. The 0-point function on a torus with
parameter $\tau$ has the form

$$Z = \exp(-2\pi i (c\tau - \bar c \bar\tau)/24)\, \tilde Z,$$

where $c$, $\bar c$ are the central extensions of the theory. The prefactor is necessary to get invariance
under the modular group. For the Ising model one obtains

$$Z = \frac{1}{2} \sum_{i=2}^{4} |\theta_i(\tau)/\eta(\tau)|,$$

where the $\theta_i$ are Jacobi’s theta functions and $\eta$ is Dedekind’s function. Note that the scaling
dimension 1/8 of the Ising spin can be read off from $\theta_2$.
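
The modular invariance of the Ising expression can be tested numerically. The following sketch is an independent check, not the author's derivation. It assumes the standard series $\theta_2 = \sum_n q^{(n+1/2)^2}$, $\theta_3 = \sum_n q^{n^2}$, $\theta_4 = \sum_n (-1)^n q^{n^2}$ and $\eta = q^{1/12} \prod_{n \ge 1} (1 - q^{2n})$ with $q = e^{i\pi\tau}$, truncated at a fixed number of terms, and takes the Ising 0-point function to be half the sum of the three moduli $|\theta_i/\eta|$. All names are hypothetical.

```python
import cmath
import math

TERMS = 10  # truncation; ample for the values of tau used below


def theta(i, tau):
    """Jacobi theta constants theta_2, theta_3, theta_4 at modular parameter tau."""
    total = 0j
    for n in range(-TERMS, TERMS + 1):
        if i == 2:
            total += cmath.exp(1j * math.pi * tau * (n + 0.5) ** 2)
        elif i == 3:
            total += cmath.exp(1j * math.pi * tau * n * n)
        elif i == 4:
            sign = 1 if n % 2 == 0 else -1
            total += sign * cmath.exp(1j * math.pi * tau * n * n)
        else:
            raise ValueError("i must be 2, 3 or 4")
    return total


def eta(tau):
    """Dedekind eta function as a truncated product."""
    value = cmath.exp(1j * math.pi * tau / 12)
    for n in range(1, TERMS + 1):
        value *= 1 - cmath.exp(2j * math.pi * tau * n)
    return value


def ising_partition(tau):
    """Half the sum of |theta_i / eta| over i = 2, 3, 4."""
    return 0.5 * sum(abs(theta(i, tau) / eta(tau)) for i in (2, 3, 4))


tau = 0.3 + 0.9j
z_original = ising_partition(tau)
z_shifted = ising_partition(tau + 1)      # tau -> tau + 1
z_inverted = ising_partition(-1 / tau)    # tau -> -1/tau
print(z_original, z_shifted, z_inverted)

# Leading behaviour of the theta_2 term for large Im(tau):
# (1/2)|theta_2/eta| ~ |exp(2 pi i tau)|^(h + h' - c/12) with c = 1/2.
im_tau = 1.5
theta2_term = 0.5 * abs(theta(2, 1j * im_tau) / eta(1j * im_tau))
spin_dimension = math.log(theta2_term) / (-2 * math.pi * im_tau) + 1 / 24
print(spin_dimension)
```

The two transformations tested generate the modular group, so agreement of the three values is a numerical check of the invariance; the last number approximates the scaling dimension 1/8 of the Ising spin.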

For a free complex fermion one obtains

$$Z = \frac{1}{2} \sum_{i=2}^{4} |\theta_i(\tau)/\eta(\tau)|^2.$$

This function arises at the parameter $R = \sqrt{2}$ of the Thirring model partition function

$$Z = |\eta(\tau)|^{-2} \sum_{m,n} \exp(\ldots).$$

Here m, n vary over the integers. The equality of the latter two functions for $R = \sqrt{2}$ is an example
of the fermion-boson equivalence mentioned above. In the gaussian description, only the terms with 
m = n = 0 were obvious, which explains why the model was considered to be uninteresting. 


8 String Theory 

Contrary to the historical developments, we have considered conformal field theory before coming 
to string theory. The reason is that string theory is more complex. Conformal field theory is 
just one ingredient, albeit an essential one. For general introductions to string theory and more 
references, see [Green, Schwarz, Witten 1987] and [Polchinski 1998].

In 1968, Veneziano invented an amplitude for a scattering process with two incoming and two 
outgoing particles which shared several features with strong interaction processes. When a natural 
generalization to arbitrary particle numbers was found, Nambu, Nielsen and Susskind recognized 
that these amplitudes describe a one-dimensional object moving in space. The surface described by 
its motion is called a worldsheet. Its embedding into spacetime is described by functions $X^\mu(\sigma, \tau)$,
where $\sigma$, $\tau$ are coordinates on the worldsheet and $X^\mu$ yields the corresponding spacetime positions.


Calculational problems arise, because there is no canonical parametrization of the worldsheet. 
Some natural choice can be made, however. The causal structure of the ambient spacetime induces 
a causal structure on the worldsheet, with two lightlike tangent directions at each point. These 
directions can be integrated to lightlike curves. One chooses the coordinates such that their 
equations are given by $d\tau = d\sigma$ and $d\tau = -d\sigma$. This introduces a Minkowskian conformal structure
on the $\sigma$, $\tau$ parameter space. One chooses $d\tau$ to be timelike and $d\sigma$ to be spacelike.



Strings have finite spatial extent, such that the range of $\sigma$ is compact. For open strings, the
standard choice is an interval of length $\pi$, for closed strings a circle of circumference $\pi$. No further
natural choices can be made, which means that the worldsheet dynamics is conformally invariant.
In other words, the possible states of a single string are described by a conformal field theory.
When one continues to a euclidean conformal field theory, one must make a Wick rotation in $\tau$,
not in the time coordinate of X. The euclidean coordinate is called z. 

By the analytic continuation, the worldsheet becomes a Riemann surface. Let us consider the
case of closed strings only. Both in Minkowskian and in euclidean space, the worldsheet has the 
topology of a cylinder. By conformal invariance, it can be compactified to a Riemann sphere with 
two special points, one for the incoming and one for the outgoing state. Such special points are 
called punctures. String interactions are introduced by considering arbitrary Riemann surfaces 
with different numbers of punctures. Calculating the scattering of n strings involves three steps. 
First, the string states have to be identified with fields on the worldsheet. Secondly, one has to 
calculate the corresponding n-point functions for all Riemann surfaces with n punctures. The 
surfaces are not necessarily connected, since some groups of strings can interact independently of 
the others. Thirdly, one has to integrate over all of these configurations, in particular over the 
position of the punctures. In addition, one has to integrate over the finite dimensional moduli 
space of complex structures on Riemann surfaces with a given genus (number of handles). This 
integral is not needed when one applies conformal field theory to statistical systems, since there 
the Riemann surface is fixed. 

Finally, one has to sum over the genera. Each term is multiplied by a power of the coupling 
constant. The exponent is proportional to an integral over the curvature, and can be normalized to 
$g - 1$. The leading contribution is given by $g = 0$ and as many connected components as possible.
For vanishing coupling, this leads to a free theory, exactly as for a quantum field theory. Indeed, 
a string theory can be regarded as a quantum field theory which includes graviton fields. In some 
limit, gravity decouples and one obtains a field theory of conventional type. For the latter, the 
perturbation series is a sum over Feynman diagrams. A tubular neighborhood of such a graph 
yields a Riemann surface of some genus g. This allows one to identify one of the field theory couplings
with the string coupling. We shall see that the others correspond to parameters of a conformal
field theory on the string worldsheet.

The sum over g is certainly not convergent, which provides a technical reason to develop non- 
perturbative string theory. A deeper reason is the following. As for quantum field theory, free 
string theory can be considered as a boundary stratum on some moduli space. This stratum is 
characterized by the vanishing of a coupling constant, but in many cases its codimension is larger 
than one. Thus an expansion in the coupling constant cannot recover the full theory. In particular, 
it has no reason to be convergent. One example is given by quantum electrodynamics, where the 
following picture can be conjectured. To get a well defined quantum field theory, one has to 
introduce magnetic monopoles. These become infinitely heavy when the interaction goes to zero 
and their effects are not included in the perturbation expansion. Since monopoles can have an 
electric charge, one has an additional dimension of the moduli space which cannot be captured by 
perturbation theory. 

In string theory, the role of the magnetic monopoles is taken over by branes of various dimen- 
sions. One can approach the full picture by a description of all possible boundary strata, but this 
goes much beyond the scope of the present article. Nevertheless, the reader should keep in mind 
that the following description of conformal worldsheet physics is perturbative and thus incomplete. 

When a string state is described by a field $\phi$ on the worldsheet, the integration over the
corresponding puncture position takes the form $\int \phi(z)\, dz\, d\bar z$. This must make sense independently
of the choice of the coordinate z. In other words, string states are described by primary fields 
of conformal dimensions (1,1). There is another way to get the same result. When the string is 
considered in the background of some particle wave in spacetime, this yields a conformally invariant 
deformation of the theory, at least infinitesimally. Since deformations are described by the primary 
fields of conformal dimensions (1,1), the same must be true for the particle states arising from the
string. With reference to spontaneous symmetry breaking in quantum field theory, the existence
of particle states may be described as a Goldstone phenomenon. 

When one considers strings in flat spacetime, the coordinates of the latter can be regarded 
separately. For a space coordinate $X^i$ appropriate fields are given by $\exp(i p_i X^i(z))$, with arbitrary
$p_i$. With a conventional choice of the length scale, the scaling dimension of this field is $p_i^2/4$.

A new situation appears for the time coordinate $X^0$. Due to Lorentz invariance, the field
$\exp(i p_0 X^0(z))$ has scaling dimension $-p_0^2/4$. Fields with negative scaling dimensions of arbitrary
size do not occur in statistical mechanics, but they can be made to fit in the framework of conformal
field theory. Indeed, without such negative contributions to the scaling dimension, one never would
get an infinite number of particle states. Here we can take an arbitrary field with h = h' and adjust
the value of $p_0$ such that the scaling dimension becomes 2. This produces at least a (1,1) field,
though in general it will not be primary. 

When we disregard the latter problem, we can consider the fields $\partial X^\mu \bar\partial X^\nu \exp(ipX)$. They
have conformal dimensions (1,1) when $p^2 = 0$, such that they describe massless particles. When
one considers their behaviour under spacetime rotations, one sees that they include spin 2 particles, 
i.e. states which behave like gravitons. For general reasons, the coupling of such states must be 
described by Einstein’s theory. Thus any consistent string theory is a theory of quantum gravity. 

Later, this fact was recognized as the best feature of string theory, but it was a nuisance as
long as the theory was supposed to work for the strong interaction. Other problems of the original
string theory had to be solved quite apart from this deeper issue, namely the existence of tachyons
and the wrong dimension of spacetime. 

A tachyon appears when one considers the simple field $\exp(ipX)$. This is a primary (1,1) field, if
$p^2 = -8$. To eliminate this unwanted particle with negative squared mass, the conformal symmetry
had to be extended to a superconformal one. The fields of such theories can have integral or half-integral
conformal spin. Those with a half-integral difference h - h' are fermionic. In addition to
the Virasoro fields one has fields $G$ and $\bar G$ of conformal dimensions (3/2,0) and (0,3/2). There are
two different fermion numbers associated to the holomorphic and anti-holomorphic variables. The
short distance expansion with $G$ changes the first one by one unit, that with $\bar G$ the second one. The
Fourier components $L_n$ of the Virasoro field and those of $G$ together yield a superalgebra, in which
the Virasoro algebra is embedded. The model has two sectors (discovered separately by Ramond
and by Neveu and Schwarz), but we shall consider just the latter. In this sector, the fermionic
fields have half-integral coefficients. Apart from such modifications, superconformal field theory
can be regarded as a special case of conformal field theory, so most of the preceding description 
remains valid. 

Fields related by the action of $G_{1/2}$, $\bar G_{1/2}$ are called superpartners. For superconformal deformations,
the corresponding (1,1) fields must be superpartners of (1/2, 1/2) fields. The physically
relevant deformations are described by bosonic fields, such that the (1/2, 1/2) fields must be
fermionic with respect to both fermion numbers. The superstring still has fields $\exp(ipX)$, which
have conformal dimensions (1/2, 1/2) for $p^2 = -4$, but these fields are of bosonic nature and do not
correspond to physical particles. This elimination of the tachyonic fields is due to Gliozzi, Olive 
and Scherk. 

The issue of the spacetime dimension arose in a different way. When one calculates the norm 
of a field of type $\partial X^\mu$, Lorentz invariance yields a result proportional to $g^{\mu\mu}$. In particular, one
can find negative norms which are incompatible with a probability interpretation. In the 50’s and
60’s much ink had flowed in unsuccessful attempts to make sense out of negative norms and no
one was motivated to try again. Fortunately, Virasoro recognized that not all fields yield physical
states [1970]. The concepts of primary fields and conformal dimensions did not exist yet, but he
found the correct constraints and described them by the Fourier modes of the Virasoro fields.
One year later, Galli obtained the interpretation in terms of conformal invariance [1970]. 

Numerical investigations showed up to a certain degree of complexity that the physically allowed 
fields all have positive norm, but a general proof was difficult to obtain. Then it turned out that
allowed negative norm fields do exist when the spacetime dimension is greater than 26, or 10 for the 
superstring. This made sense of an observation of Lovelace [1971], which had not been taken very 
seriously because it was too outlandish. Looking at Riemann surfaces of torus topology, Lovelace 
had found that the bosonic string theory requires a spacetime of 26 dimensions. Now
it became clear that this number was a deep structural property of the bosonic string theory and
would not go away. Indeed, Brower [1972] and Goddard and Thorn [1972] used the 26 dimensions
to prove that the norms make physical sense (the no-ghost theorem). Later it turned out that the
value of this critical dimension has deep relations to the conformal invariance of the world sheet
physics and the corresponding modular invariance [Brink, Nielsen 1973]. Moreover, Beilinson and 
Manin found out that the strange 26 was closely related to analytic torsion results of Mumford, 
which allowed them to write the measure for the integration over the moduli space of Riemann 
surfaces in a very elegant form [1986]. 

The critical dimension translates into the value $c = \bar c = 26$ of the central extensions. For the
superstring one needs 10 dimensions and $c = \bar c = 15$. The latter value is due to the superpartners of
the 10 coordinates which contribute half as much to the central extension. The simplest way to
obtain a model in four dimensions is the old Kaluza-Klein idea. One just wraps up all superfluous
dimensions in a small circle. For the bosonic string this yields 22 copies of the Thirring model.
The corresponding 44 currents of type $j$ and $\bar j$ yield 44 photons, all with separate interactions of
electromagnetic type. The values of $c$, $\bar c$ do not change. Obviously, this model is not particularly
realistic. It exemplifies, however, that the spacetime dimension of the model can be changed at will, 
as long as one keeps conformal invariance and the correct central extensions. For the superstring, 
similar remarks apply. 

To write down a general bosonic string model in four dimensions, one just needs to replace 
the 22 copies of the Thirring model by an arbitrary conformal field theory with $c = \bar c = 22$.
The latter is called the internal conformal theory. In analogy to the Kaluza-Klein case, one
still says that it describes 22 compactified dimensions, even if this is not always a geometrically
correct interpretation. To compactify the superstring to four-dimensional spacetime, one needs six
compactified dimensions and $c = \bar c = 9$ instead. Every possible compactification corresponds to
a theory in a space of less than 10 spacetime dimensions. In this way one gets, e.g., an encoding 
of four-dimensional quantum field theories by conformal or superconformal field theories in two 
dimensions. 
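
The counting of central extensions in the preceding paragraphs can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and it assumes, as stated above, that each coordinate contributes 1 to the central extension and each fermionic superpartner half as much.

```python
from fractions import Fraction

def critical_central_charge(dimensions, supersymmetric):
    """Central extension carried by `dimensions` spacetime coordinates.

    Assumption: each coordinate contributes 1, each superpartner 1/2.
    """
    per_coordinate = Fraction(3, 2) if supersymmetric else Fraction(1)
    return dimensions * per_coordinate

def internal_central_charge(total_dimensions, spacetime_dimensions, supersymmetric):
    """Central extension left over for the internal theory when only
    `spacetime_dimensions` coordinates remain uncompactified."""
    return (critical_central_charge(total_dimensions, supersymmetric)
            - critical_central_charge(spacetime_dimensions, supersymmetric))

bosonic_total = critical_central_charge(26, supersymmetric=False)
super_total = critical_central_charge(10, supersymmetric=True)
bosonic_internal = internal_central_charge(26, 4, supersymmetric=False)
super_internal = internal_central_charge(10, 4, supersymmetric=True)
print(bosonic_total, super_total, bosonic_internal, super_internal)
```

With these assumptions the four printed numbers reproduce the values 26, 15, 22 and 9 quoted in the text.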

In particular, consider a field $\phi(z)\exp(iXp)$, where $\phi$ belongs to the internal theory. When one
adjusts to get overall conformal dimensions (1,1), one sees that for a particle state of mass $m$ the
corresponding field must have contributions $h = \bar h = 1 + m^2/8$ from the internal conformal theory.
Of particular interest are the internal (1,1) fields, which correspond to massless Higgs bosons. 
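
As a small numerical illustration of the relation between mass and internal conformal dimension, the sketch below evaluates $h = 1 + m^2/8$ and its inverse. The function names are hypothetical. The second example, a trivial internal field of dimension 0, is an inference from the formula rather than a statement taken from the text.

```python
def internal_dimension(mass_squared):
    """Conformal dimension h (equal to h-bar) required from the internal
    theory for a particle of the given mass squared: h = 1 + m^2/8."""
    return 1 + mass_squared / 8

def mass_squared(internal_h):
    """Inverse relation: m^2 = 8 (h - 1)."""
    return 8 * (internal_h - 1)

massless = mass_squared(1.0)   # internal (1,1) field
tachyon = mass_squared(0.0)    # trivial internal field of dimension 0 (inferred case)
print(massless, tachyon)
```

An internal (1,1) field gives a vanishing mass, in line with the massless Higgs bosons mentioned above, while an internal dimension below 1 gives a negative mass squared.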

When the conformal theory includes an affine Kac-Moody algebra with holomorphic currents 
$j^a$ of conformal dimensions (1,0), the fields $j^a\,\bar\partial X^\mu \exp(iXp)$ with $p^2 = 0$ describe the quanta of
a vector potential belonging to the corresponding finite dimensional gauge group. Thus the
states of the string now include non-abelian gauge fields, and the theory starts to look a bit more 
like the standard model. In the superstring theory, one also gets fermions. Their interactions with 
the Higgs bosons and the gauge fields are of standard type, though one has not yet managed to 
obtain precisely the standard model. 

Of course, the bosonic model always will have the tachyonic $\exp(ipX)$ fields and cannot be
used by itself. Nevertheless, the bosonic string can be used for either the holomorphic or the
anti-holomorphic coordinates. To get rid of the tachyon, it is indeed sufficient to use a field $\bar G$ but
no $G$. This yields models with $c = 26$ but $\bar c = 15$, called heterotic string models. They were found
by Gross, Harvey, Martinec and Rohm. Heterotic strings do not have a pure spacetime version,
since the spacetime contributions to $c$ and $\bar c$ have to match. The archetypal heterotic string lives
in 10 dimensions, where the compactified part is purely holomorphic, with c = 16. There are many 
constraints on purely holomorphic conformal theories which exist on arbitrary Riemann surfaces. 
In particular, c must be a multiple of 8. For c = 8, the only example is the affine Kac-Moody 
algebra based on $E_8$. For c = 16, one either can take the tensor product of two c = 8 models or use
the affine Kac-Moody algebra based on SO(32). For c = 24, there are 71 possibilities [Schellekens
1993]. One of them has a remarkable symmetry group of about $10^{54}$ elements, the Fischer-Griess
monster [Borcherds 1986]. This closeness of string theory to beautiful exotic structures is still a
deep mystery. To get to four dimensions, one needs a less exotic internal conformal field theory
with $c = 22$ and $\bar c = 9$.

The late discovery of the heterotic string was due to the fact that string research slowed down 
a lot after 1974. At that time it had become clear that QCD is a better theory for the strong 
interaction. Though Scherk and Schwarz had shown that one could reinterpret string theory as a 
theory of quantum gravity [1974], there was not very much support for such an arcane research 
line. One of the ideas which appeared shortly before the theory entered a long hibernation period 
was the classification of the possible rational theories by modular forms. In particular, this yielded 
a candidate which later would be interpreted as the partition function of the compactified part of 
the heterotic string [Nahm 1977]. Unfortunately, present mathematical techniques only allow one to
apply this procedure to rational conformal field theories, for which the partition function can be
written as a finite sum $Z = \sum_i Z_i \bar Z_i$ with holomorphic functions $Z_i$ and anti-holomorphic functions
$\bar Z_i$. Nevertheless, it was striking that the method indicated an incredibly large number of possible
theories, in stark contrast to the initial hopes that one was heading for something unique. At 
present, the situation has not changed very much. There are vague hopes that non-perturbative 
string theory will select particular models, but it also is possible that one will end up on a moduli 
space with more parameters than in the standard model. For every choice of parameters, one will 
have a quantum version of Einstein’s gravity theory, however. 

The bold switch by Scherk and Schwarz from the interpretation of string theory as a theory of the strong
interaction to a theory of quantum gravity must have been one of the strangest events in
the history of physics. In particular, the basic distance scale had to be changed by twenty orders
of magnitude, from the proton diameter to the Planck length. For mathematicians this should be 
easier to digest than for physicists, since no change in the mathematical structure is involved. For 
physicists, however, the emergence of string theory now appears as an accident. It would not have 
happened if the discovery of the SU(3) gauge interaction of the quarks had come a few years earlier. 
Even in hindsight, one sees no way in which a direct study of quantum gravity could have led to this
theory. Indeed, a direct attack has been tried from several points of view, but with very limited 
success. It seems that one cannot unify quantum theory and gravitation without incorporating 
much knowledge about other interactions. 

In any case, present research on quantum gravity cannot follow the traditional pattern of 
physics. One hundred years ago, Planck himself estimated its characteristic length scale by combining
Newton’s constant, the speed of light, and his new quantum of action. He found $4.13 \cdot 10^{-33}$ cm
[1900]. At that time, physicists and chemists were getting the first precise ideas about what happens
at $10^{-8}$ cm, so Planck must have felt like looking into an abyss. Of the 25 orders of magnitude
to be covered we now have explored not quite ten, thus a naive extrapolation predicts another
150 years before we really understand what is going on at the basic scale. Planck’s report about
his discovery is brief and sober. Nevertheless, he states that the units he found would keep their
meaning for all times and all cultures, including extraterrestrial and non-human ones.
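
Planck's combination of constants is easy to reproduce. The sketch below is an illustration under stated assumptions: it uses modern SI values for the constants (not the values available to Planck), the helper name is hypothetical, and it evaluates the length both with the quantum of action $h$, as Planck did, and with $h/2\pi$, the convention common today.

```python
import math

# Modern SI values, assumed here for illustration (not taken from the text).
PLANCK_H = 6.62607015e-34      # quantum of action h, J s
NEWTON_G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
LIGHT_C = 2.99792458e8         # speed of light, m/s

def planck_length_cm(action):
    """Length sqrt(action * G / c^3), converted from metres to centimetres."""
    return 100.0 * math.sqrt(action * NEWTON_G / LIGHT_C**3)

length_with_h = planck_length_cm(PLANCK_H)
length_with_hbar = planck_length_cm(PLANCK_H / (2 * math.pi))
# Orders of magnitude between an atomic scale of 1e-8 cm and this length.
orders_from_atomic_scale = math.log10(1e-8 / length_with_h)
print(length_with_h, length_with_hbar, orders_from_atomic_scale)
```

With $h$ the result is a few times $10^{-33}$ cm, close to the figure Planck reported, and the gap to the atomic scale comes out at roughly 24 to 25 orders of magnitude.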

Without the ability to do experiments in quantum gravity, it is hard to know if theoretical 
investigations are on the right track. Everyone who keeps trying is inspired by Einstein’s success 
with general relativity. He had little experimental input, but relied on his keen sense for structure 
and mathematical beauty. His belief in the harmony of the spheres was as deep as Kepler’s, and 
when he had found an indication of congruence between nature and a mathematical structure he 
did everything to uncover it fully. Quantum mechanics remained a jarring note, like the irrational
numbers to the early Pythagoreans. Thus string theory would have disappointed Einstein as far 
as quantum physics is concerned. But if it is correct, it justifies some of his attempts in the search 
for a unified theory. On one hand, he wanted to generalize the metric tensor $g_{\mu\nu}$ to an object with
an antisymmetric part. String theory has such an object, called the B field. Together with the
metric tensor it is obtained from the fields $\partial X^\mu \bar\partial X^\nu \exp(ipX)$ which have been considered above.
Einstein also was right in his high regard for the Kaluza-Klein approach. 

Einstein’s example can be used as an encouragement and as a warning. His successful gravity 
theory was based on at least one elementary fact which no one else could explain - the equality of 
inertial and gravitational mass. When he let go of such guidance, he still did important research,
but went astray. String theory does not do too badly on this account. On one hand, one can at 
least come close to the standard model. Moreover, superstring theory at least suggests that the 
experimentalists will find supersymmetry in the near future. The theory develops in a search for 
deep and beautiful structures, but it has the advantage of holding on to the tenuous guide offered 
by low energy experiments. Currently, no other theory of quantum gravity can make such claims. 
Despite its unbelievable origin, string theory is by far the most promising approach to unify all of 
the known interactions. The one possible exception to the latter claim is the cosmological constant,
since it is separated by another abyss of many orders of magnitude from the rest of physics. 


9 Missed And Open Opportunities 


Let us come back to Dyson’s 1972 address to the American Mathematical Society. It was titled 
’Missed Opportunities’ and concerned problems in the communication between mathematicians 
and physicists. In particular, he considered quantum field theory and the unification with gravity, 
but his first example concerned a communication problem between Dyson the physicist and Dyson 
the mathematician. As a mathematician, he had played around with powers of Dedekind’s $\eta$-function
and obtained nice identities for the exponents 3,8,10,14,15,21,24,26,28,35,36,... In this
combinatorial context he did not recognize the dimensions of the simple Lie groups, plus 26,
which would have been evident to him in a physics context. From today’s point of view, the
regret about this little failure may have caused him to miss a greater opportunity. He must have 
heard about the incredible 26 dimensions of the bosonic string, which Lovelace had found the year 
before, but apparently thought little about this coincidence. Otherwise he would have stumbled 
on the importance of Dedekind’s $\eta$-function in string theory. Two years later Scherk and Schwarz
established the importance of string theory for the unification of quantum field theory and gravity. 
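
The coincidence Dyson missed can be verified directly. The following sketch (function and variable names are hypothetical) lists the dimensions of the simple Lie groups up to 36 from the standard formulas for the classical series and the exceptional groups, and compares them with the exponents quoted above.

```python
def simple_lie_dimensions(limit):
    """Dimensions of the simple Lie groups up to `limit`, from the
    classical series A_n, B_n, C_n, D_n and the exceptional groups."""
    dims = set()
    n = 1
    while n * (n + 2) <= limit:
        dims.add(n * (n + 2))            # A_n = SU(n+1)
        if n >= 2:
            dims.add(n * (2 * n + 1))    # B_n = SO(2n+1); C_n has the same dimension
        if n >= 4:
            dims.add(n * (2 * n - 1))    # D_n = SO(2n)
        n += 1
    dims.update({14, 52, 78, 133, 248})  # G2, F4, E6, E7, E8
    return sorted(d for d in dims if d <= limit)

dyson_exponents = [3, 8, 10, 14, 15, 21, 24, 26, 28, 35, 36]
lie_dimensions = simple_lie_dimensions(36)
unexplained = [e for e in dyson_exponents if e not in lie_dimensions]
print(lie_dimensions, unexplained)
```

Every exponent in the list is the dimension of a simple Lie group except 26, which is the one left over.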


Many of the present author’s missed and taken opportunities also concern the interaction
with mathematics. A very rapid course by D. Zagier led to a classification of string theories by
modular functions. On the other hand, searching the CERN library for books discussing infinite
dimensional Lie algebras was a frustrating enterprise. Even worse was the inability to find the
dimensions of the next representations of $E_8$, when the classification yielded $q^{-1/3} + 248\,q^{2/3} + \ldots$.
Unfortunately, the visit of Kac to CERN came far too late, but to the author it proved the value 
of an environment where physicists and mathematicians could make the effort to learn about their
respective discoveries. Princeton and some other places made a good start, but it would be nice 
to have a few more. Here are some problems which may be tackled in such an environment. 
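
The coefficient 248 in the series mentioned above, and the ones following it, can be generated by computer. The sketch below is an illustration under an assumption not spelled out in the text: it identifies the series with the standard modular function $E_4/\eta^8$, where $E_4$ is the Eisenstein series of weight 4. All function names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def eisenstein_e4(order):
    """Coefficients of E_4 = 1 + 240 * sum_{n>=1} sigma_3(n) q^n."""
    return [1] + [240 * sigma3(n) for n in range(1, order + 1)]

def inverse_euler_power(power, order):
    """Coefficients of prod_{n>=1} (1 - q^n)^(-power) up to q^order."""
    coeffs = [1] + [0] * order
    for n in range(1, order + 1):
        for _ in range(power):
            # Multiply the truncated series in place by 1/(1 - q^n).
            for m in range(n, order + 1):
                coeffs[m] += coeffs[m - n]
    return coeffs

def e8_character_coefficients(order):
    """Coefficients a_k in q^(-1/3) * sum_k a_k q^k = E_4 / eta^8."""
    e4 = eisenstein_e4(order)
    inv = inverse_euler_power(8, order)
    return [sum(e4[i] * inv[k - i] for i in range(k + 1))
            for k in range(order + 1)]

coefficients = e8_character_coefficients(3)
print(coefficients)
```

The second coefficient is the dimension 248 of $E_8$. Under the stated assumption, the later coefficients count the states at the higher levels, which is the information that was hard to find at the time.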

In the Kaluza-Klein formalism, a fifth dimension is hiding because it is compactified to a 
circle. When one considers experiments at fixed energy and makes the period of the compactified
dimension very large, the five dimensional geometry emerges again. This process can be generalized 
- taking suitable limits of quantum field theories one can obtain classical geometries. In this 
context, the latter are called target spaces. 

Let us look at the Kaluza-Klein situation in the context of string theory. We have considered 
fields $\exp(iXp)$, where for simplicity we consider a single position component $X$. When it is
compactified with period $l$, the choice of $p$ is constrained by $\exp(ilp) = 1$, or $p = 2\pi n/l$ with
integral $n$. The scaling dimension of such a field is proportional to $(n/l)^2$, thus small for large
$l$ and small $n$. In particular, the integer $h - \bar h$ has to vanish. The short distance expansion of
such fields involves weak singularities only. In the limit where $l$ becomes large, it reduces to the
ordinary product $\exp(iXp_1)\exp(iXp_2) = \exp(iX(p_1 + p_2))$. Because of scale invariance, the large
$l$ limit produces a unique commutative algebra of all fields whose scaling dimension approaches
zero. 
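
The quantization of the momentum and the behaviour of the scaling dimension for large period can be made concrete as follows. This is a sketch with hypothetical function names, and the normalization $h = \bar h = p^2/8$ is an assumption chosen to match the mass formula used earlier; only the proportionality to $(n/l)^2$ is taken from the text.

```python
import math

def allowed_momentum(n, period):
    """Momentum p = 2*pi*n/l compatible with exp(i*l*p) = 1."""
    return 2 * math.pi * n / period

def scaling_dimension(n, period):
    """Scaling dimension h + h-bar = p^2/4, assuming h = h-bar = p^2/8."""
    p = allowed_momentum(n, period)
    return p * p / 4

# Multiplying exp(iXp1) and exp(iXp2) adds the integer labels n1 and n2.
p_sum = allowed_momentum(2, 10.0) + allowed_momentum(3, 10.0)
# The dimension of the n = 1 field shrinks like 1/l^2 as the period grows.
dims = [scaling_dimension(1, period) for period in (1.0, 10.0, 100.0)]
print(p_sum, dims)
```

The sum of two allowed momenta is again allowed, and increasing the period tenfold lowers the scaling dimension a hundredfold.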

In the classical limit, every smooth function on the Kaluza-Klein circle can be Fourier expanded 
with basis $\exp(iXp)$. Thus one obtains the algebra of all smooth functions on the circle. Moreover,
the space of these functions is graded by the scaling dimension $2h$, which is the eigenvalue of the
Laplace operator. The geometry of the circle can easily be reconstructed from this information. 
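
For the circle the reconstruction is a one-line computation. The sketch below uses hypothetical function names and normalizes the Laplace eigenvalue of the basis function with label $n$ as $(2\pi n/l)^2$, which is an assumption about normalization. It recovers the period from the smallest non-zero eigenvalue.

```python
import math

def laplace_spectrum(period, n_max):
    """Eigenvalues (2*pi*n/l)^2 of -d^2/dX^2 on exp(2*pi*i*n*X/l), n = 0..n_max."""
    return [(2 * math.pi * n / period) ** 2 for n in range(n_max + 1)]

def period_from_spectrum(eigenvalues):
    """Recover the period from the smallest non-zero eigenvalue."""
    lowest = min(value for value in eigenvalues if value > 0)
    return 2 * math.pi / math.sqrt(lowest)

recovered = period_from_spectrum(laplace_spectrum(7.5, 5))
print(recovered)
```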

This example can be generalized to all kinds of manifolds. Particularly attractive are Calabi- 
Yau manifolds, where one can work with the highly constrained superconformal theories. Much less 
is known about these manifolds than about the circle, for example about their Einstein metrics. 
In these cases, the conformal theories may be easier to control than their classical limits. One 
certainly may hope to obtain the algebra of smooth functions and the corresponding eigenvalues 
of the Laplace operator from the conformal data. 

When the parameters of a conformal field theory are varied, one may obtain quite different
classical limits. In particular, a connected moduli space may have several different boundary 
components. In this way, it is possible to relate different classical geometries by non-classical 
paths. As a simple example, consider again a string on the Kaluza-Klein circle. There are more 
fields than we have considered so far, since one can wind the string around the circle. When the 
circle is large, this yields particle states of large mass. When the period $l$ becomes very small,
winding costs hardly any energy, whereas the $\exp(iXp)$ fields have large scaling dimensions and
describe particles of large mass. When $l$ goes to zero, the short distance expansion of the basic
winding fields is just given by the additive group of the winding numbers. In this way, a Kaluza-Klein
theory with period $l$ becomes isomorphic to one with period $l^{-1}$. This isomorphism is known
as T-duality. It is one of many dualities which arise in string theory, so the name ’dual model’ used 
around 1970 was quite prescient. Perhaps the most famous of the dualities is mirror symmetry, 
which is a specific property of conformal field theories with a high degree of supersymmetry. 
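
The exchange of momentum and winding under T-duality can be checked numerically. The sketch below is illustrative only. It works with the radius $R = l/2\pi$ and assumes a common textbook normalization in which the dual radius is $2/R$; this normalization, the formulas for the left and right momenta, and the function names are assumptions, not taken from the text.

```python
import math

def dimensions(n, w, radius):
    """(h, h-bar) for momentum number n and winding number w.

    Assumed normalization (not taken from the text):
    p_L = n/R + w*R/2, p_R = n/R - w*R/2, h = p_L^2/2, h-bar = p_R^2/2.
    """
    p_left = n / radius + w * radius / 2
    p_right = n / radius - w * radius / 2
    return p_left**2 / 2, p_right**2 / 2

def spectrum(radius, cutoff=3):
    """Sorted list of h + h-bar for |n|, |w| <= cutoff."""
    labels = range(-cutoff, cutoff + 1)
    return sorted(sum(dimensions(n, w, radius)) for n in labels for w in labels)

radius = 1.7
dual_radius = 2 / radius   # T-dual radius in this normalization
same = all(math.isclose(a, b, abs_tol=1e-12)
           for a, b in zip(spectrum(radius), spectrum(dual_radius)))
h, h_bar = dimensions(2, 3, radius)
print(same, h - h_bar)
```

The two spectra agree once momentum and winding numbers are exchanged, and the conformal spin $h - \bar h$ equals the integer product of the two numbers, so it vanishes for pure momentum or pure winding fields.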

The winding fields are examples of solitonic objects, since the winding number is time-independent. 
In the euclidean conformal theory, one also has instanton contributions given by a map of the string 
Riemann surface to the target space. The most studied case is the one of embeddings of Riemann 
spheres in algebraic target manifolds, since mathematicians have been much interested in counting
the number of different embeddings. As shown by Candelas’ group, mirror symmetry yields the
correct numbers for the quintic in four dimensional projective space. This started the huge interest
of mathematicians in this topic. Usually, mathematicians try to replace the quantum field theoretic 
approach by more classical methods, but in the end this may well turn out to be the more arduous 
approach, quite comparable to a proof of the prime number theorem without using analysis. 

The moduli space of superconformal theories of fixed central extensions seems to be connected. 
For $c = \bar c = 9$ each Calabi-Yau manifold of three complex dimensions is a possible target space and
yields one boundary component of the moduli space. By following all ramifications of the moduli
space it should be possible to classify all such Calabi-Yau manifolds, for a start.

In modern algebraic geometry, geometric and number theoretic problems occur side by side. 
The same is true of conformal field theory, though physicists so far are ill equipped to handle 
these issues. For example, what is the meaning of Dyson’s formula for $\eta^{26}$? For the moment, this
is a mystery without much of a clue, but one can start with simpler problems. Above, we have 
discussed rational points in the moduli space of string theories. At these points, coupling constants 
like the Yukawa couplings can be calculated explicitly. Usually, the rational models are among the 
few for which such calculations are possible at present. For example, let us take models with $c = \bar c = 6$
and sufficiently large supersymmetry. In this case one obtains K3 surfaces as target space. The
moduli space turns out to have 80 dimensions, but the largest submanifold under explicit control
has just 16 dimensions. In addition, however, there are many rational points sprinkled around
which are perfectly well understood. Will the rational points turn out to be densely distributed? 
More specifically, the same moduli space occurs for torus compactifications of the heterotic string 
to six-dimensional flat spacetime. In the latter case the rational points are well known, and they 
correspond precisely to the complex multiplication points of the K3 moduli space. Are those the 
rational points of the latter moduli space? 

In simple examples, the conformal dimensions of rational models are obtained by applying 
the dilogarithm function to algebraic numbers. The corresponding sums are described by torsion 
elements in the Bloch group [Nahm, Recknagel, Terhoeven 1993]. Thus there is a link between 
conformal field theory and one of the most active areas of present mathematical research. In 
particular, there seem to be relations to Grothendieck’s program for a description of the Galois
group of all algebraic numbers and to the theory of motives. Kontsevich recently conjectured that 
the motivic Galois group acts on the moduli space of conformal field theories [1999]. 

Obviously, mathematicians have much to gain from physics. In view of the higher reliability of 
the answer (and regarding costs as irrelevant) physicists were more inclined to ask nature than to 
ask a mathematician. Quite typically, Schweber’s textbook concludes with the following sentences: 
”In the final analysis, however, it will probably be the new information that will be obtained from 
the high energy machines and colliding beam machines to go into operation in the next few years 
which will help unravel the puzzle of the elementary particles and their interactions. In particular, 
we may discover whether the notions of space and time upon which present-day field theories are 
based are in fact valid.” But meanwhile we have learned more respect for the sixteen orders of 
magnitude which separate us from the Planck scale. If it gets too expensive to ask the direct 
questions, we just have to push the mathematical analysis of what little clues there are. There 
is hope, since sometimes it did work. Kepler managed to extract the secrets of the planetary 
motions from pretelescopic data, but it would have been much harder without some knowledge 
about ellipses. 

As a final encouragement for those willing to use the bridge, let me quote a German poet: 
”Nur Beharrung führt zum Ziele, nur die Fülle führt zur Klarheit und im Abgrund wohnt die
Wahrheit” (to reach the goal you must be persistent, to see clearly you have to understand a 
wealth of phenomena, and truth lives in the abyss). Schiller’s poem talks about causal time and 
three-dimensional space, but two euclidean dimensions make a good start. 


References 

[1] M.F. Atiyah, N.J. Hitchin, I.M. Singer, Self-duality in four-dimensional Riemannian geometry, 
Proc.Roy. Soc.L. A 362, p. 425, 1978 

[2] H. Bateman, The transformation of the electrodynamic equations, Proc. London Math. Soc. 
8, p. 223, 1910 

[3] E. Bessel-Hagen, Über die Erhaltungssätze der Elektrodynamik, Math. Ann. 84, p. 258, 1921

[4] A.A. Beilinson, Yu.I. Manin, The Mumford form and the Polyakov measure in string theory,
Comm.Math.Phys. 107, p. 359, 1986

[5] A.A. Belavin, A.M. Polyakov, A.B. Zamolodchikov, Infinite conformal symmetry in two- 
dimensional quantum field theories, Nucl.Phys. B241, p. 333, 1984 

[6] L. Brink, H.B. Nielsen, A physical interpretation of the Jacobi imaginary transformation and 
the critical dimension in dual models, Phys.Lett. 43B, p. 319, 1973 

[7] R.C. Brower, Spectrum-generating algebra and no-ghost theorem for the dual model, 
Phys.Rev. D6, p. 1655, 1972 

[8] N. Bohr, L. Rosenfeld, Zur Frage der Messbarkeit der elektromagnetischen Feldgrössen,
Dan.Math.Fys.Medd. 12, Nr. 8, 1933 

[9] R.E. Borcherds, Vertex algebras, Kac-Moody algebras and the monster, Proc.Nat.Acad.Sci. 
USA 83, p. 3068, 1986 

[10] C.G. Callan, S. Coleman, R. Jackiw, A new improved energy-momentum tensor, Ann.Phys. 
59, p. 42, 1970 

[11] C.G. Callan, R.F. Dashen, D.H. Sharp, Solvable two-dimensional field theory based on cur- 
rents, Phys. Rev. 165, p. 1883, 1968 

[12] A. Cappelli, C. Itzykson, J.-B. Zuber, The A-D-E classification of minimal and $A_1^{(1)}$ conformal
invariant theories, Comm.Math.Phys. 113, p. 1, 1987 

[13] S. Coleman, Quantum sine-Gordon equation as the massive Thirring model, Phys.Rev. D11,
p. 2088, 1975

[14] R.P. Crease and C.C. Mann, The Second Creation, Macmillan, New York 1987 

[15] E. Cunningham, The principle of relativity in electrodynamics and an extension thereof, Proc. 
London Math. Soc. 8, p. 77, 1910 

[16] P. di Francesco, P. Mathieu, D. Senechal, Conformal Field Theory, New York 1997 

[17] H.P. Dürr, W. Heisenberg, H. Mitter, S. Schlieder, K. Yamazaki, Zur Theorie der Elementarteilchen,
Zeits. Naturfor. 14a, p. 441, 1959

[18] F.J. Dyson, The radiation theories of Tomonaga, Schwinger, and Feynman, Phys.Rev. 75, p. 
486, 1949 

[19] F.J. Dyson, Missed opportunities, Bull. AMS, p. 635, 1972 

[20] S. Ferrara, R. Gatto, A.F. Grillo, Conformal algebra in two space-time dimensions and the
Thirring model, Nuovo Cim. 12A, p. 959, 1972 

[21] D. Friedan, Z. Qiu, S. Shenker, Conformal invariance, unitarity and critical exponents in two 
dimensions, Phys.Rev.Lett. 52, p. 1575, 1984 

[22] A. Galli, Conformal invariance in the dual symmetric theory of hadrons, Nuovo Cim. 69A, p. 
275, 1970 

[23] I.M. Gelfand, D.B. Fuks, Cohomologies of the Lie algebra of the vector fields on the circle, 
Funct.Anal.Appl. 2, p. 342, 1968

[24] M. Gell-Mann, Symmetries of baryons and mesons, Phys. Rev. 125, p. 1067, 1962 

[25] P. Ginsparg, Curiosities at c=1, Nucl.Phys. B295, p. 153, 1988

[26] P. Goddard, Meromorphic conformal field theory, in: Infinite Dimensional Lie Algebras and 
Lie Groups, V.G. Kac ed., Adv.Ser.Math.Phys. 7, p. 556, World Scientific, 1989

[27] P. Goddard, C.B. Thorn, Compatibility of the dual Pomeron with unitarity and the absence 
of ghosts in the dual resonance model, Phys.Lett. 40B, p. 235, 1972 

[28] P. Goddard, D. Olive, eds., Kac-Moody and Virasoro Algebras, World Scientific, Singapore 
1988 

[29] M. Green, J. Schwarz, E. Witten, Superstring Theory, Cambridge University Press 1987 

[30] F. Gürsey, On a conform-invariant spinor wave equation, Nuovo Cim. 3, p. 988, 1956

[31] R. Haag, On quantum field theory, Dan.Math.Fys.Medd. 20, p. 12, 1955 

[32] E.M. Henley, W. Thirring, Elementary Quantum Field Theory, McGraw-Hill, New York 1962 

[33] W.V.D. Hodge, The Theory and Application of Harmonic Integrals, Cambridge Univ. Press 
1941 

[34] C. Itzykson, H. Saleur, J.-B. Zuber, eds., Conformal Invariance and Applications to Statistical
Mechanics, World Scientific, Singapore 1988 

[35] K. Johnson, Solution of the equation for the Green’s functions of a two dimensional relativistic 
field theory, Nuovo Cim. 20, p. 773, 1961 

[36] V.G. Kac, Contravariant form for infinite dimensional Lie algebras and superalgebras, Lecture 
Notes in Physics 94, p. 441, Springer 1979 

[37] L.P. Kadanoff, Scaling laws for Ising models near Tc, Physics 2, p. 263, 1966 

[38] L.P. Kadanoff, Operator algebra and the determination of critical indices, Phys.Rev.Lett. 23, 
p. 1430, 1969 

[39] O. Klein, On the theory of charged fields, in: New Theories in Physics, Warsaw 1938, Proc.,
Nyhoff, The Hague 1939 

[40] M. Kontsevich, Operads and motives in deformation quantization, math.QA/9904055 

[41] J. Lepowsky, R.L. Wilson, Construction of the affine Lie algebra $A_1^{(1)}$, Comm.math.Phys. 62,
p. 43, 1978 

[42] C. Lovelace, Pomeron form factors and dual Regge cuts, Phys.Lett. 34B, p. 500, 1971 

[43] J. Maldacena, The large N limit of superconformal field theories and supergravity, 
Adv.Theor.Math.Phys. 2, p. 231, 1998 

[44] T. Matsubara, A new approach to quantum-statistical mechanics, Prog.Theor.Phys. 14, p. 
351, 1955 

[45] W. Nahm, Spin in the spectrum of states of dual models, Nucl.Phys. B120, p. 125, 1977 

[46] W. Nahm, A. Recknagel, M. Terhoeven, Dilogarithm identities in conformal field theory, 
Mod.Phys.Lett. A8, p. 1835, 1993

[47] Y.J. Ng, Julian Schwinger, the Physicist, the Teacher, and the Man, World Scientific,
Singapore 1996

[48] E. Noether, Invarianten beliebiger Differentialausdrücke, Nachr. d. Göttinger Akad. d. Wiss.
1918, p. 235 





[49] L. Onsager, A two-dimensional model with an order-disorder transition, Phys.Rev. 65, p. 117, 
1944 

[50] K. Osterwalder, Euclidean Green’s functions and Wightman distributions, in: Constructive 
quantum field theory, Lecture Notes in Physics 25, Springer 1973 

[51] A. Pais, Inward Bound, Oxford University Press, 1986 

[52] W. Pauli, Über die Invarianz der Dirac’schen Wellengleichungen gegenüber
Ähnlichkeitstransformationen des Linienelementes im Fall verschwindender Ruhmasse,
Helv.Phys.Acta 13, p. 204, 1940 

[53] M. Planck, Über irreversible Strahlungsvorgänge, Annalen d. Physik 1, p. 69, 1900

[54] J. Polchinski, String Theory, Cambridge University Press 1998 

[55] A.M. Polyakov, Conformal symmetry of critical fluctuations, ZheTF Pis. Red. 12, p. 538, 1970 

[56] A. Pressley, G. Segal, Loop Groups, Clarendon Press, Oxford 1986 

[57] K.-H. Rehren, Algebraic holography, hep-th/9905179 

[58] A. Salam, The field theory of superconductivity, Prog.Theor.Phys. 9, p. 550, 1953 

[59] A.N. Schellekens, On the classification of meromorphic c=24 conformal field theories, Theor. 
Math.Phys. 95, p. 632, 1993 

[60] J. Scherk and J.H. Schwarz, Dual models for non-hadrons, Nucl. Phys. B81, p. 118, 1974

[61] S. Schweber, An Introduction to Relativistic Quantum Field Theory, Harper & Row, New 
York 1964 

[62] J. Schwinger, ed., Quantum Electrodynamics, Dover, New York 1958 

[63] J. Schwinger, Euclidean quantum electrodynamics, Phys. Rev. 115, p. 721, 1959 

[64] J. Schwinger, Renormalization theory of quantum electrodynamics: An Individual View, in: 
The Birth of Particle Physics, Fermi Lab 1980, Proc., L.M. Brown, L. Hoddeson eds.,
Cambridge University Press 1983

[65] G. Segal, Unitary representations of some Infinite dimensional groups, Comm.math.Phys. 80, 
p. 301, 1981 

[66] G. Segal, The definition of conformal field theory, in: Differential Geometrical Methods in 
Theoretical Physics, Como 1987, Proc., K. Bleuler and M. Werner eds., p. 165, NATO ASI 
Series 250, 1988 

[67] T.H.R. Skyrme, A non-linear theory of strong interactions, Proc. R.Soc. A247, p. 260, 1958; 
A262, p. 237, 1961 

[68] R.F. Streater, A.S. Wightman, PCT, Spin and Statistics, and All That, Benjamin, New York 
1964 

[69] W.E. Thirring, A soluble relativistic field theory, Ann.Phys. 3, p. 91, 1958 

[70] M.A. Virasoro, Subsidiary conditions and ghosts in dual resonance models, Phys. Rev. D1, p.
2933, 1970 

[71] M.A. Virasoro, Spin and unitarity in dual resonance models, in: Duality and Symmetry in 
Hadron Physics, Proc., E. Gotsman ed., Tel Aviv 1971

[72] J. Wess, The conformal invariance in quantum field theory, Nuovo Cim. 18, p. 1086, 1960 





[73] K.G. Wilson, Non-Lagrangian models of current algebra, Phys. Rev. 179, p. 1499, 1969 

[74] K.G. Wilson, Anomalous dimensions and the breakdown of scale invariance in perturbation 
theory, Phys. Rev. D2, p. 1478, 1970 

[75] E. Witten, Anti de Sitter space and holography, Adv.Theor.Math.Phys. 2, p. 253, 1998 

[76] C.N. Yang, Einstein and his Impact on the physics of the second half of the twentieth century, 
in: M. Grossmann Meeting on General Relativity, 2nd, Trieste 1979, Proc., R. Ruffini ed., 
North-Holland 1982 

[77] A.B. Zamolodchikov, Infinite additional symmetries in two-dimensional conformal quantum 
field theories, Theor.Math.Phys. 63, p. 1205, 1985 



23. Superstring Theory — An Overview 


John H. Schwarz * 

California Institute of Technology Pasadena, CA 91125, USA 


Abstract 

Superstring theories and a recent extension called M theory are different facets of a unique
underlying theory. They are the leading candidates for a quantum theory that unifies gravity
with the other forces. As such, they are certainly not ordinary quantum field theories. However,
recent duality conjectures suggest that a more complete definition of these theories can
be provided by the large N limits of suitably chosen U(N) gauge theories associated to the
asymptotic boundary of spacetime.


Introduction 

Superstring theory first achieved widespread acceptance during the first superstring revolution in
1984-85. There were three main developments at this time. The first was the discovery of an
anomaly cancellation mechanism [1], which showed that supersymmetric gauge theories can be
consistent in ten dimensions provided they are coupled to supergravity and the gauge group is
either $SO(32)$ or $E_8 \times E_8$. Any other group necessarily would give uncanceled gauge anomalies and
hence inconsistency at the quantum level. The second development was the discovery of two new
superstring theories, called heterotic string theories, with precisely these gauge groups [2]. The
third development was the realization that the $E_8 \times E_8$ heterotic string theory admits solutions in
which six of the space dimensions form a Calabi-Yau space, and that this results in a 4d effective
theory at low energies with many qualitatively realistic features [3]. Unfortunately, there are very
many Calabi-Yau spaces and a whole range of additional choices that can be made. In any case,
after the first superstring revolution subsided, we had five distinct superstring theories, each in
ten dimensions. Three of them, the type I theory and the two heterotic theories, have $\mathcal{N} = 1$
supersymmetry in the ten-dimensional sense. The other two theories, called type IIA and type IIB,
have $\mathcal{N} = 2$ supersymmetry [4].

The understanding of these five superstring theories was developed in the ensuing years. In
each case it became clear, and was largely proved, that there are consistent perturbation
expansions of on-shell scattering amplitudes. In four of the five cases (heterotic and type II) the
fundamental strings are oriented and unbreakable. As a result, these theories have particularly
simple perturbation expansions. Specifically, there is a unique Feynman diagram at each order
of the loop expansion. The Feynman diagrams depict string world sheets, and therefore they are
two-dimensional surfaces. For these four theories the unique L-loop diagram is a closed orientable
genus-L Riemann surface, which can be visualized as a sphere with L handles. External (incoming
or outgoing) particles are represented by points (or “punctures”) on the Riemann surface. A
given diagram with N punctures represents a well-defined integral of dimension $6L + 2N - 6$. This
integral has no ultraviolet divergences, even though the spectrum contains states of arbitrarily high
spin (including a massless graviton). Type I superstrings are unoriented and breakable. As a result,
the perturbation expansion is more complicated for this theory, and the various world-sheet diagrams
at a given order have to be combined properly to cancel divergences and anomalies.
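
As a quick illustration of the dimension count quoted above, the following Python sketch evaluates 6L + 2N - 6 for a few low-order diagrams. The function name is hypothetical, and the restriction 2L + N >= 3 (the cases in which the punctured surface has no leftover continuous symmetries, so that the simple count applies) is an assumption added here rather than a statement from the text.

```python
def moduli_integral_dimension(loops: int, punctures: int) -> int:
    """Real dimension 6L + 2N - 6 of the integral for an L-loop diagram
    of oriented closed strings with N punctures (external particles).

    Assumption (not from the text): the count is only applied when
    2L + N >= 3, e.g. it is not used for the sphere with fewer than
    three punctures or for the torus with no punctures.
    """
    if loops < 0 or punctures < 0:
        raise ValueError("loops and punctures must be non-negative")
    if 2 * loops + punctures < 3:
        raise ValueError("formula assumed valid only for 2L + N >= 3")
    return 6 * loops + 2 * punctures - 6


if __name__ == "__main__":
    # Tree-level four-point amplitude: one complex position is integrated.
    print(moduli_integral_dimension(0, 4))
    # One-loop diagram with a single puncture.
    print(moduli_integral_dimension(1, 1))
    # Two-loop vacuum diagram.
    print(moduli_integral_dimension(2, 0))
```

The tree-level three-point case gives dimension zero, which matches the familiar statement that three punctures on a sphere can be fixed at arbitrary positions.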

*E.mail: JHS@THEORY.CALTECH.EDU 





M Theory 

In the 1970s and 1980s various supersymmetry and supergravity theories were constructed. (See [5],
for example.) Ten is the largest spacetime dimension in which there exist supersymmetric
Yang-Mills theories (with spins $\leq 1$) [6], and the largest possible spacetime dimension for a
supergravity theory (with spins $\leq 2$) is eleven. Eleven-dimensional supergravity [7] has three kinds
of fields: the graviton field (with 44 polarizations), the gravitino field (with 128 polarizations), and
a three-index antisymmetric tensor gauge field $C_{\mu\nu\rho}$ (with 84 polarizations). These massless
particles are referred to collectively as the supergraviton. 11d supergravity, which has attracted a lot
of attention over the years, is nonrenormalizable, and thus it is not a consistent quantum theory.
However, we now believe that it is a low-energy effective description of M theory, which is a
consistent quantum theory [8, 9].
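
The polarization counts quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard light-cone counting in which massless fields are classified by the transverse rotation group SO(d - 2), together with the fact that a Majorana spinor in eleven dimensions has 16 independent real components on shell.

```python
from math import comb


def graviton_polarizations(d: int) -> int:
    """Symmetric traceless tensor of the transverse group SO(d - 2)."""
    n = d - 2
    return n * (n + 1) // 2 - 1


def three_form_polarizations(d: int) -> int:
    """Rank-3 antisymmetric tensor of the transverse group SO(d - 2)."""
    return comb(d - 2, 3)


def gravitino_polarizations_11d() -> int:
    """Vector-spinor of SO(9) with the gamma-trace removed.

    Assumption: 16 real on-shell components for the 11d Majorana spinor.
    """
    on_shell_spinor_components = 16
    transverse_directions = 11 - 2
    return (transverse_directions - 1) * on_shell_spinor_components


if __name__ == "__main__":
    bosons = graviton_polarizations(11) + three_form_polarizations(11)
    fermions = gravitino_polarizations_11d()
    print(graviton_polarizations(11), three_form_polarizations(11), fermions)
    # Supersymmetry requires equal numbers of bosonic and fermionic states.
    print(bosons == fermions)
```

The equality of the bosonic total (44 + 84) with the fermionic count (128) is the matching one expects for a supersymmetric multiplet.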

To explain the relation between M theory and type IIA string theory, a good approach is to
identify the parameters that characterize each of them and to explain how they are related.
Eleven-dimensional supergravity (and hence M theory, too) has no dimensionless parameters. The only
parameter is the 11d Newton constant, or (equivalently) the 11d Planck mass $m_p$. When M theory
is compactified on a circle (so that the spacetime geometry is $R^{10} \times S^1$), another parameter is the
radius R of the circle. The parameters of type IIA superstring theory are the string mass scale $m_s$
and the dimensionless string coupling constant $g_s$. An important fact about all five superstring
theories is that the coupling constant is not an arbitrary parameter. Rather, it is dynamically
determined as the value of a scalar field, called the dilaton, which is a supersymmetry partner of
the graviton.

We can identify compactified M theory with type IIA superstring theory by making the following 
correspondences: 

$$m_s^2 = 2\pi R\, m_p^3 \tag{1}$$

$$g_s = 2\pi R\, m_s. \tag{2}$$

Conventional string perturbation theory is an expansion in powers of $g_s$ at fixed $m_s$. Equation (2)
shows that this is equivalent to an expansion about $R = 0$. In particular, the strong-coupling limit
of type IIA superstring theory corresponds to decompactification of the eleventh dimension, so in a
sense M theory is type IIA string theory at infinite coupling. (The $E_8 \times E_8$ heterotic string theory
is also eleven-dimensional at strong coupling [10].) This explains why the eleventh dimension was
not discovered in studies of string perturbation theory. These relations encode some interesting
facts. For example, the fundamental IIA string actually is a two-dimensional membrane (called the
M2-brane) of M theory with one of its dimensions wrapped around the circular spatial dimension.
Denoting the string and membrane tensions (energy per unit volume) by $T_{F1}$ and $T_{M2}$, one deduces
that $T_{F1} = 2\pi R\, T_{M2}$. However, $T_{F1} = 2\pi m_s^2$ and $T_{M2} = 2\pi m_p^3$. Combining these relations gives
eq. (1). It should be emphasized that all these formulas are exact, due to the large amount of
unbroken supersymmetry.

Type II superstring theories contain a variety of p-brane solutions that preserve half of the 32
supersymmetries [11]. These are solutions in which the energy is concentrated on a p-dimensional
spatial hypersurface. (Adding the time dimension, the world volume of a p-brane has p + 1
dimensions.) A large class of these p-brane excitations are called D-branes (or Dp-branes when we want
to specify the dimension), which have a number of special properties and are especially interesting.
By definition, they are branes on which strings can end (D stands for Dirichlet boundary
conditions). The end of a string carries a charge, and the D-brane world-volume theory contains a U(1)
gauge field that carries the associated flux. When N Dp-branes are coincident, or parallel and
nearly coincident, the associated (p + 1)-dimensional world-volume theory is a U(N) gauge theory.
The gauge fields $A_\mu^{ij}$ and their supersymmetry partners arise as the ground states of oriented
strings running from the ith Dp-brane to the jth Dp-brane. The diagonal elements, belonging to
the Cartan subalgebra, are massless. The field $A_\mu^{ij}$ with $i \neq j$ has a mass proportional to the
separation of the ith and jth branes. This separation is described by the value of a corresponding
scalar field in the world-volume theory.





Some D-branes have a simple M theory interpretation. In particular, the D2-brane of the type
IIA theory corresponds to the M2-brane, but now in a background geometry in which one of the
transverse dimensions is a circle. The first Kaluza-Klein excitation of the 11d supergraviton has
mass $1/R$. It can be identified with the D0-brane, which accounts for eq. (2). More identifications
of this type arise when we consider the magnetic dual of the M theory supermembrane. It turns
out to be a five-brane, called the M5-brane. (In general, the magnetic dual of a p-brane in d
dimensions is a $(d - p - 4)$-brane.) For example, wrapping one of the M5-brane dimensions around
the spatial circle gives the D4-brane.
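
The rule for magnetic duals stated in parentheses above is simple enough to tabulate. The following Python sketch is a minimal illustration with a hypothetical function name; the ten-dimensional D-brane pairings printed at the end follow from applying the same rule with d = 10 and are included only as examples, not as claims made in the text.

```python
def magnetic_dual_dimension(p: int, d: int) -> int:
    """Spatial dimension of the magnetic dual of a p-brane in d spacetime
    dimensions, using the rule (d - p - 4) quoted in the text."""
    dual = d - p - 4
    if p < 0 or dual < 0:
        raise ValueError("no dual brane of non-negative dimension")
    return dual


if __name__ == "__main__":
    # M theory (d = 11): the supermembrane is dual to a five-brane.
    print("M2 ->", magnetic_dual_dimension(2, 11))
    # Illustration in d = 10: D0/D6, D2/D4, and the self-dual D3.
    for p in (0, 2, 3):
        print(f"D{p} -> D{magnetic_dual_dimension(p, 10)}")
```

Applying the rule twice returns the original dimension, which is a useful sanity check on any table built this way.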

AdS/CFT Duality 

Let me now turn to an even more recent development, which goes by the name of AdS/CFT duality.
Here, AdS stands for anti de Sitter space and CFT stands for conformal field theory. AdS/CFT
duality was proposed by Maldacena in November 1997 [13]. As is usually the case with such
developments, there were important prior [14] and subsequent [15] contributions by many others.

A p-brane, or collection of p-branes, gives rise to a certain space-time geometry and gauge
field configuration, which can be analyzed using the appropriate supergravity field equations. In
a number of cases one finds that the geometry has an event horizon, giving a higher-dimensional
analog of black holes. In some of these cases the geometry near the horizon is approximated by
$AdS_{p+2} \times S^{d-p-2}$. This means that the AdS space has p + 2 dimensions and the remainder of
the d dimensions form a sphere of d - p - 2 dimensions. There are three basic examples that
have maximal supersymmetry (32 conserved supercharges). A stack of D3-branes in type IIB
superstring theory has near horizon geometry $AdS_5 \times S^5$, a stack of M2-branes in M theory gives
$AdS_4 \times S^7$, and a stack of M5-branes in M theory gives $AdS_7 \times S^4$. These solutions to type IIB
and 11d supergravity were discovered in the mid-1980s, but were not pursued in the context of
superstring/M theory until recently.

The basic idea of AdS/CFT duality is to identify a conformally invariant field theory (CFT)
on the n-dimensional boundary with a suitable quantum gravity theory in the (n + 1)-dimensional
AdS bulk. The SO(2, n) isometries of the (n + 1)-dimensional anti de Sitter space induce the group
of conformal transformations on its n-dimensional Minkowski boundary. (Strictly speaking, the
boundary should be compactified by adding a point at infinity.) The conformal group is therefore
also SO(2, n).
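
One simple consistency check on this identification is that the two groups have the same number of generators. The Python sketch below, with hypothetical function names, compares the dimension of SO(2, n) with a direct count of conformal generators (translations, Lorentz transformations, the dilatation, and special conformal transformations). It assumes n >= 3, where the conformal algebra is finite-dimensional and has exactly these generators.

```python
def so_dimension(p: int, q: int) -> int:
    """Number of generators of SO(p, q): antisymmetric (p+q) x (p+q) matrices."""
    n = p + q
    return n * (n - 1) // 2


def conformal_generator_count(n: int) -> int:
    """Generators of the conformal group of n-dimensional Minkowski space
    (assumed n >= 3)."""
    if n < 3:
        raise ValueError("count assumed for n >= 3")
    translations = n
    lorentz = n * (n - 1) // 2
    dilatation = 1
    special_conformal = n
    return translations + lorentz + dilatation + special_conformal


if __name__ == "__main__":
    for n in (3, 4, 6):
        print(n, so_dimension(2, n), conformal_generator_count(n))
```

For the four-dimensional boundary relevant to the D3-brane example, both counts give 15.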

To be specific, from now on I will focus on the $AdS_5 \times S^5$ solution of the IIB superstring theory.
The IIB theory contains a four-index field $A_{\mu\nu\rho\lambda}$ for which the D3-brane is a source. It has a field
strength $F_{\mu\nu\rho\lambda\sigma}$, which is self-dual (in ten dimensions). In the $AdS_5 \times S^5$ solution of the theory,
the field F has a quantized flux on the sphere. Schematically,

$$\int_{S^5} F = N, \tag{3}$$

where N is a positive integer. This integer determines the radius R of the $AdS_5$ and of the $S^5$,
which are the same. Aside from a constant numerical factor, one finds that

$$R = (g_s N)^{1/4}\, \ell_s. \tag{4}$$

Thus the curvatures (which are proportional to $R^{-2}$) are small compared to the string scale for
$g_s N \gg 1$ and small compared to the Planck scale for $N \gg 1$. The first limit suppresses stringy
corrections to supergravity, whereas the latter suppresses quantum corrections to classical string
theory. The conjecture is that type IIB superstring theory on $AdS_5 \times S^5$ with N units of F flux
is equivalent to $\mathcal{N} = 4$, $D = 3 + 1$ $U(N)$ Yang-Mills theory with $g_{YM}^2 = g_s$. For this conjecture to
be plausible, it is a crucial fact that the $\mathcal{N} = 4$ super Yang-Mills theory is a CFT with vanishing beta
function, a fact that was proved in the early 1980s. This duality, if true, implies an amazing fact:
the 4d gauge theory, for large N, is actually a 10d string theory! Well, it is not yet “proved,” but
the evidence is mounting rapidly. For example, the symmetries match: the two dual theories have
the same symmetry supergroup $SU(2,2|4)$.





The AdS/CFT duality conjecture has been made more precise in [15]. These papers have 
proposed a mapping between the bulk string theory and the boundary gauge theory. It gives a one- 
to-one correspondence between on-shell particles of the bulk theory and gauge-invariant operators 
of the boundary theory. Moreover, correlation functions of these gauge-invariant operators are 
related to the response of the type IIB theory to boundary conditions for the associated fields. 
These correspondences have been partially verified. For example, there is a perfect correspondence 
between particles belonging to short representations of the AdS supersymmetry algebra and chiral 
primary operators of the gauge theory. 

The large N limit of SU(N) gauge theories for fixed $\lambda = g_{YM}^2 N$ was studied long ago by
't Hooft [16]. He showed that only Feynman diagrams of planar topology contribute in this limit.
Moreover, he conjectured that the theory should exhibit a stringy behavior in this limit. Now, this
suggestion has been made precise. In principle, the complete $\lambda$ dependence of $\mathcal{N} = 4$ gauge theory
in the 't Hooft limit should be given by classical type IIB superstring theory on $AdS_5 \times S^5$. Many
people are currently studying this.

An important concept that has emerged in recent years, called the holographic principle [17], is 
incorporated by AdS/CFT duality. This concerns the number and location of degrees of freedom 
in a theory. In a local quantum field theory, the locality implies that the number of degrees of 
freedom in a spatial region is proportional to its volume. However, this cannot be correct for a 
quantum gravity theory, where the maximum entropy in a region is proportional to its surface 
area. (This bound is saturated in the case of a black hole.) So the idea of the holographic principle 
is that the physics in a region of space can be encoded holographically on a surface that surrounds 
it. This is what happens in the case of AdS/CFT duality. The physics of the AdS bulk (given by 
superstring theory) is not a local QFT; rather, it is projected onto the boundary theory, which is 
a local QFT. 


Important Unresolved Issues 

One issue that needs to be settled if superstring theory is to be used for phenomenology is where 
supersymmetry fits into the story. It is clear that at the string scale (« 10^^ GeV) the underlying 
theory has maximal supersymmetry (32 conserved supercharges). The question that needs to be 
answered is at what scales they are broken and by what mechanisms. The traditional picture 
(which looks the most plausible to me) is that at the compactification/GUT scale (?« 10^^ GeV) 
the symmetry is broken to $\mathcal{N} = 1$ in d = 4 (four conserved supercharges), and this persists to
the TeV scale, where the final susy breaking occurs. The TeV scale is indicated by three separate
arguments: the gauge hierarchy problem, supersymmetric grand unification, and the requirement
that the lightest superparticle (LSP) be a cosmologically significant component of dark matter.
It would be astonishing if this coincidence turned out to be a fluke. Susy GUTs are also able to
account for the masses of the top and bottom quarks and the structure of electroweak symmetry
breaking. Despite all these indications, we cannot be certain that this picture is correct until it is
demonstrated experimentally. As I once told a newspaper reporter: discovery of supersymmetry 
would be more profound than life on Mars. 

Another important issue is the problem of vacuum degeneracy and the stabilization of moduli. 
Let me explain. The underlying theory is completely unique, with no dimensionless parameters. 
Nevertheless, the effective potential of typical string theory quantum vacua has many flat directions,
so there is a continuum, or moduli space, of minima. This results in parameters, called moduli,
which characterize the vacuum values of scalar fields. These fields correspond to massless spin zero
particles. Notable examples of moduli are the sizes of extra dimensions and the string coupling
constant. These spin zero particles typically interact with roughly gravitational strength, which is
a problem because the gravitational force is observed to be pure tensor to better than 1% accuracy. 
So it seems that we should seek a vacuum without moduli, which is very difficult to do. However, 
if a vacuum without massless scalars is ever found, it will not have any continuously adjustable 
parameters, and therefore it will be completely predictive (at least in principle). 

Perhaps the most challenging unresolved issue of all is the cosmological constant $\Lambda$. This is a





term in the effective action that describes the energy density of the vacuum, which is observable in 
a gravitational theory. Observationally, there are indications that it may be nonzero, but in any 
case it is extremely small compared to ordinary particle physics scales. < 10“"^^ GeV.) In a 
fundamental theory $\Lambda$ receives contributions from many sources such as vacuum condensates and
zero-point energies. Supersymmetry ensures that boson and fermion zero-point energies cancel,
so the natural scale for $\Lambda$ would seem to be the TeV susy breaking scale, which is many orders
of magnitude too high. This is a fine-tuning problem that is reminiscent of the gauge hierarchy 
problem. Presumably string theory will provide an elegant solution. Until we know what the right 
mechanism is, it is hard to be confident that there is not an alternative to supersymmetry for 
solving the gauge hierarchy problem. I believe that when the correct solution to the problem of 
the cosmological constant is found, it will spark another revolution in our understanding. 


Conclusion 

To conclude, there has been dramatic progress in understanding string theory in the past few 
years. The discoveries of nonperturbative phenomena, M theory, D-branes, and AdS/CFT duality
have led to many important advances, including an explanation of black hole entropy and exact
nonperturbative results in supersymmetric gauge theories. Further theoretical breakthroughs are
still needed, however. Also, future experimental discoveries will be essential to help guide our 
thinking. Sooner (Tevatron or LEP) or later (LHC) exciting new phenomena are bound to show 
up. My bet is on Higgs and superparticles. But if I should turn out to be wrong, that would not 
mean that string theory is wrong. 

This work was supported in part by the U.S. Dept. of Energy under Grant No.
DE-FG03-92-ER40701.





References 

[1] Green, M.B., and Schwarz, J.H., Phys. Lett. 149B (1984) 117.

[2] Gross, D.J., Harvey, J.A., Martinec, E., and Rohm, R., Phys. Rev. Lett. 54 (1985) 502.

[3] Candelas, P., Horowitz, G.T., Strominger, A., and Witten, E., Nucl. Phys. B258 (1985) 46.

[4] Green, M.B., and Schwarz, J.H., Phys. Lett. 109B (1982) 444.

[5] Salam, A., and Sezgin, E., eds., Supergravities in Diverse Dimensions, reprints in 2 vols.,
World Scientific (1989).

[6] Brink, L., Schwarz, J.H., and Scherk, J., Nucl. Phys. B121 (1977) 77; Gliozzi, F., Scherk,
J., and Olive, D., Nucl. Phys. B122 (1977) 253.

[7] Cremmer, E., Julia, B., and Scherk, J., Phys. Lett. 76B (1978) 409.

[8] Townsend, P.K., Phys. Lett. B350 (1995) 184, hep-th/9501068.

[9] Witten, E., Nucl. Phys. B443 (1995) 85, hep-th/9503124.

[10] Hořava, P., and Witten, E., Nucl. Phys. B460 (1996) 506, hep-th/9510209.

[11] Horowitz, G.T., and Strominger, A., Nucl. Phys. B360 (1991) 197.

[12] Polchinski, J., Phys. Rev. Lett. 75 (1995) 4724, hep-th/9510017.

[13] J.M. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231, hep-th/9711200.

[14] I.R. Klebanov, Nucl. Phys. B496 (1997) 231, hep-th/9702076; S.S. Gubser, I.R. Klebanov,
and A.A. Tseytlin, Nucl. Phys. B499 (1997) 217, hep-th/9703040; S.S. Gubser and I.R.
Klebanov, Phys. Lett. B413 (1997) 41, hep-th/9708005; A.M. Polyakov, hep-th/9711002.

[15] S.S. Gubser, I.R. Klebanov, and A.M. Polyakov, Phys. Lett. B428 (1998) 105,
hep-th/9802109; E. Witten, Adv. Theor. Math. Phys. 2 (1998) 253, hep-th/9802150.

[16] G. ’t Hooft, Nucl. Phys. B72 (1974) 461.

[17] G. ’t Hooft, gr-qc/9310026; L. Susskind, J. Math. Phys. 36 (1995) 6377, hep-th/9409089.



24. Recent Developments In String Theory 

Jnanadeva Maharana * 

Institute of Physics Bhubaneswar - 751005 India 


Abstract 

The purpose of this short review is to present progress in string theory in the recent
past. There have been very important developments in our understanding of string dynamics,
especially the nonperturbative aspects. In this context, dualities play a cardinal role.
String theory provides a deeper understanding of the physics of a special class of black holes
from a microscopic point of view and has resolved several important questions. It is also
recognized that M-theory provides a unified description of the five perturbatively consistent
string theories. The article covers some of these aspects and highlights important progress
made in string theory.

Contents 

1. Introduction

2. Perturbative Aspects of String Theory

3. Duality Symmetries in String Theory

4. M-theory and Unified String Dynamics

5. Black holes and String Theory

6. M-theory and the Matrix model

7. Anti-de Sitter Space and Boundary Field Theory Correspondence

8. Summary and Conclusions

9. References


*Email: maharana@iopb.res.in





1 Introduction 

Throughout the progress of natural philosophy, curious minds have asked deep questions pertaining
to the fundamental constituents of matter and the creation and evolution of the cosmos. In the
modern era, physicists have endeavored to comprehend natural phenomena in terms of a simple set of
principles. Therefore, the search has continued to discover the elementary constituents of matter
and identify the fundamental forces responsible for natural phenomena. It is accepted that
there are four fundamental forces: gravitation, the weak interaction, electromagnetism and the
strong interaction. The unification of the fundamental interactions has remained one of the most
outstanding challenges for generations of physicists. In the latter half of this century, some progress
has taken place in this direction through the electroweak unification scheme. The electroweak
theory together with quantum chromodynamics (QCD), referred to as the standard model, has
been tested to a great degree of accuracy. Thus, the standard model provides a very good
description of the ‘low energy physics’, comprising the spectrum of elementary particles and their
dynamics. The next step in fulfilling the dream of unification of forces was the scheme of grand
unification (GUT), which attempted to incorporate the three fundamental interactions, leaving
aside the gravitational interaction. QED has been tested to a great degree of accuracy, and the two
most important characteristics of that theory are invariance under local gauge transformations
and renormalizability. The electroweak theory and QCD respect the principle of gauge invariance
and are renormalizable. Moreover, it is well known that Einstein’s theory of general relativity
respects a local symmetry: invariance under general coordinate transformations. However, the
theory is not renormalizable since Newton’s constant carries dimension of $(\mathrm{mass})^{-2}$, unlike
the gauge coupling constants of the standard model, which are dimensionless.

Although the standard model has successfully passed many stringent experimental tests, it is
recognised that one must seek a more fundamental theory. The standard model has many arbitrary
parameters: the gauge coupling constants, the coupling constants of the scalars, and the Yukawa
couplings of the Higgs bosons to fermions, which are eventually responsible for generating fermion
masses, just to mention a few. Furthermore, when one extrapolates the gauge coupling constants
towards higher energy scales utilizing the renormalization group equations, there is evidence that
the three coupling constants tend to converge to a point, and it is natural to conclude that beyond
that scale there might be a unified description of the standard model. Therefore, these observations
lend support to the proposal of GUTs put forward in the early seventies. As is well known, the existence of
electroweak scale in the TeV region and another unification scale in the neighbourhood of 10^® to 
10^^ GeV leads to issues related to fine tuning of parameters, known as gauge hierarchy problem. 
The gauge hierarchy problem can be resolved in an elegant manner if one envisages a supersymmetric
version of the standard model (moreover, the convergence of the gauge coupling constants at the
unifying scale is more favourable in supersymmetric theories; see Mohapatra’s article in this volume for
details). Supersymmetric theories were constructed so that bosons and fermions can belong
to the same supermultiplet. Supersymmetry appeared in two dimensions in the construction of string
theories. While attempts were being made to construct various types of grand unified theories,
there were developments in incorporating gravity into supersymmetric theories, which resulted in
the discovery of supergravity theories. However, it was not possible to construct renormalizable field
theories which could unify the four fundamental forces. At the beginning of the eighties, many
physicists perceived that radical new ideas were required to unify the fundamental
interactions.

It is now accepted that string theory holds the promise of unifying all the fundamental interactions.
The progress of string theory in diverse directions during the last fifteen years has been truly
spectacular. The theory has not only brought us nearer to the dream of unification, but has also
influenced our understanding of various aspects of quantum gravity and has revealed many
beautiful features relevant to the nonperturbative aspects of field theories.

String theory was invented to describe the dynamics of strongly interacting particles. The vast
amount of experimental data amassed from high energy accelerators during the fifties and sixties led
to the discovery of a large number of hadronic resonances. One of the interesting characteristics of
those resonances was that when one plots the squared mass against the spin of these particles,
families of resonances tend to lie on a straight line, known as the Chew-Frautschi plot. It was also
evident from the high energy behaviour of the scattering cross sections of hadrons that they follow a
power law, i.e. the crossed channel Regge poles control the cross sections at high energies. The
duality relation in strong interactions, namely that the sum over direct channel resonances (from low
energy data) reproduces the Regge amplitude, was an important discovery for the construction of
dual models. Veneziano [1] took a crucial step when he proposed a four point amplitude which
satisfied the requirements of duality and crossing symmetry.


$$T(s,t) = B(s,t) + B(t,u) + B(u,s) \tag{1.1}$$

where

$$B(s,t) = \frac{\Gamma(-\alpha(s))\,\Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))} \tag{1.2}$$

and $\alpha(s) = \alpha_0 + \alpha' s$ is the parametrization of the linear Regge trajectory. Here $s$, $t$ and $u$ refer to
the Mandelstam variables; in the center of mass frame, $s$ is the square of the c.m. energy, while
$t$ and $u$ are related to the c.m. scattering angle. The B-function has the integral representation

$$B(s,t) = \int_0^1 dx\, x^{-\alpha(s)-1} (1-x)^{-\alpha(t)-1} \tag{1.3}$$
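The agreement between the Gamma function form (1.2) and the integral representation (1.3) can be checked numerically. The following is a minimal sketch, assuming trajectory values with $\alpha(s) < 0$ and $\alpha(t) < 0$ so that the integral converges; the function names and the sample values are illustrative and do not come from the text.

```python
import math

def beta_gamma(a, b):
    """Euler Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_integral(a, b, n=200_000):
    """Midpoint-rule estimate of the integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** (a - 1) * (1 - x) ** (b - 1)
    return h * total

def veneziano_b(alpha_s, alpha_t):
    """B(s, t) of (1.2), written in terms of the trajectory values alpha(s), alpha(t)."""
    return beta_gamma(-alpha_s, -alpha_t)

# Illustrative (assumed) trajectory values, chosen negative for convergence.
alpha_s, alpha_t = -1.5, -2.5
from_gamma = veneziano_b(alpha_s, alpha_t)
from_integral = beta_integral(-alpha_s, -alpha_t)
```

For these sample values both expressions evaluate to $\pi/16$, up to the discretization error of the midpoint rule.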

Subsequently, generalized N-point amplitudes satisfying the requirements of duality and crossing
symmetry were proposed by several authors [2]; one such amplitude is

$$F(p_1, \ldots, p_N) = \int |w_l - w_m|\,|w_m - w_n|\,|w_n - w_l| \int dw_1 \ldots dw_N \prod_{i<j} |w_i - w_j|^{2\alpha' p_i \cdot p_j} \tag{1.4}$$

The $w_i$ are ordered cyclically; $w_l$, $w_m$ and $w_n$ are any three of the variables of the set $\{w_i\}$, but are
held fixed. As in the case of the 4-point Veneziano amplitude, the full N-point amplitude is the sum over
all cyclically inequivalent permutations. It was realized that it is possible to represent the N-point
function in a path integral form [3]


$$F_N \sim \int \prod_{\mu,\sigma} dX^\mu(\sigma) \int dw_1 \ldots dw_N \exp\left( -\frac{T}{2} \int_{\sigma^2 > 0} d^2\sigma\, \partial_a X^\mu \partial_a X_\mu + i \sum_i p_i \cdot X(w_i) \right) \tag{1.5}$$

where $\partial_a X^\mu = \partial X^\mu / \partial \sigma^a$, and $\sigma^1$ and $\sigma^2$ are the coordinates of a point in the upper half plane; $X^\mu(\sigma)$ are integrated
over all functions of $\sigma$. The boundary condition on $X^\mu$ is $\partial_2 X^\mu = 0$ for $\sigma^2 = 0$. The constant
$T = 1/(2\pi\alpha')$ is identified as the tension of the string. Note the presence of $X^\mu(w_i)$; it
is the value of $X^\mu(\sigma^1, \sigma^2)$ on the line $\sigma^2 = 0$ at $\sigma^1 = w_i$. The connection between dual amplitudes
and the dynamics of a relativistic string was recognised by several authors independently [4]. Now, of
course, we know that this amplitude is obtained from an open string theory: the action is that
of a string, there are vertex operators in the path integral formula, and the open string boundary
conditions are to be specified. Virasoro constructed another 4-point amplitude [5] fulfilling the
requirements of duality and crossing symmetry in a sequel to Veneziano’s paper, and the generalization
of that amplitude to N-particle scattering was derived with a path integral representation [6]. It
was realized that the Virasoro-Shapiro amplitude could be obtained from a closed string theory.
Finally, Nambu proposed the action for the string so that one could start studying the dynamics
of the string and proceed to examine the consequences of its quantization.

String theory as a theory of strong interaction dynamics was not free from shortcomings. While
attempts were going on to rectify the pitfalls of the theory and to construct new string theories as
models of strong interactions, QCD was proposed as the fundamental theory of strongly interacting
particles. The theory described the interactions of the fundamental constituents of the hadrons, the quarks,
with the gluon as the carrier of the force. Furthermore, the experimental data steadily confirmed the predictions
of QCD and, consequently, string theory as a theory of strong interactions was no longer at
center stage.

In 1974, Joel Scherk and John Schwarz [7] made the bold proposal that string theory should be
envisaged as a theory of gravity, since a massless spin two particle appears naturally in the closed
string spectrum, and that this theory might be a vehicle to achieve the goal of unification of the forces of
Nature. If string theory were to incorporate the gravitational interaction, then the string tension
should be of the order of the Planck scale, in contrast to the tension of the original string, which was
of the order of one GeV, the scale of hadronic interactions determined from the slope of the Regge
trajectories. It was realised that one has to go up nineteen orders of magnitude in the energy scale
if the Scherk-Schwarz proposal was to be realized. At that time, this radical idea did not receive
widespread acceptance amongst theoretical high energy physicists. The crucial work of Green and
Schwarz [8] in the summer of 1984 led to the conclusion that 10-dimensional super Yang-Mills theories
coupled to supergravity can be consistently constructed and are free from all anomalies [9] only for
the gauge groups $SO(32)$ and $E_8 \times E_8$. The results of Green and Schwarz had a profound impact
on the field of high energy physics. It was recognised that string theory could fulfill the cherished
dream of unifying the fundamental forces. The construction of the heterotic string theory [10] was a
very important breakthrough towards the realization of this goal, since it had the desired gauge groups,
i.e. $SO(32)$ or $E_8 \times E_8$, depending on the construction one adopted. The ten dimensional theory
had chiral fermions and $N = 1$ supergravity coupled to supersymmetric Yang-Mills with the appropriate
gauge groups. Moreover, when the $E_8 \times E_8$ heterotic string theory was compactified to four
dimensions on a Calabi-Yau manifold, the resulting theory was shown to possess several desirable
features that one expected from some of the grand unified theories. Furthermore, it was possible
to demonstrate that the standard model gauge groups $SU(3) \times SU(2) \times U(1)$ were contained in
such four dimensional theories. Indeed, optimistically, one could feel that a unified theory was in
sight, and string theory was popularly named the ‘Theory of Everything’.

Let us recapitulate some of the essential features of string theory. The string is a one dimensional
object which executes motion in spacetime. There are, broadly speaking, two types of strings:
open and closed. As the name suggests, the ends of open strings are free (there are special
types whose ends might get stuck to some surfaces, and they play very important roles too), and
the end points are required to satisfy suitable boundary conditions. The closed string, by
definition, has both its ends glued together, forming a loop. It is well known that when a point
particle evolves in spacetime, it traces out a trajectory describing its history. An open
string sweeps out a two dimensional surface, and similarly the closed string sweeps out a surface which
is that of a cylinder. The natural question is why we do not observe these strings in high energy
collisions. The answer lies in the fact that the strings are much smaller in size
than the present accelerators can probe. If we had accelerators with energies of the
order of $10^{19}$ GeV, then it would be possible to observe the dynamics of the strings directly and test
the predictions of string theory at Planckian energies. In contrast, the present day accelerators
have energies of the order of a TeV, almost 16 orders of magnitude below the string scale.
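The two gaps in scale quoted in this section follow from simple arithmetic. The sketch below uses only the order-of-magnitude figures stated in the text (1 GeV, 1 TeV and $10^{19}$ GeV); the constant and function names are illustrative.

```python
import math

# Order-of-magnitude scales quoted in the text, all in GeV.
HADRONIC_SCALE_GEV = 1.0      # tension scale of the original hadronic string
ACCELERATOR_SCALE_GEV = 1e3   # present day accelerators, "of the order of a TeV"
STRING_SCALE_GEV = 1e19       # Planck-scale string tension

def orders_of_magnitude(high, low):
    """Number of powers of ten separating two energy scales."""
    return round(math.log10(high / low))

hadronic_to_string = orders_of_magnitude(STRING_SCALE_GEV, HADRONIC_SCALE_GEV)
accelerator_to_string = orders_of_magnitude(STRING_SCALE_GEV, ACCELERATOR_SCALE_GEV)
```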

The string has tension and it vibrates in an infinite number of modes. We identify each mode of the
string with a particle. The string has a lowest mode, which we identify with a
particular particle. The next mode corresponds to an excited state, separated in energy
from the lowest mode by an amount set by the string tension: the separation between two neighbouring
levels is of the order of $10^{19}$ GeV (recall that for the hadronic models the excitations lay on Regge
trajectories and there the tension was of the order of a GeV). The string theories of interest to us contain
massless particles in their lowest mode. For example, the 10-dimensional heterotic string theory
has the graviton, antisymmetric tensor and dilaton, together with the super Yang-Mills multiplets
corresponding to the gauge groups $SO(32)$ or $E_8 \times E_8$, in its massless sector. Therefore, in the low
energy limit, string theory effectively reduces to a point particle field theory (this is when we
want to describe physics at present day accelerator energy scales). In other words, the zero
slope limits of string theories correspond to known field theories: superstring theories go over to
supergravity theories in this limit.

Now we give an outline of the rest of the article. Since it is to appear in a volume on ‘Field
Theory’, we shall avoid involved technical details. The field has progressed in diverse directions
and our strategy will be to adopt a course that highlights important developments. We shall attempt
to present different aspects of string theory in a pedagogical manner. In order to get across some
issues, known examples from field theory will be presented. There has been intense activity in
this field since 1984, when it was recognised that string theory is the most promising candidate for
unification of the forces of Nature. It is not possible to cover all the important literature in a vibrant
field like this within the framework of this article. I apologize in advance to all the authors
whose works have not been cited. There are two books which cover all the important aspects of
string theory in detail, besides several monographs and reprint collection volumes. The first one
[11], in two volumes, provides the foundation for string theory and includes the developments up to
1986. The second one [12] lays the emphasis on the progress made after the second superstring
revolution. I have listed some of the review articles written in the first phase of the developments
of string theory [13, 14, 15, 16, 17, 18]. There are a large number of review articles written in recent
times [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30].

The next section gives a brief review of
the perturbative aspects of string theory to familiarize the reader with well known results. First,
the string worldsheet action is introduced and the symmetries of the action are listed. A very
quick exposition is given of the solutions of the equations of motion and the mode expansions for the
string coordinates, and the essentials of the Virasoro algebra are recalled. The evolution of the string
in its massless background in the first quantized approach is discussed and the consequences of
conformal invariance are noted. Section III deals with the symmetries of string theory. The theory
is endowed with a rich symmetry structure in the target space besides the worldsheet symmetries.
We introduce the duality symmetries, since they play a very important role in our understanding
of string dynamics in various spacetime dimensions and they unravel the intimate connections
between different string theories. Section IV is devoted to the recent
efforts to unify string theories. Besides duality symmetry, spatially extended objects, generically
called p-branes, which appear as solutions of the effective action, are crucial to our understanding
of string dynamics and to testing some of the duality conjectures. We introduce some of the salient
features of these objects and provide simple examples of the solutions. The raison d'être for M-theory
is presented. We give an example of how compactification of M-theory provides connections
with the string theories and their various brane contents. The fifth section deals with issues related
to black holes that appear in string theory. Since string theory describes gravity, it is expected that
the theory will be able to provide insights into deep questions in quantum gravity. Indeed, some
of the issues in the physics of black holes have been resolved by string theory. It has been known for
more than two decades that a black hole is characterized by an entropy and the Hawking temperature,
from thermodynamic analogies. Moreover, the seminal work of Hawking demonstrated that
black holes radiate when quantum effects are taken into account. Recently, the black hole entropy
has been derived microscopically in the framework of string theory. Furthermore, the
absorption cross sections for incident particles and the distribution of Hawking radiation emitted
from a special class of stringy black holes have been evaluated from a microscopic theory. Section
VI contains a brief account of the M(atrix) model. The M(atrix) model proposal to describe M-theory
has drawn considerable attention. Some of the calculations in this model give surprising
agreement with results of supergravity theories. Moreover, when one considers compactification
of the model on tori, the resulting theory can be related to supersymmetric Yang-Mills theories
through duality. We discuss some of the features of the Maldacena conjecture in Section VII.
According to the conjecture, in a concrete form, if one considers N coincident 3-branes of type IIB
theory on $AdS_5 \times S^5$, then the correlation functions of supergravity on $AdS_5$ get related to
correlation functions of the $N = 4$ super Yang-Mills theory living on the boundary of $AdS_5$.
This is a rapidly developing area and we shall be content with some simple examples. In
the last section we present an overview of the field. We make a few remarks to convey to the reader
how the work in string theory has influenced research in other branches of physics.


2 Perturbative Aspects of String Theory 

We have outlined the historical background and the developments of string theory in its early
phase in the previous section. In this section, we shall present some of the essential features of
string theory, such as its quantization, the perturbative spectrum of the theory and the supersymmetric
version of string theory.

Nambu had proposed the action for a string in analogy with the action for a relativistic point
particle: the action for a point particle is an integral over a line element; the string action is
expected to be an integral over a surface.

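The Nambu-Goto action [34] was introduced almost three decades ago and has the form

$$S_{NG} = -T \int d^2\sigma \sqrt{(\dot{X} \cdot X')^2 - (\dot{X})^2 (X')^2} \tag{2.1}$$

where $\sigma$ and $\tau$ are the coordinates on the surface swept out by the string, called the ‘worldsheet’;
$\dot{X}^\mu = \partial X^\mu / \partial\tau$ and $X'^\mu = \partial X^\mu / \partial\sigma$, and we shall follow this definition all along unless specified otherwise.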

The equations of motion can be derived after specifying boundary conditions for the type of string
one is dealing with. One important point to be noted is that the theory described by the above
action satisfies two constraints

$$\Pi \cdot X' = 0, \qquad \Pi^2 + T^2 X'^2 = 0 \tag{2.2}$$

where $\Pi_\mu = \delta S_{NG} / \delta \dot{X}^\mu$ is the canonical momentum of $X^\mu$ obtained from this action. We reserve the
notation $P_\mu$ for the canonical momentum of the coordinate derived from the Polyakov action. We
shall elaborate on the significance of these constraints later.

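However, this form of the action was not very convenient for the quantization of the string, and
an alternative form of the action was proposed by Polyakov [35]

$$S = -\frac{T}{2} \int d^2\sigma \sqrt{-\gamma}\, \gamma^{ab} \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu} \tag{2.3}$$

Here $\gamma_{ab}$ is the worldsheet metric, $\gamma^{ab}$ is its inverse, $\gamma$ is the determinant of the worldsheet metric and $\eta_{\mu\nu}$ is
the flat space metric of the target space. The variation of the action with respect to $\gamma^{ab}$ results in
the worldsheet stress energy momentum tensor

$$T_{ab} = \partial_a X \cdot \partial_b X - \frac{1}{2} \gamma_{ab} \gamma^{cd} \partial_c X \cdot \partial_d X \tag{2.4}$$

$T_{ab} = 0$, since there is no kinetic term for $\gamma_{ab}$; i.e. the analogue of the Einstein-Hilbert piece, $\int d^2\sigma \sqrt{-\gamma}\, R$, is
a topological term. We can solve for $\gamma_{ab}$ from the above equation

$$\gamma_{ab} = \partial_a X^\mu \partial_b X^\nu \eta_{\mu\nu} \tag{2.5}$$

If we insert the above expression for the worldsheet metric into the Polyakov action, then we recover
Nambu’s action.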
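The recovery of Nambu's action can be verified pointwise on the worldsheet. The sketch below compares the two Lagrangian densities for one sample pair of tangent vectors, assuming a four dimensional target space with metric diag(-1, +1, +1, +1) purely for illustration; the sample vectors and function names are assumptions, not taken from the text.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed target-space metric, D = 4 for illustration

def dot(u, v):
    """Minkowski inner product u.v with the metric ETA."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))

def nambu_goto_density(x_dot, x_prime, tension=1.0):
    """Integrand of (2.1): -T sqrt((Xdot.X')^2 - Xdot^2 X'^2)."""
    radicand = dot(x_dot, x_prime) ** 2 - dot(x_dot, x_dot) * dot(x_prime, x_prime)
    return -tension * math.sqrt(radicand)

def polyakov_density_on_induced_metric(x_dot, x_prime, tension=1.0):
    """Integrand of (2.3) with gamma_ab set equal to the induced metric (2.5)."""
    g = [[dot(x_dot, x_dot), dot(x_dot, x_prime)],
         [dot(x_prime, x_dot), dot(x_prime, x_prime)]]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det],
           [-g[1][0] / det, g[0][0] / det]]
    # gamma^{ab} d_a X . d_b X, where d_a X . d_b X is again the induced metric
    contraction = sum(inv[a][b] * g[a][b] for a in range(2) for b in range(2))
    return -0.5 * tension * math.sqrt(-det) * contraction

# Sample tangent vectors: a timelike tau-derivative and a spacelike sigma-derivative.
x_dot = (1.0, 0.1, 0.2, 0.0)
x_prime = (0.0, 0.3, 1.0, 0.5)
ng = nambu_goto_density(x_dot, x_prime)
polyakov = polyakov_density_on_induced_metric(x_dot, x_prime)
```

The agreement rests on $\gamma^{ab}\gamma_{ab} = 2$ in two dimensions, which cancels the factor $1/2$ in the Polyakov action.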

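The action (2.3) has the following symmetry properties.

(a) Two dimensional reparametrization invariance

$$\delta\gamma_{ab} = \epsilon^c \partial_c \gamma_{ab} + \partial_a \epsilon^c \gamma_{bc} + \partial_b \epsilon^c \gamma_{ac} \tag{2.6}$$

and hence $\delta\sqrt{-\gamma} = \partial_a(\epsilon^a \sqrt{-\gamma})$. The string coordinate transforms as

$$\delta X^\mu = \epsilon^a \partial_a X^\mu \tag{2.7}$$

(b) Weyl invariance

$$\delta\gamma_{ab} = 2\Omega\gamma_{ab}, \qquad \delta X^\mu = 0 \tag{2.8}$$

(c) Poincaré invariance (in target space)

$$\delta X^\mu = \omega^\mu{}_\nu X^\nu + a^\mu, \qquad \delta\gamma_{ab} = 0 \tag{2.9}$$

where $\omega^\mu{}_\nu$ are antisymmetric parameters associated with the Lorentz transformation and $a^\mu$ are
the parameters of translation.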

Note that the Weyl invariance implies tracelessness of the two dimensional energy momentum ten- 
sor for the classical theory. The quantum invariance of this symmetry has far reaching consequences 
in string theory. 
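The tracelessness can be seen directly from the stress tensor (2.4), using only the fact that $\gamma^{ab}\gamma_{ab} = 2$ on a two dimensional worldsheet:

```latex
\gamma^{ab} T_{ab}
  = \gamma^{ab}\,\partial_a X \cdot \partial_b X
    - \tfrac{1}{2}\,\bigl(\gamma^{ab}\gamma_{ab}\bigr)\,\gamma^{cd}\,\partial_c X \cdot \partial_d X
  = \Bigl(1 - \tfrac{1}{2}\cdot 2\Bigr)\,\gamma^{cd}\,\partial_c X \cdot \partial_d X
  = 0 .
```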





If we make the orthonormal gauge choice for the worldsheet metric, $\gamma_{ab} = e^{\phi(\sigma,\tau)} \eta_{ab}$ with
$\eta_{ab} = \mathrm{diag}(-1, +1)$, the form of the Polyakov action simplifies, since $\sqrt{-\gamma}\, \gamma^{ab} = \eta^{ab}$ in this gauge. The
condition of the vanishing of $T_{ab}$ reduces to two constraints

$$(\dot{X} \pm X')^2 = 0 \tag{2.10}$$

These are the Virasoro constraints. They take the following form in the Hamiltonian formalism

$$P \cdot X' = 0, \qquad H = \frac{1}{2}(P^2 + T^2 X'^2) = 0 \tag{2.11}$$

where $P_\mu$ is the momentum conjugate to $X^\mu$ derived from the Polyakov action. It is easy to see that
the first constraint generates $\sigma$ translations on the worldsheet, whereas the latter, being the canonical
Hamiltonian, generates $\tau$ translations.

The equations of motion for the string coordinates, in the light-cone variables $\xi_+ = \tau + \sigma$ and
$\xi_- = \tau - \sigma$, are given by

$$\partial_+ \partial_- X^\mu = 0 \tag{2.12}$$

We note that the equation of motion is derived with the following boundary conditions: (i) $X^\mu(\tau, \sigma + 2\pi) = X^\mu(\tau, \sigma)$
for closed strings, and (ii) $X'^\mu = 0$ for $\sigma = 0$ and $\sigma = 2\pi$ in the case of open
strings, when we apply the variational method.
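Since $\partial_+ \partial_- = \frac{1}{4}(\partial_\tau^2 - \partial_\sigma^2)$, equation (2.12) is the two dimensional wave equation, and any sum of a function of $\tau + \sigma$ and a function of $\tau - \sigma$ solves it. The sketch below checks this with finite differences for one spacetime component; the particular profile (a shared zero mode plus one oscillator mode on each side) and all numerical values are assumptions made for illustration.

```python
import math

def x_left(u):
    """Illustrative left-mover: half of the zero mode plus a single m = 2 oscillation."""
    return 0.5 * (0.3 + 1.2 * u) + 0.7 * math.sin(2 * u)

def x_right(v):
    """Illustrative right-mover: the other half of the zero mode plus an m = 3 oscillation."""
    return 0.5 * (0.3 + 1.2 * v) + 0.4 * math.cos(3 * v)

def x(tau, sigma):
    """String coordinate as a sum of left-moving and right-moving parts."""
    return x_left(tau + sigma) + x_right(tau - sigma)

def wave_operator(f, tau, sigma, h=1e-3):
    """Central-difference estimate of (d^2/dtau^2 - d^2/dsigma^2) f."""
    d2_tau = (f(tau + h, sigma) - 2 * f(tau, sigma) + f(tau - h, sigma)) / h**2
    d2_sigma = (f(tau, sigma + h) - 2 * f(tau, sigma) + f(tau, sigma - h)) / h**2
    return d2_tau - d2_sigma

residual = wave_operator(x, 0.37, 1.1)
periodicity_defect = x(0.37, 1.1 + 2 * math.pi) - x(0.37, 1.1)
```

The closed string boundary condition (i) holds here because the oscillator modes carry integer frequencies and the two linear terms share the same coefficient, so that $\sigma$ drops out of their sum.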

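Let us illustrate the mode expansion for the closed string, starting from the equation of motion with
periodic boundary condition in $\sigma$. We first note that the string coordinate can be decomposed as
a sum of left-moving and right-moving coordinates,

$$X^\mu(\tau, \sigma) = X^\mu_L(\tau + \sigma) + X^\mu_R(\tau - \sigma) \tag{2.13}$$

Then the two can be expanded as follows:

$$X^\mu_L = \frac{1}{2} x^\mu + \frac{\alpha'}{2} p^\mu (\tau + \sigma) + i \sqrt{\frac{\alpha'}{2}} \sum_{m \neq 0} \frac{1}{m} \tilde\alpha^\mu_m e^{-im(\tau + \sigma)} \tag{2.14}$$

$$X^\mu_R = \frac{1}{2} x^\mu + \frac{\alpha'}{2} p^\mu (\tau - \sigma) + i \sqrt{\frac{\alpha'}{2}} \sum_{m \neq 0} \frac{1}{m} \alpha^\mu_m e^{-im(\tau - \sigma)} \tag{2.15}$$

The sum is over all integer values of $m$ ($m = 0$ is excluded) in the above equations, and $\alpha^\mu_m$ and $\tilde\alpha^\mu_m$
are the Fourier modes. Since $X^\mu_{L,R}$ are real, so are $x^\mu$ and $p^\mu$, and the Fourier modes satisfy

$$(\alpha^\mu_m)^* = \alpha^\mu_{-m}, \qquad (\tilde\alpha^\mu_m)^* = \tilde\alpha^\mu_{-m} \tag{2.16}$$

from the reality condition. For the closed string case, the classical Hamiltonian is given by

$$H = \frac{1}{2} \sum_m (\alpha_m \cdot \alpha_{-m} + \tilde\alpha_m \cdot \tilde\alpha_{-m}) \tag{2.17}$$

in terms of the Fourier modes, where the sum now includes $m = 0$ with $\alpha^\mu_0 = \tilde\alpha^\mu_0 = \sqrt{\alpha'/2}\, p^\mu$. The constraint $T_{ab} = 0$, obtained from the Polyakov action, takes
the form $T_{--} = \frac{1}{2}(\partial_- X)^2 = 0$ and $T_{++} = \frac{1}{2}(\partial_+ X)^2 = 0$ in terms of the light-cone coordinates,
after one has gone over to the ON gauge. It is more convenient to express these constraints in
terms of the Fourier modes introduced above and define the Virasoro generators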



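$$L_m = \frac{1}{2} \sum_n \alpha_{m-n} \cdot \alpha_n, \qquad \tilde{L}_m = \frac{1}{2} \sum_n \tilde\alpha_{m-n} \cdot \tilde\alpha_n \tag{2.18}$$

and

$$H = L_0 + \tilde{L}_0 \tag{2.19}$$

We can obtain the classical Poisson bracket relations amongst the $L_m$, and similarly for the set $\tilde{L}_m$, starting
from the canonical Poisson bracket between $X^\mu$ and $P_\mu$:

$$[L_m, L_n]_{PB} = -i(m - n) L_{m+n} \tag{2.20}$$

$$[L_m, \tilde{L}_n]_{PB} = 0 \tag{2.21}$$

The PB between the $\{\tilde{L}_m\}$ is the same as for the $\{L_m\}$. This is the classical Virasoro algebra.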
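With the usual replacement of a Poisson bracket by $-i$ times a commutator, (2.20) becomes $[L_m, L_n] = (m - n) L_{m+n}$. A quick way to see that this bracket structure is consistent is the standard representation by vector fields $l_m = -z^{m+1}\, d/dz$ acting on monomials. The sketch below checks the relation in that representation; note that this is a swapped-in realization chosen for simplicity, not the oscillator construction (2.18) used in the text, and the function names are illustrative.

```python
def apply_l(m, term):
    """Act with l_m = -z^(m+1) d/dz on the monomial c * z^k, stored as the pair (c, k)."""
    coefficient, power = term
    return (-coefficient * power, power + m)

def commutator(m, n, term):
    """[l_m, l_n] applied to a monomial."""
    first = apply_l(m, apply_l(n, term))
    second = apply_l(n, apply_l(m, term))
    assert first[1] == second[1]  # both orderings land on the same power of z
    return (first[0] - second[0], first[1])

def expected(m, n, term):
    """(m - n) l_{m+n} applied to the same monomial."""
    coefficient, power = apply_l(m + n, term)
    return ((m - n) * coefficient, power)

algebra_holds = all(
    commutator(m, n, (1, k)) == expected(m, n, (1, k))
    for m in range(-3, 4)
    for n in range(-3, 4)
    for k in range(-4, 5)
)
```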

We have noted earlier that string theory is endowed with local symmetries on the worldsheet
and the action is that of $D$ scalar fields in $1 + 1$ dimensions, since $\mu = 0, 1, \ldots, D - 1$ takes $D$ values.
When we proceed to quantize this theory, we encounter problems similar to those faced in the
quantization of gauge theories. In other words, we have to fix the gauge here too. One can choose to
work in a noncovariant gauge, which has the advantage of dealing with the physical degrees of freedom
directly, but at the price of losing manifest Lorentz covariance. On the other hand, one can adopt the
covariant quantization prescription with all its elegance and power. The light-cone quantization,
although noncovariant, is very useful and gives us a physical picture. As the first step, the classical
constraints are solved and one is left with a smaller number of variables. Recall that there were some
remnant symmetries after choosing the conformal (ON) gauge: $\xi_+ \to \Lambda_1(\xi_+)$ and $\xi_- \to \Lambda_2(\xi_-)$. One
can utilise this property to write

$$X^+ = x^+ + \alpha' p^+ \tau \tag{2.22}$$

Defining the light-cone string coordinates $X^\pm = X^0 \pm X^{D-1}$, one can impose the classical Virasoro
constraints $(\dot{X} \pm X')^2 = 0$. Thus $X^-$ is determined in terms of the rest of the (transverse) coordinates,
and in this process both $X^+$ and $X^-$ are totally eliminated and we are left with
$\{X^i\}$. Then the oscillators of these coordinates create states which can be identified with
particles with physical degrees of freedom only. So it gives us a physical picture of the states. However,
as mentioned earlier, and as is the case with noncovariant gauge fixing in QED or Yang-Mills
theories, Lorentz invariance must be checked explicitly. For the case of string theory, one is
required to construct the generators of Lorentz transformations and ensure that the generators satisfy
the Lorentz algebra. It is well known that this requirement is not fulfilled unless the string propagates
in 26-dimensional spacetime. On the other hand, if one adopts the covariant BRST procedure, it is
necessary to add the ghost term to the action and construct the corresponding Virasoro generators
for the ghosts. Thus the full Virasoro generator is a sum of the contributions coming from the string
coordinates and those from the ghosts. When we compute the quantum Virasoro algebra, there is
an anomaly of 26 from the ghost sector, which gets precisely cancelled if the spacetime dimension
is 26, since each bosonic degree of freedom contributes one unit to the anomaly with a sign
just opposite to that coming from the ghosts.

There is an infinite tower of states in string theory. It is useful to arrange them according to their
oscillator levels. Notice that the worldsheet degrees of freedom of the string are envisaged as a
collection of an infinite number of harmonic oscillators. For the creation operator of one of these
oscillators, we define its level as the number of units of worldsheet momentum created
by this operator when acting on the vacuum. The total oscillator level of a
state is then the sum of the levels of all the oscillators acting on the Fock vacuum to create this
state. For the free string, the coordinates can be decomposed into left moving and right moving
sectors. Therefore, one can define left and right moving oscillator levels (the same decomposition
is valid when we add fermionic degrees of freedom). Thus one can write $L_0 = \frac{1}{2}(E + P)$ and
$\tilde{L}_0 = \frac{1}{2}(E - P)$, where $E$ and $P$ are the worldsheet energy and momentum respectively. Therefore,
$L_0$ and $\tilde{L}_0$ get contributions from the oscillators and from the Fock vacuum. We may remark in
passing that the momenta of the spacetime $D$-dimensional theory ($25 + 1$ for the bosonic string and
$9 + 1$ for the superstring) are the ones conjugate to the zero modes of the bosonic and/or fermionic
worldsheet theory. The ground state of the closed bosonic string is a tachyon satisfying
the relation $\alpha' M^2 = -4$, with $\alpha' = 1/(2\pi T)$. The first excited (massless) states of the closed string are:

♦ Spin 2 state, identified as the graviton.

♦ An antisymmetric tensor field.

♦ A scalar, called the dilaton.

They belong to irreducible representations of the $SO(24)$ group. These states are created by the
action of a single creation operator from the left moving sector and another creation operator from
the right moving sector. Therefore, they have two target space Lorentz indices and one can
decompose them according to irreducible representations of the corresponding rotation group.
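The counting behind this decomposition is short enough to spell out. The sketch below assumes the closed string mass formula $\alpha' M^2 = 4(N - 1)$ at equal left and right oscillator level $N$, which reproduces the tachyon value $-4$ quoted above, and splits a general two index tensor of $SO(n)$ into its symmetric traceless, antisymmetric and trace parts. The function names are illustrative.

```python
def closed_string_mass_squared(level):
    """alpha' M^2 for the closed bosonic string at equal left/right oscillator level N."""
    return 4 * (level - 1)

def decompose_two_index_tensor(n):
    """Dimensions of the SO(n) irreps in vector x vector:
    (symmetric traceless, antisymmetric, trace)."""
    symmetric_traceless = n * (n + 1) // 2 - 1
    antisymmetric = n * (n - 1) // 2
    return symmetric_traceless, antisymmetric, 1

graviton, two_form, dilaton = decompose_two_index_tensor(24)
```

For the 24 transverse directions of the bosonic string this gives 299 graviton components, 276 antisymmetric tensor components and one dilaton, adding up to the $24 \times 24 = 576$ states built from one left and one right oscillator.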





The introduction of worldsheet fermions has important consequences. In fact, if one demands worldsheet
superconformal symmetry, generalising from the bosonic string coordinates to include fermionic
degrees of freedom, then the resulting theory is the superstring. First we need to construct the two dimensional
supergravity action. One needs to add to the action (2.3) the action

(2.23) 

The notations are as follows [37]: $\psi^\mu$ are worldsheet two component Majorana fermions, $e^a_\alpha$ are
the zweibeins associated with the worldsheet metric, and $e$ is the determinant. $\lambda_a$ is the gravitino on the
worldsheet satisfying $\lambda_a^* = \lambda_a$. The gamma matrices on the worldsheet have the following representation:
$\gamma^0 = \sigma_2$, $\gamma^1 = i\sigma_1$ and $\gamma_5 = \gamma^0 \gamma^1 = \sigma_3$, $\sigma_i$ being the three Pauli matrices. We shall go
over to the superorthonormal gauge, where the worldsheet metric is the flat metric times a conformal
factor (mentioned already) and the gravitino is chosen to be $\lambda_a = \gamma_a \zeta$, where $\zeta$ is a constant Majorana
spinor. Then the action (2.23) takes a simple form and is expressed in terms of the Weyl Majorana
fermions (it is a free fermion theory now)

$$-\frac{T}{2} \int d\sigma\, d\tau\, \bigl[ i\psi_+ (\partial_\tau - \partial_\sigma) \psi_+ + i\psi_- (\partial_\tau + \partial_\sigma) \psi_- \bigr] \tag{2.24}$$

with the definition of the chiral fermions $\psi_+ = \frac{1}{2}(1 - \gamma_5)\psi$ and $\psi_- = \frac{1}{2}(1 + \gamma_5)\psi$; the spacetime
index is suppressed. Now it is evident that the fermion equations of motion separate
according to the chiralities, as is expected for massless fermions. The worldsheet supersymmetry
transformations are

$$\delta X^\mu = \bar\epsilon \psi^\mu \tag{2.25}$$

$$\delta\psi^\mu = -i\gamma^a \partial_a X^\mu \epsilon \tag{2.26}$$

for the two component Majorana fermions; $\epsilon$ is the fermionic parameter associated with the
supersymmetry transformation. The supercharge is the time component of the supercurrent integrated
over the $\sigma$ variable. The current is

(2.27) 
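The representation of the worldsheet gamma matrices quoted above, $\gamma^0 = \sigma_2$, $\gamma^1 = i\sigma_1$ and $\gamma_5 = \gamma^0\gamma^1 = \sigma_3$, can be checked by direct multiplication, along with the fact that $\frac{1}{2}(1 \mp \gamma_5)$ are complementary projectors onto the two chiralities. The sketch below uses plain nested lists for the $2 \times 2$ matrices; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Entrywise multiplication of a matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]

GAMMA_0 = SIGMA_2
GAMMA_1 = scale(1j, SIGMA_1)
GAMMA_5 = matmul(GAMMA_0, GAMMA_1)

P_PLUS = scale(0.5, add(IDENTITY, GAMMA_5, sign=-1))  # (1 - gamma_5)/2, selects psi_+
P_MINUS = scale(0.5, add(IDENTITY, GAMMA_5))          # (1 + gamma_5)/2, selects psi_-
```

In this representation $(\gamma^0)^2 = 1$, $(\gamma^1)^2 = -1$ and the two matrices anticommute, so that $\{\gamma^a, \gamma^b\} = -2\eta^{ab}$ for $\eta^{ab} = \mathrm{diag}(-1, +1)$.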

Next, one defines the super Virasoro generators, computes the quantum algebra and derives the
condition for the absence of the anomaly. In the case of the superstring the critical dimension is ten, in contrast
to the bosonic string where it was 26.
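Both critical dimensions can be recovered from a simple bookkeeping of anomaly contributions. The sketch below relies on standard conformal field theory values that are not derived in the text and should be read as assumptions here: each boson contributes 1, each worldsheet Majorana fermion 1/2, the reparametrization ghosts $-26$ and the superconformal ghosts $+11$. The function names are illustrative.

```python
from fractions import Fraction

# Standard central charge contributions (assumed, not derived in the text).
C_BOSON = Fraction(1)
C_MAJORANA_FERMION = Fraction(1, 2)
C_REPARAMETRIZATION_GHOSTS = Fraction(-26)
C_SUPERCONFORMAL_GHOSTS = Fraction(11)

def total_central_charge(dimension, supersymmetric):
    """Total anomaly coefficient of matter plus ghosts in `dimension` spacetime dimensions."""
    c = dimension * C_BOSON + C_REPARAMETRIZATION_GHOSTS
    if supersymmetric:
        c += dimension * C_MAJORANA_FERMION + C_SUPERCONFORMAL_GHOSTS
    return c

def critical_dimension(supersymmetric, max_dimension=30):
    """Smallest dimension in which the total anomaly vanishes, or None if there is none."""
    for d in range(1, max_dimension + 1):
        if total_central_charge(d, supersymmetric) == 0:
            return d
    return None
```

With these inputs, `critical_dimension(False)` gives 26 and `critical_dimension(True)` gives 10, since $\frac{3}{2} D - 26 + 11$ vanishes at $D = 10$.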

Now we shall consider a few points before discussing how spacetime supersymmetry multiplets
appear in the spectrum of the superstring. We had mentioned that the bosonic string has a tachyon
in its lowest level, which renders the theory unstable. Although the worldsheet supersymmetric
theory lives in ten dimensional spacetime, the super Virasoro algebra does not impose sufficient
constraints to remove the undesirable tachyon from the spectrum in general.

Notice from the fermionic equations of motion (we suppress the bosonic part momentarily to focus
attention on the fermions only) that there is some freedom in the choice of the boundary condition as
$\sigma$ goes over a period of $2\pi$. This is due to the fact that the action remains invariant under $\psi \to -\psi$
for fermions of either chirality. The boundary conditions are:

$$\psi(\sigma + 2\pi) = -\psi(\sigma) \tag{2.28}$$

known as the Neveu-Schwarz boundary condition, which is antiperiodic [38]. The periodic boundary condition

$$\psi(\sigma + 2\pi) = \psi(\sigma) \tag{2.29}$$

is the Ramond condition [39]; the indices are suppressed for notational convenience. The mode
expansion for, say, the holomorphic field is

$$\psi_+(\tau, \sigma) = \sum_n \psi_n e^{-in(\tau + \sigma)} \tag{2.30}$$

It is easy to see that for the Ramond boundary condition, $n$ must be an integer. When we impose the NS
(Neveu-Schwarz) boundary condition and expand the fermions in Fourier modes, then $n$ will take
half integer values. We note that the NS fermions have no zero modes, whereas the Ramond
fermions have zero modes in the Fourier expansions.
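The link between the boundary condition and the moding can be checked on a single Fourier mode. The sketch below assumes a mode of the form $e^{-in(\tau + \sigma)}$ and computes the factor it picks up when $\sigma$ advances by $2\pi$; the function names and the sample point are illustrative.

```python
import cmath
import math

def mode(n, sigma, tau=0.0):
    """Single Fourier mode exp(-i n (tau + sigma)) of a worldsheet fermion."""
    return cmath.exp(-1j * n * (tau + sigma))

def periodicity_sign(n, sigma=0.4):
    """Ratio psi(sigma + 2 pi) / psi(sigma) for a single mode with index n."""
    return mode(n, sigma + 2 * math.pi) / mode(n, sigma)

ramond_sign = periodicity_sign(3)              # integer n: periodic
neveu_schwarz_sign = periodicity_sign(0.5)     # half-integer n: antiperiodic
```

Integer $n$ returns a factor $+1$ (Ramond), while half-integer $n$ returns $-1$ (Neveu-Schwarz), which is why only the Ramond sector admits an $n = 0$ mode.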

Let us extend the arguments we used for the bosonic string to the superstring and examine
its spectrum. The aim is to get rid of the tachyon and to construct states using bosonic and
fermionic operators such that these states transform like fermions and the resulting theory is
endowed with spacetime supersymmetry. We shall consider the light-cone gauge so that the physical
degrees of freedom become transparent. In addition to the condition $X^+ = x^+ + \alpha' p^+ \tau$, one imposes the
constraint

$$\psi^+_+ = \psi^+_- = 0 \tag{2.31}$$

for the NS fermions; when we have Ramond fermions, they can be set to zero except for the zero
modes. Now we look at the superconformal constraints and solve for $X^-$ and $\psi^-$ in terms of the
rest of the coordinates. Thus we can use the (physical) transverse oscillators of both $X$ and $\psi$ to
construct the physical states, keeping in mind the presence of the appropriate zero modes. It follows
from a straightforward calculation that the ground state in the NS sector is a tachyon. The next
level, obtained by operating with $\psi^i_{-1/2}$, contains massless states. Thus we need to remove the tachyon as
well as some of the unwanted states while keeping the massless spectrum intact.
Notice that worldsheet fermions are anticommuting objects, although they create bosonic states
while operating on a state of the theory. This feature is not very desirable, as will be evident
from the following example. Let us consider a specific bosonic state of a superstring and then
operate on it with a worldsheet spinor $\psi^\mu$ obeying the NS boundary condition. The resulting state will
still be an integer spin object even though we have operated with an anticommuting operator; this is rather
unusual. The same situation prevails whenever an odd number of NS operators act on a bosonic state,
whereas for an even number of such operators
we face no problem, since an even number of anticommuting operators can be grouped to behave like
bosonic operators. If we demand that all the states be even under $(-1)^F$, then the half of the states which
had the above mentioned undesirable feature are removed, including the tachyon. This is the GSO
[40] projection. Moreover, after the unwanted states have been discarded from the spectrum, the
remaining states of the theory belong to representations of spacetime supersymmetry when we
consider the full spectrum of the superstring theory. Note that the operator $(-1)^F$ is defined up to
a sign ambiguity. If we choose the sign convention that the first excited state, which arises from the
action of $\psi^i_{-1/2}$ on the ground state, has $(-1)^F = +1$, then we can fix the $(-1)^F$ quantum number of
the rest of the states. In this sign convention, the tachyon carries quantum number $-1$. There is
another convention where the tachyon has quantum number $+1$ and then the massless, first excited
state carries quantum number $-1$. The fermion numbers $F_L$ and $F_R$ can be introduced separately
for the left and right moving sectors respectively. When one computes the supercharge algebra with the
Ramond condition, the zero modes of the fermions in the supercharge give an anomaly term besides the
$L_0$ term (that is, the Hamiltonian), and the anomaly vanishes for $D = 10$. Moreover, the anticommutation
relations of the R zero modes are like those of Dirac gamma matrices carrying target space indices. One
finds that massless states appear in the R-sector and they satisfy the Dirac equation. They transform
as ten dimensional spinors (S) or conjugate spinors (C). Since we are considering the left moving
sector here at the moment, S has eigenvalue $+1$ and C has eigenvalue $-1$ under $(-1)^{F_L}$. When
we construct other excited states on these states they turn out to be massive. In view of this, one
need not apply the GSO projection here: no tachyon is to be removed.
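The effect of the GSO projection on the NS sector can be illustrated with a toy level count. The sketch below assumes the standard NS mass formula, mass squared proportional to $N - \frac{1}{2}$ with levels $N = 0, \frac{1}{2}, 1, \ldots$ (the overall factor depends on conventions for $\alpha'$ and on open versus closed strings), and adopts the sign convention in which the massless level has $(-1)^F = +1$. It tracks levels only, not the degeneracy of states at each level, and the function names are illustrative.

```python
from fractions import Fraction

def ns_levels(max_level_twice=4):
    """NS-sector levels N = 0, 1/2, 1, ... as tuples (N, mass^2 in level units, (-1)^F)."""
    levels = []
    for twice_level in range(max_level_twice + 1):
        level = Fraction(twice_level, 2)
        mass_squared = level - Fraction(1, 2)
        # An odd number of psi oscillators gives half-integer N; in the chosen
        # convention those states carry (-1)^F = +1 and the others -1.
        fermion_parity = +1 if twice_level % 2 == 1 else -1
        levels.append((level, mass_squared, fermion_parity))
    return levels

def gso_project(levels):
    """Keep only the levels that are even under (-1)^F."""
    return [entry for entry in levels if entry[2] == +1]

all_levels = ns_levels()
kept = gso_project(all_levels)
```

The projection removes the $N = 0$ tachyon and every integer level, leaving the massless level $N = \frac{1}{2}$ as the lowest surviving one.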

When we combine the left and right moving sectors, four combinations appear in the description
of the closed string spectrum: NS-NS, NS-R, R-NS and R-R, where the first sector is from the
left movers and the second is from the right movers. Let us look at them
one by one.

(i) NS-NS: The states are created by the action of creation operators from the left and
the right moving sectors. They transform as tensors under 10-dimensional Lorentz transformations.
After the GSO projection is implemented, the lowest lying states are massless and they can be
decomposed into three groups, a symmetric traceless tensor, an antisymmetric tensor and a scalar, under the
rotations.

(ii) NS-R: The GSO projection, as discussed, is $(-1)^{F_L} = 1$, and one keeps the S representation of
the R sector here. The massless states consist of spacetime spinors.

(iii) R-NS: Here the GSO projection on the NS sector from the right side gives fermion number 1. We
have the choice of keeping the S spinor or the C spinor, and obviously the states are spinorial.

(iv) R-R: The fermionic operators act from both sides and, therefore, the resulting states will be
bosonic in character. The content depends on which combination we decide to keep. For example, if one keeps
S from the left side and C from the right side, the product decomposes into a vector and a three index
antisymmetric tensor (it has to be antisymmetric, since it arises from anticommuting objects). These belong
to the bosonic sector of type IIA theory. In the other combination, S from the left and S from the
right combine, and their decomposition is a scalar, a 2-form potential and a 4-form (antisymmetric)
potential whose field strength is self-dual in ten dimensions; these states form the bosonic sector
of type IIB theory.

We are in a position to classify string theories according to their important characteristics. There
are two 10-dimensional theories which have N = 2 supersymmetry in target space. Their massless
bosonic sectors are as follows: type IIA has the graviton, the antisymmetric tensor and the dilaton,
$\phi$, coming from the NS-NS sector and a gauge potential and a three-index antisymmetric tensor
potential coming from the R-R sector; type IIB has the same NS-NS fields together with the scalar,
2-form and self-dual 4-form potentials from the R-R sector described above. These two theories have
32 generators of supersymmetry; type IIA is called a non-chiral theory whereas type IIB is known as
a chiral theory. Although the bosonic fields coming from the R-R sectors in these two string theories
are tensors of different ranks, the total number of degrees of freedom of these tensors is the same
in the two theories [41]; this can be checked by counting the physical degrees of freedom of the R-R gauge
fields of type IIA and IIB.
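
The counting mentioned above can be sketched in a few lines. This is only an illustration under the usual light-cone assumption (not spelled out in the text) that a massless p-form potential in ten dimensions has C(8, p) physical components, halved when its field strength is self-dual; the function and variable names are ours.

```python
from math import comb

TRANSVERSE = 8  # transverse (light-cone) directions for a massless field in D = 10


def form_dof(p, self_dual=False):
    """Physical components of a massless p-form potential in D = 10.

    Assumption: C(8, p) transverse components, halved for a self-dual
    field strength.
    """
    n = comb(TRANSVERSE, p)
    return n // 2 if self_dual else n


# Type IIA R-R potentials: a vector (1-form) and a 3-form.
iia_rr = form_dof(1) + form_dof(3)

# Type IIB R-R potentials: a scalar (0-form), a 2-form and a self-dual 4-form.
iib_rr = form_dof(0) + form_dof(2) + form_dof(4, self_dual=True)

# NS-NS sector common to both: graviton (symmetric traceless), 2-form, dilaton.
graviton = TRANSVERSE * (TRANSVERSE + 1) // 2 - 1
ns_ns = graviton + form_dof(2) + 1

print(iia_rr, iib_rr, ns_ns)
```

Both R-R sectors come out with 8 + 56 = 1 + 28 + 35 = 64 components, the same number as the NS-NS sector.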

Next, we introduce the heterotic string, which is very attractive when one tries to establish a
connection between string theory and the gauge groups of the standard model. The heterotic string, in
ten dimensions, contains the N = 1 supergravity multiplet and a super Yang-Mills gauge theory along with
chiral fermions. There are two possible choices for the gauge group in
the construction of the heterotic string: SO(32) or $E_8 \times E_8$. With either choice, heterotic string
theory fulfills the Green-Schwarz anomaly cancellation condition. Moreover, when the theory is compactified
to four dimensions on a Calabi-Yau manifold, the resulting theory has many features of the standard model and the
gauge group $SU(3) \times SU(2) \times U(1)$ can be embedded in the 4-dimensional theory. Let us briefly
discuss how the heterotic string is constructed. We discussed the closed bosonic string and noted
that the string coordinates can be decomposed into left movers and right movers and each can be
expanded in Fourier modes. Moreover, the Virasoro generators also separate into two groups:
one group is expressed in terms of oscillators of one kind only (say left movers) and the other group
of generators is expressed in terms of the oscillators of the other kind (right movers). When one
computes the quantum algebra, the anomaly-free condition is imposed on each group of Virasoro
generators. In the case of a closed string with worldsheet supersymmetry the same situation appears,
because the fermion equations of motion also separate into equations of motion for left and right moving
Majorana-Weyl fermions. If we are interested in constructing a string theory which satisfies the
requirements of conformal invariance, we can therefore have a left moving closed bosonic string and a right
moving superstring. The former will satisfy the Virasoro algebra and the latter the super Virasoro algebra.
The triumph of the heterotic string is that, when we look at the massless spectrum of the theory, it
has N = 1 spacetime supersymmetry and contains the appropriate gauge groups (SO(32) or $E_8 \times E_8$),
as is required for consistency by the Green-Schwarz anomaly cancellation condition. In the construction,
the closed bosonic string has 16 of its spatial coordinates compactified so that those coordinates
themselves are periodic. Furthermore, using the standard techniques of 1 + 1 dimensional field
theory, the compact bosonic coordinates can be fermionised to give 32 Majorana-Weyl fermions
which are left movers. Thus, we have 10 bosonic coordinates and their 10 superpartners (in
light-cone gauge 8 bosons and 8 fermions) in the right moving sector and 10 bosonic coordinates
and 32 fermions (from the compact coordinates) in the left moving sector. Whenever we adopt NS
boundary conditions for these fermions arising out of compactification, a tachyon will appear in the
spectrum. Of course, by introducing the GSO projection on the right moving sector we shall have
spacetime supersymmetry. So far as the right moving part is concerned, bosonic states come from
states with NS boundary conditions and fermions arise due to the Ramond boundary conditions.
The choice of boundary conditions on the left moving fermions (coming from the compact directions)
gives rise to two different gauge groups. (i) All the left moving fermions can satisfy R-type
(periodic) boundary conditions or all of them can satisfy NS-type (antiperiodic) boundary conditions. Then there is
a GSO condition which ensures that only states with an even number of these fermions are kept
(with only one type of boundary condition at a time). Thus the massless bosonic spectrum is given by a symmetric
second rank tensor field, an antisymmetric tensor field and a scalar together with 496 gauge bosons
belonging to the adjoint representation of SO(32). (ii) The second possible choice of boundary
conditions for the left moving fermions is to divide them into two groups of 16 fermions each. Now
there are four choices of boundary conditions: (a) antiperiodic (NS) boundary conditions are imposed on both groups,
(b) periodic (R) boundary conditions are imposed on both groups, (c) all the fermions in the first group (call it I)
have R boundary conditions and group II has NS antiperiodicity and finally (d) group I has
NS boundary conditions and group II has R boundary conditions. The GSO projection is such that it keeps an even number
of fermions from each group in the spectrum in every sector. When one works out the bosonic
spectrum, it again contains the second rank symmetric tensor, the antisymmetric tensor of rank two, the
scalar dilaton and 496 gauge bosons in the adjoint representation of $E_8 \times E_8$.
There is another superstring theory, known as type I. A simple way to describe the type I string is
from the perspective of IIB theory. Consider the parity operation P on the worldsheet such that
the 'spatial' coordinate $\sigma \to -\sigma$ under P. In type IIB theory, P exchanges the left and right moving
sectors. Now, if we demand that we retain only those states which are invariant under P, we get
the type I string. In the NS-NS sector, the graviton and the dilaton survive; the antisymmetric tensor
is removed. From the R-R sector, the only surviving field is the second rank antisymmetric tensor.
Moreover, there are Majorana-Weyl fermions and a gravitino surviving the operation, giving rise
to the N = 1 supergravity multiplet. The open string states are also included in the type I spectrum. In
this case, the worldsheet degrees of freedom are the same as in the closed string case. One imposes
Neumann boundary conditions on the bosonic coordinates and suitable boundary conditions on the
worldsheet fermions. The gauge group that can be attached to the open string is SO(32) and thus
there is a corresponding super Yang-Mills theory besides the states we mentioned above.
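
The numbers quoted above for the heterotic and type I strings can be cross-checked with elementary arithmetic. The sketch below is illustrative only: the dimension 248 of $E_8$ and the 128-component spinor of SO(16) are standard group-theory facts that the text does not derive, and the central-charge counting (1 per boson, 1/2 per Majorana-Weyl fermion) is the usual two-dimensional rule.

```python
def dim_so(n):
    """Number of generators of SO(n): independent antisymmetric n x n matrices."""
    return n * (n - 1) // 2


DIM_E8 = 248  # dimension of E8, quoted as a standard fact (not derived here)

so32_gauge_bosons = dim_so(32)
e8xe8_gauge_bosons = 2 * DIM_E8

# Each group of 16 fermions carries an SO(16) symmetry; its adjoint together
# with one 128-component SO(16) spinor has the dimension of E8.
so16_adjoint_plus_spinor = dim_so(16) + 2 ** (16 // 2 - 1)

# Central charges: a boson contributes 1, a Majorana-Weyl fermion 1/2.
c_left = 10 * 1 + 32 * 0.5    # 10 bosons + 32 fermions (bosonic-string side)
c_right = 10 * 1 + 10 * 0.5   # 10 bosons + 10 superpartners (superstring side)

print(so32_gauge_bosons, e8xe8_gauge_bosons, so16_adjoint_plus_spinor, c_left, c_right)
```

Both gauge groups give 496 gauge bosons, and the two sides carry central charges 26 and 15, the values needed to cancel the conformal anomaly of the bosonic string and the superstring respectively.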

Thus there are five perturbatively consistent string theories in ten dimensions. The scattering of particles belonging
to the spectrum of a string theory can be described by introducing vertex operators [43]. They are
required to satisfy constraints due to conformal or superconformal transformations. They must
transform as representations of the Lorentz group, like a wave function. In the first quantized framework,
one can calculate the scattering of these particles in a well defined perturbation theory. It is
one of the great virtues of the superstring theories that all these calculations are ultraviolet finite.

One of the most efficient ways to study properties of string theory is to investigate the evolution 
of a string in the background of its massless excitations and then explore the consequences of 
conformal invariance for such a situation. Let us consider closed bosonic string in the background 
of its massless excitations such as graviton, antisymmetric tensor and dilaton. The action (2.3) 
generalizes to 

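$$S = \frac{1}{4\pi\alpha'} \int d^2\sigma \left[ \sqrt{\gamma}\, \gamma^{ab}\, G_{\mu\nu}(X)\, \partial_a X^\mu \partial_b X^\nu + \epsilon^{ab}\, B_{\mu\nu}(X)\, \partial_a X^\mu \partial_b X^\nu + \frac{\alpha'}{2} \sqrt{\gamma}\, R^{(2)}\, \phi(X) \right] \qquad (2.32)$$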

Here $R^{(2)}$ is the scalar curvature of the worldsheet computed with $\gamma_{ab}$. The first two terms show
the couplings of $G_{\mu\nu}$ and $B_{\mu\nu}$ to the string coordinates. In closed string theory there is a massless
state which transforms as a symmetric second rank tensor and is identified as the graviton, and there
is an antisymmetric massless second rank tensor state. The above action describes the motion of the
string in the background of these massless states, $G_{\mu\nu}$ and $B_{\mu\nu}$; the last term is the coupling of
the string to the massless scalar, the dilaton. This is an action for a two dimensional $\sigma$-model
and we can interpret $G_{\mu\nu}$ and $B_{\mu\nu}$ as playing the role of coupling constants. At the classical
level the dilaton coupling breaks the conformal invariance explicitly. However, it is important to
explore the consequences of the quantum invariance, as we have seen that the quantum invariance
principle imposes strong constraints on the theory. There is a well defined procedure to compute
the conformal anomaly for such theories [44]. One of the ways to ensure conformal invariance of
the quantum theory is to demand that the two dimensional energy momentum stress tensor has
vanishing trace. As is well known, the conformal anomaly is related to the corresponding $\beta$-functions
of the theory. Thus, vanishing of the $\beta$-functions will ensure conformal invariance. Moreover, the
$\beta$-functions can be computed order by order in the $\sigma$-model perturbation theory, $\alpha'$ being the
expansion parameter. The relevant $\beta$-functions are:

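$$\frac{\beta^G_{\mu\nu}}{\alpha'} = R_{\mu\nu} - \frac{1}{4} H_\mu{}^{\lambda\rho} H_{\nu\lambda\rho} + \nabla_\mu \nabla_\nu \phi \qquad (2.33)$$

$$\frac{\beta^B_{\mu\nu}}{\alpha'} = \nabla_\lambda H^\lambda{}_{\mu\nu} - (\nabla_\lambda \phi)\, H^\lambda{}_{\mu\nu} \qquad (2.34)$$

$$\beta^\phi = \Lambda + 3\alpha' \left[ (\nabla\phi)^2 - 2\nabla_\mu\nabla^\mu \phi - R + \frac{1}{12} H_{\mu\nu\rho} H^{\mu\nu\rho} \right] \qquad (2.35)$$

The notations are as follows: $R_{\mu\nu}$ is the Ricci tensor for the target space computed from the
string frame metric $G_{\mu\nu}$; $\Lambda = D - 26$ or $D - 10$ depending on whether we are dealing with a
pure bosonic string or a superstring (if we deal with the superstring, the coupling of worldsheet fermions
to the background has to be taken into account), D being the spacetime dimension. $H_{\mu\nu\rho} =
\partial_\mu B_{\nu\rho}$ + cyclic permutations is the field strength of the two-form potential $B_{\mu\nu}$. It might be worthwhile to point
out that for a constant value of the dilaton the last term in (2.32) is proportional to the Euler character of the
surface. When we write the path integral with the action, we see that the factor $e^{-\phi_0 \chi / 2}$ comes
out, where $\chi$ is the Euler character and $\phi_0$ is the constant value of the dilaton. In this light the
string coupling constant is defined as

$$g_{\rm str} = e^{\phi_0 / 2} \qquad (2.36)$$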

Let us look for an action in the target space such that the variation of that action with respect
to the backgrounds $G_{\mu\nu}$, $B_{\mu\nu}$ and $\phi$ would reproduce the $\beta$-function equations we have obtained
earlier. We also know that these $\beta$-functions must vanish (to the order in $\alpha'$ to which they are computed)
in order to respect conformal invariance of the theory.

The resulting action is

$$S = \int d^D x \sqrt{-G}\, e^{-\phi} \left[ R + (\partial\phi)^2 - \frac{1}{12} H_{\mu\nu\rho} H^{\mu\nu\rho} \right] \qquad (2.37)$$

This action is called the tree level string effective action. Solutions of the equations of motion of
this action (the same as solutions of the $\beta$-function equations) correspond to admissible background
configurations which respect conformal invariance. In other words, every solution is an acceptable vacuum
of the string theory to lowest order in $\alpha'$, since the effective action is obtained from the $\beta$-function
equations keeping only the lowest order terms in $\sigma$-model perturbation theory. Therefore, if we find
solutions which correspond to a cosmological situation with given G, B and $\phi$, or a black hole
solution, or a wormhole solution, all these types of geometries with the appropriate matter content,
consistent with the equations of motion, can be interpreted as string vacuum backgrounds.

So far we have been discussing the quantization of string theories and examining the consequences
of conformal invariance. Note that all the consistent superstring theories are defined in a spacetime dimension
higher than four, i.e. D = 10. Therefore, one must answer the question of what these theories
have to do with the spacetime where we live. This issue was taken up by Kaluza and Klein
more than seven decades ago. The basic idea is rather simple. In order to construct a unified theory
of gravity and electrodynamics, they considered an Einstein-Hilbert type action in 5 spacetime
dimensions which is invariant under general coordinate transformations in five dimensions. Let us
imagine that one of the dimensions, the 5th one, is a circle of radius so small that it cannot
be probed today, because no available particle has a de Broglie wavelength comparable to the size of that
circle. Then we shall not be aware of this scale. Let us assume, to a first approximation, that the
metric does not depend on the 5th coordinate. Kaluza and Klein showed that the resulting theory
looks like Einstein theory and Maxwell theory in four dimensions. What was general coordinate
invariance in the 5-dimensional theory turned out to be general coordinate invariance and Abelian
gauge invariance (of Maxwell theory) in four dimensions. Although the original Kaluza-Klein
proposal had many shortcomings, the idea is very relevant for the construction of four dimensional
theories starting from the 10-dimensional string theories in the present context. We shall explore
this aspect and we shall see how duality symmetries arise for compactified string theories. We
shall set T = 1 from now on; whenever we need to introduce the slope parameter/tension,
we shall explicitly mention it in that context.
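
The Kaluza-Klein statement that a five dimensional coordinate transformation becomes an Abelian gauge transformation can be made concrete with a short worked example. The ansatz below is the standard textbook one with the scalar field frozen to a constant; the symbols $g_{\mu\nu}$, $A_\mu$, $y$ and $\lambda$ are our notation and are not taken from the text.

```latex
% Five dimensional line element with the fifth coordinate y on a circle of radius R:
ds_5^2 = g_{\mu\nu}(x)\, dx^\mu dx^\nu + \bigl( dy + A_\mu(x)\, dx^\mu \bigr)^2 ,
\qquad y \simeq y + 2\pi R .

% An x-dependent shift of the circle coordinate, y = y' + \lambda(x), gives
dy + A_\mu\, dx^\mu = dy' + \bigl( A_\mu + \partial_\mu \lambda \bigr)\, dx^\mu ,

% so the line element keeps its form provided the vector field transforms as
A_\mu \;\to\; A'_\mu = A_\mu + \partial_\mu \lambda ,
\qquad g_{\mu\nu} \;\to\; g_{\mu\nu} .
```

The off-diagonal components of the five dimensional metric therefore behave as a Maxwell potential from the four dimensional point of view.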





3 Duality Symmetries in String Theory 

One of the marvels of string theory is its rich symmetry structure. We have noticed how
conformal invariance imposes strong constraints on the theory: when we consider flat spacetime the
dimensionality is fixed by this symmetry. On the other hand, if we consider strings in backgrounds,
we get the equations of motion for the backgrounds by demanding that the corresponding $\beta$-functions
vanish. Moreover, there are local symmetries: invariance under general coordinate
transformations due to the presence of the graviton, and an Abelian gauge symmetry since the
antisymmetric tensor is also a part of the massless multiplet.

The duality symmetries play a crucial role in understanding various features of string theory. Since
the string is an extended object, there are symmetries special to string theory. Consider a particle
moving on a circular path: the momentum is quantized in units of the inverse
radius in order that the wave function remains single valued. A string,
one of whose coordinates has the geometry of a circle, offers more interesting possibilities. In fact a
string theory with one spatial direction compactified on a circle of radius R cannot be distinguished
from another theory whose coordinate is compactified on a circle of radius 1/R. Let the compactified
coordinate be denoted by $Y(\sigma, \tau)$ with the periodicity condition


$$Y(\sigma, \tau) + 2\pi R \simeq Y(\sigma, \tau) \qquad (3.1)$$


Furthermore, the string coordinate is also periodic when $\sigma$ goes over $2\pi$ for the closed string. Since
the coordinate is compact, the momentum zero mode must be quantized to maintain single valuedness
of the wave function, just as in field theory. In addition, the string can wind
around the compact direction. It will cost more energy if the string winds m times, because
it will have to stretch more. Therefore, the effect of windings has to be taken into account too
while estimating energy levels [45]. Thus the mode expansions for the left and right moving sectors
are:

$$Y_R = y_R + \frac{1}{\sqrt{2}}\, p_R\, (\tau - \sigma) + \text{oscillators} \qquad (3.2)$$

$$Y_L = y_L + \frac{1}{\sqrt{2}}\, p_L\, (\tau + \sigma) + \text{oscillators} \qquad (3.3)$$

The momentum zero modes $p_R$ and $p_L$ will have the following form to be consistent with what we
said earlier:

$$p_R = \frac{1}{\sqrt{2}} \left( \frac{n}{R} - R m \right), \quad \text{and} \quad p_L = \frac{1}{\sqrt{2}} \left( \frac{n}{R} + R m \right) \qquad (3.4)$$

The above equation states that in general the contribution of the Kaluza-Klein mode is $1/R$ times
an integer and that of the winding mode is an integer times the radius. The total momentum is just
$P = \frac{1}{\sqrt{2}} (p_R + p_L)$, which is the integral of the momentum density over $\sigma$. The total Hamiltonian is

$$H = L_0 + \bar{L}_0 = \frac{1}{2} \left( p_L^2 + p_R^2 \right) + \text{oscillators} \qquad (3.5)$$
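
As a quick numerical illustration of the $R \leftrightarrow 1/R$ statement, the sketch below evaluates the zero-mode part of (3.5), which with the momenta (3.4) is $\frac{1}{2}(n^2/R^2 + m^2 R^2)$, and compares the levels at radius R and 1/R. The radius value, the cutoff on the quantum numbers and the function names are arbitrary choices made for this example; the oscillator part is not included.

```python
from fractions import Fraction


def zero_mode_energy(n, m, R):
    """Zero-mode part of H = (p_L^2 + p_R^2)/2 for momentum n and winding m.

    With p_{L,R} = (n/R +/- R m)/sqrt(2) this equals (n^2/R^2 + m^2 R^2)/2,
    evaluated exactly for rational R.
    """
    return (Fraction(n) ** 2 / R ** 2 + Fraction(m) ** 2 * R ** 2) / 2


def spectrum(R, cutoff=4):
    """Sorted zero-mode energies for |n|, |m| <= cutoff."""
    rng = range(-cutoff, cutoff + 1)
    return sorted(zero_mode_energy(n, m, R) for n in rng for m in rng)


R = Fraction(3, 2)  # illustrative radius

# Term by term: exchanging n <-> m together with R -> 1/R leaves each level unchanged.
termwise_equal = all(
    zero_mode_energy(n, m, R) == zero_mode_energy(m, n, 1 / R)
    for n in range(-4, 5)
    for m in range(-4, 5)
)

# Hence the two lists of levels coincide.
spectra_equal = spectrum(R) == spectrum(1 / R)

print(termwise_equal, spectra_equal)
```

A pure momentum state at radius R is mapped to a pure winding state at radius 1/R with the same energy, which is why the two compactifications cannot be told apart from the spectrum.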


Now we consider the general case of toroidal compactification and present the derivation as was
done in reference [49]. Let $G_{\alpha\beta}$ and $B_{\alpha\beta}$ be constant backgrounds, $\alpha, \beta = 1, \ldots, d$, and let $Y^\alpha(\sigma, \tau)$ be
the string coordinates. The two-dimensional $\sigma$-model action containing these coordinates is

$$S_{\rm compact} = \frac{1}{2} \int d^2\sigma \left[ G_{\alpha\beta}\, \eta^{ab}\, \partial_a Y^\alpha \partial_b Y^\beta + \epsilon^{ab} B_{\alpha\beta}\, \partial_a Y^\alpha \partial_b Y^\beta \right] \qquad (3.6)$$

The coordinates are taken to satisfy the periodicity
conditions $Y^\alpha \simeq Y^\alpha + 2\pi$. Here we take the compactification radius to be unity for simplicity in
calculations. For closed strings it is necessary that

$$Y^\alpha(2\pi, \tau) = Y^\alpha(0, \tau) + 2\pi m^\alpha \qquad (3.7)$$

where the integers $m^\alpha$ are called winding numbers. It follows from the single-valuedness of the wave
function on the torus that the zero modes of the canonical momentum, $P_\alpha = G_{\alpha\beta}\, \partial_\tau Y^\beta + B_{\alpha\beta}\, \partial_\sigma Y^\beta$,
are also integers $n_\alpha$. Therefore the zero modes of $Y^\alpha$ are given by

$$Y^\alpha_0 = y^\alpha + m^\alpha \sigma + G^{\alpha\beta} (n_\beta - B_{\beta\gamma} m^\gamma)\, \tau \qquad (3.8)$$

where $G^{\alpha\beta}$ is the inverse of $G_{\alpha\beta}$. The Hamiltonian is given by

$$H = \frac{1}{2} G_{\alpha\beta} \left( \dot{Y}^\alpha \dot{Y}^\beta + Y'^\alpha Y'^\beta \right) \qquad (3.9)$$

where $\dot{Y}^\alpha$ and $Y'^\alpha$ are derivatives with respect to $\tau$ and $\sigma$, respectively. Let us elaborate a little
on the significance of what we have done with respect to the compact coordinates. Since
the coordinates $Y^\alpha$ are compact, they satisfy eq. (3.7). Moreover, these coordinates can be
expanded as usual in terms of their zero modes and the oscillators. However, for the discussion of
T-duality, we focus our attention on the zero mode parts and the contribution of these parts to
the Hamiltonian given above.

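Since $Y^\alpha(\sigma, \tau)$ satisfies the free wave equation, we can decompose it as the sum of left- and
right-moving pieces. The zero mode of $P^\alpha = G^{\alpha\beta} P_\beta$ is given by $p_L^\alpha + p_R^\alpha$, where

$$p_L^\alpha = \frac{1}{2} \left[ m^\alpha + G^{\alpha\beta} (n_\beta - B_{\beta\gamma} m^\gamma) \right] \qquad (3.10)$$

$$p_R^\alpha = \frac{1}{2} \left[ -m^\alpha + G^{\alpha\beta} (n_\beta - B_{\beta\gamma} m^\gamma) \right] \qquad (3.11)$$

The mass-squared operator, which corresponds to the zero mode of $H$, is given (aside from a
constant) by

$$({\rm mass})^2 = G_{\alpha\beta} \left( p_L^\alpha p_L^\beta + p_R^\alpha p_R^\beta \right) + \sum_{m=1}^{\infty} \sum_{i=1}^{d} \left( \alpha^i_{-m} \alpha^i_m + \bar\alpha^i_{-m} \bar\alpha^i_m \right) \qquad (3.12)$$

As usual, $\{\alpha^i_m\}$ and $\{\bar\alpha^i_m\}$ denote oscillators associated with right- and left-moving coordinates,
respectively. Substituting the expressions for $p_L$ and $p_R$, the mass squared can be rewritten as

$$({\rm mass})^2 = \frac{1}{2} G_{\alpha\beta}\, m^\alpha m^\beta + \frac{1}{2} G^{\alpha\beta} (n_\alpha - B_{\alpha\gamma} m^\gamma)(n_\beta - B_{\beta\delta} m^\delta) + \sum_{m=1}^{\infty} \sum_{i=1}^{d} \left( \alpha^i_{-m} \alpha^i_m + \bar\alpha^i_{-m} \bar\alpha^i_m \right) \qquad (3.13)$$

It is significant that the zero mode portion of (3.13) can be expressed in the form

$$(M_0)^2 = \frac{1}{2} (m \;\; n)\, M^{-1} \begin{pmatrix} m \\ n \end{pmatrix} , \qquad (3.14)$$

where M is the 2d x 2d symmetric matrix expressed in terms of the constant backgrounds G and B:

$$M = \begin{pmatrix} G^{-1} & -G^{-1} B \\ B G^{-1} & G - B G^{-1} B \end{pmatrix} \qquad (3.15)$$

In order to satisfy $\sigma$-translation symmetry, the contributions of the left- and right-moving sectors to
the mass squared must agree: $L_0 = \bar{L}_0$. The zero mode contribution to their difference is

$$G_{\alpha\beta} \left( p_L^\alpha p_L^\beta - p_R^\alpha p_R^\beta \right) = m^\alpha n_\alpha \qquad (3.16)$$

Since this is an integer, it can always be compensated by oscillator contributions, which are also
integers.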
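
The algebra leading from (3.10)-(3.13) to (3.14)-(3.16) can be checked with exact rational arithmetic. The sketch below does this for d = 2 with arbitrarily chosen backgrounds G and B (our illustrative values, not from the text); it uses the candidate inverse $M^{-1} = \eta M \eta$, with $\eta$ the matrix of off-diagonal unit blocks, and verifies that candidate explicitly before using it.

```python
from fractions import Fraction as F


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inv2(G):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def build_M(G, B):
    """M of eq. (3.15) as a 4 x 4 matrix assembled from 2 x 2 blocks."""
    Gi = inv2(G)
    GiB, BGi = matmul(Gi, B), matmul(B, Gi)
    BGiB = matmul(B, GiB)
    top = [Gi[i] + [-x for x in GiB[i]] for i in range(2)]
    bottom = [BGi[i] + [G[i][j] - BGiB[i][j] for j in range(2)] for i in range(2)]
    return top + bottom


def quad(v, A, w):
    """The bilinear form v^T A w."""
    return sum(v[i] * A[i][j] * w[j] for i in range(len(v)) for j in range(len(w)))


# Illustrative backgrounds: symmetric positive G, antisymmetric B.
G = [[F(2), F(1, 2)], [F(1, 2), F(3)]]
B = [[F(0), F(1, 3)], [F(-1, 3), F(0)]]
ETA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

M = build_M(G, B)
M_inv = matmul(matmul(ETA, M), ETA)  # candidate inverse eta M eta
identity = [[F(int(i == j)) for j in range(4)] for i in range(4)]
inverse_ok = matmul(M, M_inv) == identity


def check(m, n):
    """Compare (3.13) with (3.14), and both sides of (3.16), for given integers."""
    Gi = inv2(G)
    k = [n[a] - sum(B[a][c] * m[c] for c in range(2)) for a in range(2)]  # n - B m
    mass_313 = quad(m, G, m) / 2 + quad(k, Gi, k) / 2   # zero modes of (3.13)
    v = list(m) + list(n)
    mass_314 = quad(v, M_inv, v) / 2                    # eq. (3.14)
    p = [sum(Gi[a][b] * k[b] for b in range(2)) for a in range(2)]
    pL = [(m[a] + p[a]) / 2 for a in range(2)]          # eq. (3.10)
    pR = [(-m[a] + p[a]) / 2 for a in range(2)]         # eq. (3.11)
    diff = quad(pL, G, pL) - quad(pR, G, pR)            # left side of (3.16)
    return mass_313 == mass_314 and diff == m[0] * n[0] + m[1] * n[1]


all_ok = all(check((a, b), (c, e)) for a in range(-2, 3) for b in range(-2, 3)
             for c in range(-2, 3) for e in range(-2, 3))
print(inverse_ok, all_ok)
```

Exact fractions are used so that the comparisons are identities rather than floating-point approximations.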

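Equation (3.16) is invariant under interchange of the winding numbers $m^\alpha$ and the discrete
momenta $n_\alpha$. Indeed, the entire spectrum remains invariant if we interchange $m^\alpha \leftrightarrow n_\alpha$ and simultaneously
let [46]

$$(G - B G^{-1} B) \leftrightarrow G^{-1} \quad \text{and} \quad B G^{-1} \leftrightarrow -G^{-1} B$$

These interchanges precisely correspond to inverting the 2d x 2d matrix M. This is the
duality transformation generalizing the well-known duality $R \leftrightarrow 1/R$ in the d = 1 case discussed
earlier. The general duality symmetry implies that the 2d-dimensional Lorentzian lattice spanned
by the vectors $\sqrt{2}(p_L^\alpha, p_R^\alpha)$ with inner product

$$\sqrt{2}(p_L, p_R) \cdot \sqrt{2}(p'_L, p'_R) = 2 G_{\alpha\beta} \left( p_L^\alpha p_L'^\beta - p_R^\alpha p_R'^\beta \right) = m^\alpha n'_\alpha + m'^\alpha n_\alpha$$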

is even and self-dual [47]. For toroidally compactified string theory, the coordinates satisfy a periodicity
condition and the conjugate momenta belong to the dual space and are quantized in
suitable units. Furthermore, one can define a corresponding metric to introduce the
coordinates and their dual momentum vectors and define an inner product also. If, for such
lattices, the space of the coordinates (since the coordinates satisfy periodicity conditions, as in
crystals) is the same as the dual space, then the lattice is called self-dual. Of special interest
are the spaces where the length of every vector is even (with the above definition of the norm); then
we have an even self-dual lattice. These types of lattices are very important in constructing string
theories with nonabelian gauge groups and in satisfying consistency requirements.

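The moduli space parametrized by $G_{\alpha\beta}$ and $B_{\alpha\beta}$ is locally the coset $O(d,d)/O(d) \times O(d)$.
The global geometry requires also modding out the group of discrete symmetries generated by
$B_{\alpha\beta} \to B_{\alpha\beta} + N_{\alpha\beta}$ and $G + B \to (G + B)^{-1}$. These symmetries generate the O(d,d,Z) subgroup
of O(d,d). An O(d,d,Z) transformation is given by a 2d x 2d matrix A having integral entries
and satisfying $A^T \eta A = \eta$, where $\eta$ consists of off-diagonal unit matrices. Under an
O(d,d,Z) transformation

$$\begin{pmatrix} m \\ n \end{pmatrix} \to A \begin{pmatrix} m \\ n \end{pmatrix}, \qquad M \to A M A^T .$$

It is evident that

$$m \cdot n = \frac{1}{2} (m \;\; n)\, \eta \begin{pmatrix} m \\ n \end{pmatrix} ,$$

which appears in eq. (3.16), and $M_0^2$ in eq. (3.13) are preserved under these transformations. Note
that $\eta$ is a symmetric 2d x 2d matrix whose off-diagonal blocks are d-dimensional unit matrices.
The crucial fact, already evident from the spectrum, is that toroidally compactified string theory
certainly does not share the full O(d,d) symmetry of the low energy effective theory. It is only
invariant under the discrete O(d,d,Z) subgroup.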
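
The two generators named above can be tested explicitly. The sketch below, again for d = 2 with illustrative rational backgrounds of our own choosing, realises the shift $B \to B + N$ by a block matrix A (our construction, with N in the lower-left block) and the inversion $G + B \to (G + B)^{-1}$ by $A = \eta$, and checks that M transforms as $A M A^T$ and that the zero-mode mass is unchanged when (m, n) is transformed along with M.

```python
from fractions import Fraction as F


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inv2(G):
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def build_M(G, B):
    """M of eq. (3.15) for d = 2."""
    Gi = inv2(G)
    GiB, BGi = matmul(Gi, B), matmul(B, Gi)
    BGiB = matmul(B, GiB)
    top = [Gi[i] + [-x for x in GiB[i]] for i in range(2)]
    bottom = [BGi[i] + [G[i][j] - BGiB[i][j] for j in range(2)] for i in range(2)]
    return top + bottom


ETA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]


def mass0_sq(M, v):
    """(M_0)^2 = (1/2) v^T M^{-1} v, using M^{-1} = eta M eta."""
    Minv = matmul(matmul(ETA, M), ETA)
    return sum(v[i] * Minv[i][j] * v[j] for i in range(4) for j in range(4)) / 2


G = [[F(2), F(1, 2)], [F(1, 2), F(3)]]       # illustrative values
B = [[F(0), F(1, 3)], [F(-1, 3), F(0)]]
N = [[0, 5], [-5, 0]]                         # integer antisymmetric shift
M = build_M(G, B)

# Generator 1: B -> B + N, realised by A = [[1, 0], [N, 1]] in d x d blocks.
A = [[1, 0, 0, 0], [0, 1, 0, 0], [N[0][0], N[0][1], 1, 0], [N[1][0], N[1][1], 0, 1]]
preserves_eta = matmul(matmul(transpose(A), ETA), A) == ETA   # A^T eta A = eta
B_shifted = [[B[i][j] + N[i][j] for j in range(2)] for i in range(2)]
shift_ok = build_M(G, B_shifted) == matmul(matmul(A, M), transpose(A))

# Generator 2: G + B -> (G + B)^{-1}, realised by A = eta.
E_inv = inv2([[G[i][j] + B[i][j] for j in range(2)] for i in range(2)])
G2 = [[(E_inv[i][j] + E_inv[j][i]) / 2 for j in range(2)] for i in range(2)]
B2 = [[(E_inv[i][j] - E_inv[j][i]) / 2 for j in range(2)] for i in range(2)]
inversion_ok = build_M(G2, B2) == matmul(matmul(ETA, M), ETA)

# The zero-mode mass is invariant when (m, n) transforms along with M.
v = [1, -2, 3, 0]                             # (m^1, m^2, n_1, n_2)
mass_ok = True
for T in (A, ETA):
    v_new = [sum(T[i][j] * v[j] for j in range(4)) for i in range(4)]
    M_new = matmul(matmul(T, M), transpose(T))
    mass_ok = mass_ok and mass0_sq(M_new, v_new) == mass0_sq(M, v)

print(preserves_eta, shift_ok, inversion_ok, mass_ok)
```

The inversion generator exchanges the winding and momentum entries of v, which is the matrix form of the interchange $m^\alpha \leftrightarrow n_\alpha$ discussed earlier.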

So far, in discussing compactifications, we have considered situations where all the coordinates
are compact. However, one can envisage the scenario where some of the coordinates
are compactified and the rest are noncompact. Furthermore, we treated the backgrounds as
constant; however, in more realistic situations the backgrounds should depend on the
noncompact coordinates. This is the more interesting situation when we consider a ten dimensional
string theory and six of its spatial coordinates are compactified on a torus, so that the
theory is reduced to a four dimensional effective theory. We shall adopt the procedure of
dimensional reduction [48, 49, 50] so that we compactify an arbitrary number of dimensions and
the effective theory is defined in a lower spacetime dimension. This will
be useful, since the duality conjectures are formulated in various spacetime dimensions and different
theories are related by the web of dualities in diverse dimensions.

The starting point is to consider the string effective action in $\hat{D}$ dimensions, where the
coordinates, metric and all other tensors in the $\hat{D}$-dimensional space are denoted with a hat.
The coordinates in D-dimensional spacetime are denoted by $x^\mu$.
Therefore, $\hat{D} = D + d$. The theory is compactified on a d-dimensional torus to D-dimensional
spacetime. The coordinates on the torus, sometimes referred to as coordinates of the internal dimensions,
are denoted as $y^\alpha$, $\alpha = 1, \ldots, d$. The bosonic part of the action is given by

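$$S = \int d^{\hat{D}} x \sqrt{-\hat{G}}\, e^{-\hat\phi} \left[ \hat{R}(\hat{G}) + \hat{G}^{\hat\mu\hat\nu} \partial_{\hat\mu} \hat\phi\, \partial_{\hat\nu} \hat\phi - \frac{1}{12} \hat{H}_{\hat\mu\hat\nu\hat\rho} \hat{H}^{\hat\mu\hat\nu\hat\rho} \right] \qquad (3.22)$$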

Note that S is the bosonic part of the string effective action with backgrounds coming from the NS-NS
sector. $\hat{H}$ is the field strength of the antisymmetric tensor and $\hat\phi$ is the dilaton. The backgrounds are
taken to be independent of the internal coordinates $y^\alpha$ of the torus. Consequently, transformations
of the coordinates $y^\alpha$, $\alpha = 1, 2, \ldots, d$, do not affect the background fields and we recognize
that there are d isometries. Furthermore, associated with these isometries, there will be d Abelian
gauge fields, since the $\hat{D}$-dimensional metric will have components carrying a D-dimensional spacetime
index and an internal index $\alpha$. There will be components of the $\hat{D}$-dimensional metric which
carry indices of the toroidal coordinates, say $\alpha, \beta$, and these will transform as scalars, often referred
to as moduli. Similarly, if we consider the components of the $\hat{D}$-dimensional antisymmetric tensor
field, it will give a D-dimensional antisymmetric tensor, d Abelian gauge fields coming from components
with one spacetime and one internal index, and d x d moduli (antisymmetric) when considered
from the D-dimensional point of view.

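The metric $\hat{G}_{\hat\mu\hat\nu}$ can be decomposed as

$$\hat{G}_{\hat\mu\hat\nu} = \begin{pmatrix} g_{\mu\nu} + A^{(1)\gamma}_\mu A^{(1)}_{\nu\gamma} & A^{(1)}_{\mu\beta} \\ A^{(1)}_{\nu\alpha} & G_{\alpha\beta} \end{pmatrix} \qquad (3.23)$$

where $G_{\alpha\beta}$ is the internal metric and $g_{\mu\nu}$, the D-dimensional spacetime metric, depends on the
coordinates $x^\mu$. Note the appearance of the Abelian gauge fields $A^{(1)\alpha}_\mu$ due to the presence of the
isometries. We also expect the same number of gauge fields from the antisymmetric tensor $\hat{B}_{\hat\mu\hat\nu}$.
The dimensionally reduced action is

$$S = \int d^D x \sqrt{-g}\, e^{-\phi} \left\{ R + g^{\mu\nu} \partial_\mu \phi\, \partial_\nu \phi - \frac{1}{12} H_{\mu\nu\rho} H^{\mu\nu\rho} + \frac{1}{8} {\rm tr} \left( \partial_\mu M^{-1} \partial^\mu M \right) - \frac{1}{4} \mathcal{F}^i_{\mu\nu} (M^{-1})_{ij} \mathcal{F}^{\mu\nu j} \right\} \qquad (3.24)$$

Here $\phi = \hat\phi - \frac{1}{2} \log \det G$ is the shifted dilaton,

$$H_{\mu\nu\rho} = \partial_\mu B_{\nu\rho} - \frac{1}{2} \mathcal{A}^i_\mu \eta_{ij} \mathcal{F}^j_{\nu\rho} + ({\rm cyc.\ perms.}), \qquad (3.25)$$

$\mathcal{F}^i_{\mu\nu}$ is the 2d-component vector of field strengths

$$\mathcal{F}^i_{\mu\nu} = \begin{pmatrix} F^{(1)\alpha}_{\mu\nu} \\ F^{(2)}_{\mu\nu\alpha} \end{pmatrix} = \partial_\mu \mathcal{A}^i_\nu - \partial_\nu \mathcal{A}^i_\mu , \qquad (3.26)$$

$A^{(2)}_{\mu\alpha} = \hat{B}_{\mu\alpha} + B_{\alpha\beta} A^{(1)\beta}_\mu$ (recall $B_{\alpha\beta} = \hat{B}_{\alpha\beta}$), and the 2d x 2d matrices M and $\eta$ are defined as

$$M = \begin{pmatrix} G^{-1} & -G^{-1} B \\ B G^{-1} & G - B G^{-1} B \end{pmatrix}, \qquad \eta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (3.27)$$

Note that the elements of the matrix M, $G_{\alpha\beta}$ and $B_{\alpha\beta}$, depend on the spacetime coordinates $x^\mu$, in
contrast to the earlier case (3.15) where those backgrounds were taken to be constant. The action
(3.24) is invariant under a global O(d,d) transformation,

$$M \to \Omega M \Omega^T, \qquad \Omega \eta \Omega^T = \eta, \qquad \mathcal{A}^i_\mu \to \Omega^i{}_j \mathcal{A}^j_\mu, \qquad \text{where } \Omega \in O(d,d), \qquad (3.28)$$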
and the shifted dilaton $\phi$ remains invariant under the O(d,d) transformations. Moreover, $M \in
O(d,d)$ and $M^T \eta M = \eta$. Thus if we solve for a set of backgrounds M, $\mathcal{F}$ and $\phi$ satisfying the
equations of motion, they correspond to a vacuum configuration of the string theory. The O(d,d)
symmetry is known as the target space duality (or T-duality) symmetry; it is a stringy symmetry,
since there is no analogue of winding modes in ordinary field theory. The symmetry holds good order
by order in string perturbation theory. Therefore, predictions of T-duality can be tested within the
framework of perturbation theory. We remark in passing that, if we had considered an effective
action in $\hat{D}$ dimensions with n Abelian gauge fields, the reduced action in D dimensions would be
invariant under O(d, d+n) symmetry. This is of importance, since in the case of the heterotic string,
the ten dimensional action with 16 Abelian gauge fields corresponding to the Cartan subalgebra
of the nonabelian gauge groups of the theory, when reduced to lower dimensions, will exhibit the
O(d, d+n) symmetry we mentioned.
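
The field content described above for the reduction on a d-dimensional torus can be tallied as follows. This is a counting sketch only: the function names are ours, and the comparison with the dimension of the coset $O(d, d+n)/O(d) \times O(d+n)$ uses the elementary fact that O(p, q) has (p+q)(p+q-1)/2 generators.

```python
def reduce_on_torus(d):
    """Count lower-dimensional fields from the metric and 2-form on a d-torus."""
    return {
        "vectors_from_metric": d,                  # components with one internal index
        "vectors_from_2form": d,
        "scalars_from_metric": d * (d + 1) // 2,   # symmetric internal metric
        "scalars_from_2form": d * (d - 1) // 2,    # antisymmetric internal 2-form
    }


def dim_o(p, q=0):
    """Number of generators of O(p, q), the same as for O(p + q)."""
    n = p + q
    return n * (n - 1) // 2


def coset_dim(d, n=0):
    """Dimension of O(d, d + n) / (O(d) x O(d + n))."""
    return dim_o(d, d + n) - dim_o(d) - dim_o(d + n)


fields = reduce_on_torus(6)  # ten dimensions down to four
moduli = fields["scalars_from_metric"] + fields["scalars_from_2form"]
gauge_fields = fields["vectors_from_metric"] + fields["vectors_from_2form"]

print(moduli, gauge_fields, coset_dim(6), coset_dim(6, 16))
```

For d = 6 the 21 + 15 = 36 scalars match the dimension of $O(6,6)/O(6) \times O(6)$, and the 12 gauge fields fill the 2d-component vector of (3.26); with the 16 extra Abelian fields of the heterotic string the coset dimension grows to 6 x 22 = 132.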

Thus if we have a set of background configurations it is possible to generate another set of gauge
inequivalent backgrounds by implementing suitable O(d,d) transformations. The new backgrounds
will also satisfy the equations of motion and they will be acceptable vacuum configurations. In fact
the O(d,d) symmetry was discovered for non-constant backgrounds in the context of cosmological
solutions in string theory [51, 52], when the backgrounds carried only time dependence. One could
generate new cosmological solutions through O(d,d) transformations [53, 54]. The application of
O(d,d) transformations to generate new black hole solutions was
initiated by Sen [55] and there is a vast literature on this subject [19, 31].

Next, we discuss S-duality in string theory. This symmetry relates a theory in the weak coupling
regime to a theory in the strong coupling domain. In some cases it is the same theory which gets related
to itself, like the type IIB theory. In some other situations one theory gets related to another
one: a familiar example is that the heterotic string compactified on $T^4$ is related to type IIA theory
compactified on $K_3$. A simple example is Maxwell electrodynamics. The source-free equations are invariant
under $E \to B$ and $B \to -E$. However, in the presence of sources, one has to be careful. The
usual Maxwell equations have only sources carrying electric charges and then the equations are not
symmetric under the above duality transformations. Thus it is necessary to add sources carrying
magnetic charges to maintain electric-magnetic duality. This led Dirac to formulate the theory of
magnetic monopoles. As is well known, the existence of a magnetic monopole in the theory leads
to the famous charge quantization condition $e \cdot g = 2\pi n$, where e is the electric charge and g is
the magnetic charge. This relation has profound implications: if the theory of electrically charged
particles is described by a small coupling constant (indeed the fine structure constant is about 1/137), then
the theory describing magnetic monopoles will have a large value for the magnetic charge, corresponding
to a strong coupling constant. In the case of gauge theories with spontaneous symmetry breaking (SSB),
magnetic monopoles appear as classical solutions of the nonlinear field equations [56, 57]. Note that
the electric charges in such theories are obtained from the Noether currents, whereas the magnetic
charges of 't Hooft-Polyakov monopoles are of a topological nature. The charges respect the Dirac
quantization condition. Furthermore, the massive gauge bosons (acquiring mass through the Higgs
mechanism) have masses proportional to the gauge coupling constant, whereas the monopole masses
are inversely proportional to the gauge coupling constant (electric charge). Consequently, if the
gauge bosons are light in an SSB theory, the monopoles are heavy; indeed the monopoles have the
interpretation of being the solitons of the theory. One of the most fundamental contributions to
developments in S-duality came from the work of Montonen and Olive [58]. According to them, we
might envisage a dual formulation of fundamental physics where the roles of Noether charges and
topological charges are interchanged. One can visualise that monopoles will appear as elementary
particles and the W-bosons will be their solitonic counterparts. In fact one could check their mass
formula $M^2 = C(e^2 + g^2)$, where C is related to the VEV of the Higgs field in SSB theories. In fact the W boson
and the photon satisfy this formula. If a particle had been discovered with magnetic charge this relation
could be verified. Since it is symmetric under the interchange of e and g, and Dirac's rule tells
us that e and g are related, one could formulate the theory in the dual picture. However, the
monopole mass obtained in an SSB theory is a classical one and it is subject to quantum corrections.
Thus, the Montonen-Olive idea could not be consistently checked in usual field theories. There are
special types of supersymmetric field theories where there is no quantum correction to the mass
and furthermore, the W-bosons and monopoles belong to the same multiplet. In such cases there
is the possibility of checking this conjecture.
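
The statement that weak electric coupling implies strong magnetic coupling follows directly from the quantization condition $e \cdot g = 2\pi n$. The numbers below are a rough illustration, assuming natural units in which the fine structure constant is $e^2 / 4\pi$ (a normalization the text does not fix) and taking n = 1.

```python
import math

ALPHA = 1 / 137.036  # fine structure constant, assumed here to equal e^2 / (4 pi)

e = math.sqrt(4 * math.pi * ALPHA)        # electric charge
g = 2 * math.pi / e                       # minimal magnetic charge from e * g = 2 pi
alpha_magnetic = g ** 2 / (4 * math.pi)   # magnetic analogue of the fine structure constant

print(f"e = {e:.4f}, g = {g:.2f}, g^2/(4 pi) = {alpha_magnetic:.2f}")
```

In this normalization the magnetic coupling equals $1/(4\alpha)$, roughly 34, so a perturbative expansion in the electric coupling corresponds to a strongly coupled description in terms of monopoles.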

We recall that Yang-Mills theory also admits the introduction of the $\theta$ term in its action. Thus,
gauge theories have two parameters, the Yang-Mills coupling constant e and the $\theta$ parameter. The
latter couples to the field strengths as follows:

$$\mathcal{L}_\theta = -\frac{\theta}{32\pi^2}\, F_{\mu\nu} \tilde{F}^{\mu\nu} \qquad (3.29)$$

where $\tilde{F}^{\mu\nu} = \frac{1}{2} \epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma}$. Note that this term is a surface term and does not contribute to the classical
equations of motion, and the presence of this term does not affect renormalizability in perturbation
theory. It was noted by Witten [59] that in the presence of monopoles, this term shifts the
allowed values of the electric charge in the monopole sector. Thus we can have electrically charged
particles, magnetically charged particles and a third kind of particles carrying both charges. The
Yang-Mills Lagrangian can be written in the following form after taking into account the effect of the $\theta$
term and introducing a complex coupling constant $\tau = \frac{\theta}{2\pi} + \frac{4\pi i}{e^2}$:

$$\mathcal{L} = -\frac{1}{32\pi}\, {\rm Im} \left( \tau \left[ F_{\mu\nu} + i \tilde{F}_{\mu\nu} \right] \left[ F^{\mu\nu} + i \tilde{F}^{\mu\nu} \right] \right) \qquad (3.30)$$

The following qualitative argument tells us about the strong-weak duality group. (i) When $\theta$ goes over
its period $2\pi$ the physics is the same. Thus, we expect the theory to be invariant under $\tau \to \tau + 1$.
(ii) We also know that, under electric-magnetic duality, $\tau \to -1/\tau$. One can argue that, when $\theta$
is arbitrary, the duality group is generated by these transformations. Thus, the duality group is
identified to be SL(2, Z). Therefore, in a theory with SL(2, Z) symmetry one could check the
spectrum with charged particles, monopoles and dyons. The complex coupling constant $\tau$ is often
referred to as the modular parameter or modulus. Moreover, when we discuss strong-weak duality in
the context of string theory, the dilaton and the axion will be combined to define the moduli field. As
mentioned earlier, string theory does not admit any arbitrary parameters as coupling constants.
All the coupling constants appear as VEVs of some scalar fields, i.e. moduli. Therefore, very often,
the terms coupling constants and moduli are used interchangeably in string theory.
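
The two transformations in the argument above can be represented by 2 x 2 integer matrices acting on $\tau$ by fractional linear transformations. The sketch below checks the defining relations of the group they generate and shows how the inversion maps weak to strong coupling at $\theta = 0$; the matrices S and T are the standard generators of SL(2, Z), the normalization $\tau = 4\pi i / e^2$ at $\theta = 0$ is the conventional one, and the value of $e^2$ is arbitrary.

```python
import math


def mobius(mat, tau):
    """Action tau -> (a tau + b) / (c tau + d) of a 2 x 2 integer matrix."""
    (a, b), (c, d) = mat
    return (a * tau + b) / (c * tau + d)


def mul(X, Y):
    """Product of two 2 x 2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


T = ((1, 1), (0, 1))     # tau -> tau + 1   (theta -> theta + 2 pi)
S = ((0, -1), (1, 0))    # tau -> -1 / tau  (electric-magnetic duality)
MINUS_ONE = ((-1, 0), (0, -1))

# Group relations: S^2 = -1 and (S T)^3 = -1 (both act trivially on tau).
ST = mul(S, T)
relations_ok = mul(S, S) == MINUS_ONE and mul(ST, mul(ST, ST)) == MINUS_ONE

# Any word in S and T has integer entries and unit determinant.
word = mul(T, mul(S, mul(T, mul(T, S))))
det_word = word[0][0] * word[1][1] - word[0][1] * word[1][0]

# Weak coupling is mapped to strong coupling by S at theta = 0.
e_squared = 0.1
tau = complex(0.0, 4 * math.pi / e_squared)
tau_dual = mobius(S, tau)
e_squared_dual = 4 * math.pi / tau_dual.imag

print(relations_ok, det_word, e_squared, e_squared_dual)
```

At $\theta = 0$ the dual coupling satisfies $e^2 e'^2 = (4\pi)^2$ in this normalization, so a small $e^2$ is exchanged with a large one.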

As mentioned earlier, the mass formulas are protected from quantum corrections in supersymmetric
theories. Moreover, some of the solitonic solutions in supersymmetric theories satisfy
special properties: (i) they saturate the BPS bound and (ii) they preserve a part of
the supersymmetry of the original theory. These attributes play a very important part in testing
duality conjectures in field theory and in string theory. In order to illustrate the basic point, let
us consider a two dimensional example due to Witten and Olive [60], where the field content is a
scalar field and a Majorana fermion. The Lagrangian density is

F = (3.31) 

The potential is an arbitrary function of $\phi$ and 'prime' denotes the derivative with respect to
$\phi$. As was the case in worldsheet supersymmetry, we can work in terms of the chiral components of
the fermions, and the two supercharges are


c+ = 

J 


(3.32) 

Q-= , 


(3.33) 


In light-cone variables $Q_\pm^2 = P_\pm$, with $P_\pm = P_0 \pm P_1$, and it turns out that
$\{Q_+, Q_-\} = 0$ in most cases. However, careful analysis shows that the anticommutator is
proportional to a surface integral


{g+,g_} = 2 j 


(3.34) 





and = V'(^). This surface integral does not necessarily vanish when one considers solitonic 

states. If we denote the R.H.S. of (3.34) by the operator $T$, then it can be evaluated for the case
at hand. Now the algebra of the charges is different from the usual case and one can write

$P_+ + P_- = T + (Q_+ - Q_-)^2$ (3.35)

$P_+ + P_- = -T + (Q_+ + Q_-)^2$ (3.36)


The R.H.S. of each equation above contains a piece which is a complete square, and therefore
$P_+ + P_- \geq |T|$. If we consider a single particle of mass $M$ and go to its rest frame,
$P_+ = P_- = M$, we arrive at

$M \geq \frac{1}{2}|T|$ (3.37)
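
The inequality can be checked in a toy realization of the algebra. The sketch below is an
illustration only and is not the Witten-Olive model itself: it assumes that $Q_\pm$ are represented
by 2x2 matrices built from Pauli matrices, chosen here so that $Q_\pm^2 = P_\pm$ and
$\{Q_+, Q_-\} = T$ are multiples of the identity. The parameters a, b, alpha and all function names
are hypothetical.

```python
import math

SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(x, y, sign=1):
    """Entrywise x + sign * y."""
    return [[x[i][j] + sign * y[i][j] for j in range(2)] for i in range(2)]


def scale(c, x):
    """Entrywise c * x."""
    return [[c * x[i][j] for j in range(2)] for i in range(2)]


def charges(a, b, alpha):
    """Toy supercharges: Q+ = a*sigma1, Q- = b*(cos(alpha)*sigma1 + sin(alpha)*sigma2)."""
    q_plus = scale(a, SIGMA1)
    q_minus = add(scale(b * math.cos(alpha), SIGMA1),
                  scale(b * math.sin(alpha), SIGMA2))
    return q_plus, q_minus


def algebra(a, b, alpha):
    """Return (P+, P-, T) read off from Q+^2, Q-^2 and {Q+, Q-}."""
    qp, qm = charges(a, b, alpha)
    p_plus = matmul(qp, qp)[0][0].real
    p_minus = matmul(qm, qm)[0][0].real
    t = add(matmul(qp, qm), matmul(qm, qp))[0][0].real
    return p_plus, p_minus, t


# Generic point: the bound P+ + P- >= |T| holds with room to spare.
p_plus, p_minus, t = algebra(1.3, 0.7, 0.4)
qp, qm = charges(1.3, 0.7, 0.4)
diff = add(qp, qm, sign=-1)
square = matmul(diff, diff)[0][0].real      # (Q+ - Q-)^2, cf. (3.35)

# Saturating point: a = b and alpha = 0 give Q+ - Q- = 0, so P+ + P- = T.
sat_plus, sat_minus, sat_t = algebra(1.0, 1.0, 0.0)
```

In this toy model the gap $P_+ + P_- - |T|$ is at least $(a - b)^2$, and it closes exactly when one
linear combination of the two charges vanishes, mirroring the saturation condition for BPS states.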

The bound becomes an equality for states $|s\rangle$ such that $(Q_+ + Q_-)|s\rangle = 0$ or
$(Q_+ - Q_-)|s\rangle = 0$. The bound on $M$ is the Bogomolny bound. A state which saturates it is
called a BPS state. This bound can also be derived in a Lorentz covariant manner. We note that, for
the states saturating the BPS bound, only half of the supersymmetries are preserved. In string theory
or field theories with a large number of supersymmetries, the algebra of the charges for a set of
charges $\{Q_\alpha\}$, $\alpha = 1, \ldots, N$, can be brought to the form

$\{Q_\alpha, Q_\beta\} = \delta_{\alpha\beta}$ (3.38)

This will be possible if there are no states which are annihilated by some of these charges, and
in that case we shall get supermultiplets as usual. However, just as in the soliton case considered
earlier, if there are states which are annihilated by some of the charges, then we shall have a
situation where

$\{Q_a, Q_b\} = \delta_{ab}$, for $a, b = 1, \ldots, M$ (3.39)

$\{Q_\alpha, Q_\beta\} = 0$, for $\alpha, \beta = M + 1, \ldots, N$ (3.40)

So we see that these states will form lower dimensional representations since $M < N$. Again, citing
the example of the two dimensional case, we can state the general result that when there are
soliton-like states annihilated by some of the supercharges, the symmetric matrix
$\{Q_\alpha, Q_\beta\}$ will have some zero eigenvalues. The charges (analogs of $T$) and masses get
related in the process. This is true for monopoles in 4-dimensional theories. The string effective
action is defined in 10 dimensions and one can seek solutions for extended objects in space, and
there are BPS states in this regime too.

Let us compactify the heterotic string effective action on $T^6$ to come to a four dimensional
theory. As mentioned earlier, the T-duality group is $O(6,22)$, with scalars parametrizing the moduli
space $O(6,22)/O(6)\times O(22)$; the spectrum also contains 28 gauge bosons, the graviton and the
antisymmetric tensor $B_{\mu\nu}$. The four dimensional effective action for the heterotic string,
following the prescriptions of [49], can be obtained in a straightforward manner. The T-duality
invariance is manifest when we are in the string frame metric with the shifted dilaton
$\phi - \frac{1}{2}\ln\det G_{\alpha\beta}$. However, when one considers the S-duality properties
of the theory, it is convenient to go over to the Einstein frame metric through a conformal
transformation. In string theory, all the coupling constants are related to the VEV
of the dilaton and therefore, in order to identify the parameters of the S-duality group, we have to
choose the field whose VEV will coincide with the $\theta$ parameter. Notice that the field strength
of the antisymmetric tensor, $H_{\mu\nu\rho}$, has only one degree of freedom in four dimensions when
we fix all gauge freedoms. In fact, if we take the dual of this field, it is a pseudoscalar particle
and that is what we need, an axion. The starting point is the four dimensional effective action [64]
with Einstein frame spacetime metric

$S = \int_M d^4x\,\sqrt{-g}\left[R - \frac{1}{2}g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi + \mathcal{L}_2 + \mathcal{L}_3 + \mathcal{L}_4\right]$ (3.41)

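with $\mathcal{L}_2$, $\mathcal{L}_3$ and $\mathcal{L}_4$ defined as follows

$\mathcal{L}_2 = \frac{1}{8}\,g^{\mu\nu}\,\mathrm{tr}\left(\partial_\mu M^{-1}\partial_\nu M\right)$ (3.42)

$\mathcal{L}_3 = -\frac{1}{4}\,e^{-\phi}\,F^T_{\mu\nu}M^{-1}F^{\mu\nu}$ (3.43)

$\mathcal{L}_4 = -\frac{1}{12}\,e^{-2\phi}\,H_{\mu\nu\rho}H^{\mu\nu\rho}$ (3.44)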

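Here we closely follow the notation of [49] and [64]. The next step is to perform a duality
transformation, which replaces the field $B_{\mu\nu}$ by a scalar field $\chi$. This is achieved by
first forming the $B_{\mu\nu}$ equation of motion

$\partial_\mu\left(\sqrt{-g}\,e^{-2\phi}H^{\mu\nu\rho}\right) = 0$ (3.45)

and solving it by setting

$\sqrt{-g}\,e^{-2\phi}H^{\mu\nu\rho} = \gamma\,\epsilon^{\mu\nu\rho\lambda}\partial_\lambda\chi$ (3.46)

where $\chi$ is the “axion” and $\gamma$ is a constant to be fixed later. In the language of
differential forms,

$H = \gamma\,e^{2\phi} * d\chi$ (3.47)

or, using $H = dB - \frac{1}{2}\eta_{ij}A^i\wedge F^j$,

$dB = \frac{1}{2}\eta_{ij}A^i\wedge F^j + \gamma\,e^{2\phi} * d\chi$ (3.48)

The Bianchi identity ($d^2B = 0$) now turns into the $\chi$ equation

$\frac{1}{2}\eta_{ij}F^i\wedge F^j + \gamma\,d\left(e^{2\phi} * d\chi\right) = 0$ (3.49)

or, in terms of components (choosing a convenient value for $\gamma$),

$\partial_\mu\left(e^{2\phi}\sqrt{-g}\,g^{\mu\nu}\partial_\nu\chi\right) - \frac{1}{4}\sqrt{-g}\,F^T_{\mu\nu}\eta\,\tilde F^{\mu\nu} = 0.$ (3.50)

This is an equation of motion if we replace the $\mathcal{L}_4$ term in $S$ by

$S^\chi = -\int_M d^4x\,\sqrt{-g}\left(\frac{1}{2}e^{2\phi}g^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi + \frac{1}{4}\chi\,F^T_{\mu\nu}\eta\,\tilde F^{\mu\nu}\right),$ (3.51)

where

$\tilde F^{\mu\nu} = \frac{1}{2\sqrt{-g}}\,\epsilon^{\mu\nu\rho\lambda}F_{\rho\lambda}.$ (3.52)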

Let us briefly recapitulate the steps we have taken to modify the four dimensional action in
the Einstein frame. The field strength $H_{\mu\nu\rho}$ appearing in $\mathcal{L}_4$ is traded for
the pseudoscalar axion $\chi$. The resulting action (3.51) contains not only the kinetic energy term
of the axion, but also the topological term, which is like the $\theta$ dependent term of the
Yang-Mills action if the VEV of $\chi$ is identified with that parameter.

Let us now regroup the terms in the dual action in the following way:

$S = \int_M d^4x\,\sqrt{-g}\,[R + \mathcal{L}_2] + S_D + S_F,$ (3.53)

where

$S_D = -\frac{1}{2}\int_M d^4x\,\sqrt{-g}\,g^{\mu\nu}\left(\partial_\mu\phi\,\partial_\nu\phi + e^{2\phi}\partial_\mu\chi\,\partial_\nu\chi\right)$ (3.54)

$S_F = -\frac{1}{4}\int_M d^4x\,\sqrt{-g}\left(e^{-\phi}F^T_{\mu\nu}M^{-1}F^{\mu\nu} + \chi\,F^T_{\mu\nu}\eta\,\tilde F^{\mu\nu}\right)$ (3.55)

and the first term in (3.53) contains the usual Einstein-Hilbert action and the part coming from the
kinetic energy term of the M-matrix. We have rearranged the actions coming from the dilaton kinetic
energy, the gauge field part and the axionic part (together with the 'topological' term) to define
$S_D$ and $S_F$, so that the dilaton and axion are put together and the gauge field kinetic
energy and the topological term are grouped together. This is very useful for studying the
S-duality properties of the action. In order to describe the SL(2, R) symmetry of the dilaton and
axion kinetic terms, let us introduce a complex modular parameter (recall the case of Yang-Mills)

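$\tau = \chi + i e^{-\phi}$ (3.56)

which has the nice property that under a linear fractional transformation (with $ad - bc = 1$)

$\tau \to \frac{a\tau + b}{c\tau + d}$ (3.57)

the combination

$g^{\mu\nu}\left(\partial_\mu\phi\,\partial_\nu\phi + e^{2\phi}\partial_\mu\chi\,\partial_\nu\chi\right)$ (3.58)

is invariant. It immediately follows that

$S_D = -\frac{1}{2}\int_M d^4x\,\sqrt{-g}\,\frac{g^{\mu\nu}\partial_\mu\tau\,\partial_\nu\bar\tau}{(\mathrm{Im}\,\tau)^2}$ (3.59)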
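
The invariance of the dilaton-axion kinetic term can be checked numerically. The following is a
minimal sketch, assuming $\tau = \chi + i e^{-\phi}$ and a real matrix with $ad - bc = 1$; the sample
values and the function names are hypothetical, and the derivative of $\tau$ is replaced by a small
finite displacement.

```python
import math


def mobius(m, tau):
    """Linear fractional transformation tau -> (a tau + b)/(c tau + d)."""
    a, b, c, d = m
    return (a * tau + b) / (c * tau + d)


def kinetic_density(tau, dtau):
    """|d tau|^2 / (Im tau)^2, the combination entering S_D for a displacement d tau."""
    return abs(dtau) ** 2 / tau.imag ** 2


# A sample SL(2, R) element: 2.0 * 1.25 - 1.5 * 1.0 = 1.
m = (2.0, 1.5, 1.0, 1.25)

chi, phi = 0.3, 0.7
tau = complex(chi, math.exp(-phi))
dtau = 1e-6 * complex(0.6, -0.8)          # small displacement of (chi, phi)

tau_new = mobius(m, tau)
dtau_new = mobius(m, tau + dtau) - tau_new

# Im(tau) picks up the factor 1/|c tau + d|^2 ...
im_expected = tau.imag / abs(m[2] * tau + m[3]) ** 2

# ... which compensates the factor 1/(c tau + d)^2 in d tau.
before = kinetic_density(tau, dtau)
after = kinetic_density(tau_new, dtau_new)
```

The two densities agree up to the finite-difference error, which is of the relative size of the
displacement.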


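Now we consider the gauge field action $S_F$. Notice that the SL(2, R) transformations give
rise to an electric-magnetic duality rotation. Let us define

$\mathcal{F}^{\pm}_{\mu\nu} = M\eta F_{\mu\nu} \pm i\tilde F_{\mu\nu}$ (3.60)

Then, using the identity $\mathcal{F}^{+T}_{\mu\nu}M^{-1}\mathcal{F}^{-\mu\nu} = 0$, we can express $S_F$ as

$S_F = -\frac{1}{16i}\int_M d^4x\,\sqrt{-g}\left(\tau\,\mathcal{F}^{+T}_{\mu\nu}M^{-1}\mathcal{F}^{+\mu\nu} - \bar\tau\,\mathcal{F}^{-T}_{\mu\nu}M^{-1}\mathcal{F}^{-\mu\nu}\right)$ (3.61)

The $A_\mu$ equation of motion is

$\nabla_\mu\left(\tau\mathcal{F}^{+\mu\nu} - \bar\tau\mathcal{F}^{-\mu\nu}\right) = 0$ (3.62)

and the Bianchi identity is

$\nabla_\mu\left(\mathcal{F}^{+\mu\nu} - \mathcal{F}^{-\mu\nu}\right) = 0$ (3.63)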

To exhibit SL(2, R) symmetry it is necessary to have $\mathcal{F}^{\pm}_{\mu\nu}$ transform at the
same time as $\tau$. The appropriate choice is to require that $\mathcal{F}^{\pm}_{\mu\nu}$ transform
as modular forms as follows

$\mathcal{F}^{+}_{\mu\nu} \to (c\tau + d)\,\mathcal{F}^{+}_{\mu\nu}$, $\mathcal{F}^{-}_{\mu\nu} \to (c\bar\tau + d)\,\mathcal{F}^{-}_{\mu\nu}$. (3.64)

This implies that

$\tau\mathcal{F}^{+}_{\mu\nu} \to (a\tau + b)\,\mathcal{F}^{+}_{\mu\nu}$, $\bar\tau\mathcal{F}^{-}_{\mu\nu} \to (a\bar\tau + b)\,\mathcal{F}^{-}_{\mu\nu}$. (3.65)

Thus the equation of motion (3.62) and the Bianchi identity (3.63) transform into linear
combinations of one another and are preserved. In particular, the negative of the unit matrix sends
$\mathcal{F}^{\pm}_{\mu\nu} \to -\mathcal{F}^{\pm}_{\mu\nu}$. This result is acceptable if we identify
the symmetry as SL(2, R). Note that SL(2, R) is not a symmetry of the action. The transformation in
(3.64) is a nonlocal transformation of $A_\mu$, and such transformations can do strange things to the
action. For example, the total derivative $F\cdot\tilde F$ transforms into an expression that is not
a total derivative.

Thus far we have focused attention on the dilaton-axion system and the gauge field part of the
action. Explicit checks show that the rest of the equations of motion are invariant under the
S-duality transformation. While checking the invariance of the Einstein equation, we must ensure
that the contribution of $S_F$ to the energy-momentum tensor is SL(2, R) invariant. After
a short calculation one finds that only terms of the structure $e^{-\phi}\mathcal{F}^{+}\mathcal{F}^{-}$
survive, and these are invariant since $e^{-\phi} \to |c\tau + d|^{-2}e^{-\phi}$. The symmetry of the
equations of motion is SL(2, R). Notice that the axion couples to the topological density term, the
product of $F$ and its dual. We can argue





qualitatively that the part of the SL(2, R) group which gives rise to the translation symmetry
of the axion (the VEV of $\chi$ is the $\theta$ angle) should break down to a discrete group of
translations due to instanton effects. A more careful analysis is necessary [42] to show that
SL(2, R) breaks to SL(2, Z).

The low energy string effective action, in four dimensions, contains the graviton, antisymmetric
tensor, dilaton and nonabelian gauge bosons. Furthermore, the Poincare dual of the three-form field
strength is a pseudoscalar and this field can be identified as the axion. One can combine the dilaton
and axion to form a doublet of the S-duality group SL(2, R). It was argued [61, 62] that S-duality
is an exact symmetry of string theory. Schwarz and Sen [63] provided a general formulation of
S-duality in string theory. Indeed the heterotic string compactified on $T^6$ has the effective action
of an $N = 4$ supersymmetric theory. How can one test S-duality in this case? One of the important
results in this direction was first derived by Sen [96], who showed that there are certain dyonic
states in the theory whose existence can be demonstrated using S-duality transformations on the
heterotic string action. These states precisely coincide with the ones we expect from the
Montonen-Olive conjecture. The theory has electrically charged states and magnetically charged
states, and each charge is a 28-dimensional vector for the heterotic string. Due to the
nonrenormalization theorem of the $N = 4$ supersymmetric theory, the electric charges are not
renormalised. Moreover, the spectrum of the magnetic charges is fixed by the generalized Dirac
quantization condition; the magnetic charges are not renormalised either. Thus, the spectrum of these
charges will be the same as in the tree level theory. Indeed, the multimonopole moduli could be
computed for the heterotic string [96]. In fact, the study of nonperturbative aspects of
supersymmetric Yang-Mills theories took new directions through the works of Seiberg and Witten [66]
in sequel to Sen's work.

It is interesting to look for extended objects which appear as solutions to the equations of motion
of the string effective action. The simplest extended object is a string, which is one dimensional.
Let us denote the worldsheet coordinates of this string as $\sigma^0$ and $\sigma^1$ and the spacetime
coordinates as $\{x^\mu\}$. This should appear as a solution to the string effective action. Suppose
we consider a frame where $\sigma^0$ and $\sigma^1$ lie along the spacetime coordinates $x^0$ and
$x^1$ respectively. We look for a 'spherically symmetric' solution such that the solution is static
and depends only on the magnitude of the transverse distance, $r = \sqrt{y_1^2 + \ldots + y_8^2}$,
where $x^2, \ldots, x^9$ are denoted as the $y_i$'s. The effective action has graviton, dilaton and
antisymmetric tensor fields. In the Einstein frame the action has the form

= d^°xV=d[R - (3.66) 

The macroscopic string solution, which was identified with the heterotic string [67], is obtained for
the following background configurations

^ (dx^)^) -h f^dy^dy^ (3.67) 


B«i = j 

The rest of the components of $B_{\mu\nu}$ are set to zero and


/ 



(3.68) 


(3.69) 


Here $Q$ is the charge carried by the string and it is associated with the antisymmetric tensor
field. The field equations one needs to satisfy are: the Einstein equation, the dilaton field
equation and the axionic charge conservation which follows from the field equation of $H$. If we look
at the field equations carefully, there is a delta-function singularity at $r = 0$ in the Laplace
equation. Therefore, it was proposed [67] to resolve this problem by introducing a source for the
string, which will be the $\sigma$-model action

$S_\sigma = -\frac{T}{2}\int d^2\sigma\left[\partial^a X^\mu\partial_a X^\nu G_{\mu\nu} + \epsilon^{ab}\partial_a X^\mu\partial_b X^\nu B_{\mu\nu}\right]$ (3.70)

Here of course the metric $G_{\mu\nu}$ is the string frame metric. This is the string solution carrying
'electric' charge and this charge can be obtained from the conservation law. Indeed, $Q \propto T/\omega_7$,





where $\omega_7$ refers to the volume of $S^7$. In the supersymmetric case, there are BPS saturating
solutions, and here the mass per unit length is equal to the charge.

In four dimensions the dual of the electromagnetic field tensor is also a two-form; thus if we have
point particles, the dual objects are point-like ('t Hooft-Polyakov monopoles look point-like at
large distances). However, a string in ten dimensions couples to a 3-form field strength, and
the dual of that field strength is a 7-form. Therefore, the solitonic object for the string is a
5-brane, extended in five spatial dimensions. In fact the p-brane solutions were found in sequel to
the string solutions [68]. Just as for the monopole solution, where there is no magnetic source term
in the field equations (W-bosons carry electric charge), the solitonic five-brane solutions are
derived without adding a source term. Moreover, if $e_2$ is the 'electric' charge of the string and
$g_6$ is the 'magnetic' charge of the soliton, the Dirac quantization condition is

$e_2 g_6 = 2\pi n$ (3.71)

One has to be careful in deriving the strong-weak duality relation here. The coupling constant is
determined in terms of the dilaton expectation value. The relations are $e_2 = \ldots$ and $g_6 = \ldots$

There is a special type of extended object, the Dp-brane (D-brane), which carries R-R charge
[69]. The type II theories admit gauge fields from the RR sector, and the corresponding effective
actions contain these fields. If one looks for p-brane solutions with these gauge fields (strings,
membranes and so on), they have interesting properties. These are hypersurfaces or spacetime defects
on which open strings can end. In D dimensions, if there is a Dp-brane, Neumann boundary
conditions are satisfied in $(p+1)$ directions, the directions of the world volume coordinates
of the Dp-brane, and Dirichlet boundary conditions hold along the remaining $(D-p-1)$ transverse
coordinates. Written explicitly,

$\partial_\sigma X^\mu = 0$, for $\mu = 0, \ldots, p$ (3.72)

$X^\mu(\sigma = 0, \pi) = a^\mu$, for $\mu = p+1, \ldots, 9$ (3.73)

A Dp-brane couples to a $(p+2)$-form RR field strength; therefore, a D0-brane is a particle, a
D1-brane is a string and so on. The corresponding fermions satisfy boundary conditions in accordance
with the bosonic fields in order to maintain the worldsheet supersymmetry. The BPS saturated
solutions then preserve half of the supersymmetry. From our earlier discussions, we note that type
IIA admits the D0-brane and D2-brane (and their dual objects too) and the IIB string has the D-string,
D3-brane and D-instantons, along with the duals. Thus, we conclude that IIA has the even D-branes and
the odd D-branes belong to the IIB theory. Of course, we are discussing the 10 dimensional case. The
D-branes are dynamical objects and there are excitations of such extended objects since open string
ends are attached to them.
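
The bookkeeping of dual objects used above can be summarized in a small rule. The sketch below
assumes the standard counting, which is not spelled out in the text: a p-brane couples to a
(p+2)-form field strength, whose Hodge dual in D dimensions is a (D-p-2)-form, so the dual object is
a (D-p-4)-brane. The function names are illustrative.

```python
def dual_brane(p, dim):
    """Spatial dimension of the magnetic dual of a p-brane in `dim` spacetime dimensions.

    The p-brane couples to a (p+2)-form field strength; the Hodge dual is a
    (dim-p-2)-form, which is the field strength of a (dim-p-4)-brane.
    """
    return dim - p - 4


def with_duals(electric, dim=10):
    """Sorted list of the given branes together with their magnetic duals."""
    return sorted(set(electric) | {dual_brane(p, dim) for p in electric})


# Point particle in four dimensions: the dual (monopole) is again point-like.
particle_dual_4d = dual_brane(0, 4)

# String in ten dimensions: the dual soliton is a five-brane.
string_dual_10d = dual_brane(1, 10)

# D-branes named in the text, completed by their duals (p = -1 is the D-instanton).
iia_branes = with_duals([0, 2])
iib_branes = with_duals([-1, 1, 3])
```

The resulting lists contain only even p for IIA and only odd p for IIB, and the D3-brane comes out
as its own dual.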

Consider a situation where two D-branes are separated from each other. Since open string ends can
get attached to these surfaces, the branes will be connected by open strings. The farther apart the
two branes are, the more energy it costs to stretch the open string. More interesting is the
configuration where the D-branes lie on top of each other. Then we can visualise an open string
starting from a brane and ending on it, or an open string starting from one brane and ending on
another coincident brane. In this situation we have massless states since there is no stretching of
strings. Open strings contain a massless vector state in their spectrum. One can incorporate
nonabelian gauge symmetry in such a theory by introducing the Chan-Paton factors. We can imagine a
scenario where a quark belonging to representation $i$ of $U(n)$ is attached to one end of the string
and an antiquark in representation $j$ is attached to the other end. Thus the gauge field will carry
indices $i$ and $j$ like the usual Yang-Mills fields, and these are called Chan-Paton factors. This
characteristic of the open string turns out to be useful when we consider coincident D-branes:
if there are $N$ coincident branes, we get a $U(N)$ Yang-Mills action; in fact we get a supersymmetric
gauge theory on the worldvolume of the brane.

Let us discuss some of the implications of dualities in the context of the branes we just introduced.
The experience from the monopole solution is that the charged particle couples to the field strength
tensor and the soliton couples to the dual tensor in four dimensions. In ten dimensions, the solitonic
counterpart of the string is the five-brane, and we saw that the couplings are not really reciprocals of each





other. If we consider six spacetime dimensions, then we note that the dual of the 3-form field
strength is another 3-form tensor, and a string couples to this tensor. Therefore, the conjecture is
that in six dimensions there is a string/string duality. If there is a fundamental string, the
solitonic counterpart is a string too, and their coupling constants satisfy the reciprocal relation.
For simplicity, consider a six dimensional reduced action, with only the metric, the antisymmetric
tensor field and the dilaton [70],

$S_6 = \frac{1}{2\kappa^2}\int d^6x\,\sqrt{-G}\,e^{-\phi}\left[R_G + (\partial\phi)^2 - \frac{1}{12}H^2\right]$ (3.74)

where $G_{mn}$ is the six dimensional metric in the string frame and $H_{mnp}$ is the 3-form field
strength associated with $B_{mn}$; it is understood that $H$ is defined up to Chern-Simons terms. We
can go over to the Einstein metric by the relation $G_{mn} = e^{\phi/2}g_{mn}$, $\phi$ being the
dilaton in six dimensions.

Let us consider the dual six dimensional action

$\tilde S_6 = \frac{1}{2\kappa^2}\int d^6x\,\sqrt{-\tilde G}\,e^{-\tilde\phi}\left[R_{\tilde G} + (\partial\tilde\phi)^2 - \frac{1}{12}\tilde H^2\right]$ (3.75)

Here $\tilde\phi$ is the corresponding dilaton and $\tilde H$ is the field strength of $\tilde B$, the
2-form potential of the dual theory. The two actions (3.74) and (3.75) are related if we identify

$\tilde\phi = -\phi$ and $\tilde H = e^{-\phi} * H$ (3.76)


the two Einstein-frame metrics being identified as equal. Here $*$ stands for the Hodge dual. Note
that, just as the gauge field kinetic energy term is conformally invariant in four dimensions, the
$H^2$ term is conformally invariant in six dimensions, and it is immaterial which metric we use while
taking the Hodge dual. As noted earlier, the fundamental string solution of the action (3.74) can be
obtained by adding a $\sigma$-model source term with couplings to the $G$ and $B$ backgrounds. The
solution is given by

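$ds^2 = \left(1 - \frac{k^2}{r^2}\right)\left[-dt^2 + (dx^1)^2 + \left(1 - \frac{k^2}{r^2}\right)^{-2}dr^2 + r^2 d\Omega_3^2\right]$ (3.77)

$e^{\phi} = 1 - \frac{k^2}{r^2}$ (3.78)

$e^{-\phi} * H_3 = 2k^2\epsilon_3$ (3.79)

with

$k^2 = \frac{\kappa^2 T}{\Omega_3}$ (3.80)

Of course we have the BPS saturated mass relation

$M = T\langle e^{\phi/2}\rangle$ (3.81)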


Therefore, the string gets heavier as the string coupling moves towards the strong coupling
domain. The source free action (3.74) also admits a solitonic string which is nonsingular, and the
solution is

$ds^2 = -dt^2 + (dx^1)^2 + \left(1 - \frac{\tilde k^2}{r^2}\right)^{-2}dr^2 + r^2 d\Omega_3^2$ (3.82)

$e^{-\phi} = 1 - \frac{\tilde k^2}{r^2}$ (3.83)

$H_3 = 2\tilde k^2\epsilon_3$ (3.84)

The mass density is

$\tilde M = \tilde T\langle e^{-\phi/2}\rangle$ (3.85)


In the weak coupling regime this string is heavier, as one expects of a solitonic string. Notice
that the solitonic string differs from the fundamental string by the replacement $\phi \to -\phi$, $G_{mn} \to$





$\tilde G_{mn}$, $H \to \tilde H = e^{-\phi} * H$, $\alpha' \to \tilde\alpha'$. The Noether charge and
the topological 'magnetic' charge are respectively given by

$e_2 = \frac{1}{\sqrt{2}\,\kappa}\int_{S^3} e^{-\phi} * H_3$ and $g_2 = \frac{1}{\sqrt{2}\,\kappa}\int_{S^3} H_3$ (3.86)

The Dirac quantization rule for the charges, $e_2 g_2 = 2\pi n$, gets translated into a relation
between the tensions. Moreover, the fundamental string and the dual string saturate the Bogomolnyi
bound for the mass densities and break half of the supersymmetry, as expected. These solutions can be
interpreted as limiting cases of more general solutions: they can be viewed as the extreme, mass
equals charge, limit of two-parameter black string solutions.

Again the question arises: where can we test the string/string duality? It has been conjectured
[71, 72, 73] that the heterotic string compactified on $T^4$ is S-dual to type IIA theory compactified
on $K3$. When the heterotic string is compactified on $T^4$, the theory has charged states saturating
the Bogomolnyi bound. On the IIA side, elementary string states are neutral since the gauge fields
arise from the RR sector. Moreover, for type IIA, the analysis of the Bogomolnyi formula tells us that
the states charged under the gauge fields have masses that grow as an inverse power of the string
coupling, implying that these are solitonic states. The duality between heterotic and type IIA is
understood in the following sense [74, 75]: in type IIA theory, there are nonsingular soliton
solutions and these carry the quantum numbers of the fundamental heterotic string. The properties of
those strings are consistent with those of the heterotic string. On the other hand, the heterotic
string admits solitonic solutions carrying the quantum numbers of the type IIA string. Moreover, we
know that the moduli of the heterotic string compactified on $T^4$ parametrize the coset
$O(4,20)/O(4)\times O(20)$. When type IIA is compactified on $K3$, the moduli space turns out to be
exactly the same. Therefore, there is very good evidence for this heterotic-type IIA duality
conjecture. Another duality relation that has been verified is between toroidal compactifications of
the IIA and IIB theories via T-duality, the simplest one being compactification on $S^1$. If one
theory is compactified on a circle of radius $R$, it is equivalent to the other theory compactified
on a circle of reciprocal radius [76], although in ten dimensions these are two different theories.
Some of the important consequences of S-duality can be examined in type IIB theory. It is conjectured
that type IIB theory is self-dual and the effective action can be cast in a manifestly SL(2, Z)
invariant form. We shall study this aspect in the next section. The two heterotic strings, i.e.
$SO(32)$ and $E_8\times E_8$, when compactified on $S^1$ are T-dual to each other in the reciprocal
radius sense: one theory compactified on a circle of radius $R$ is equivalent to the other
compactified on a circle of radius $1/R$. Finally, we comment that the heterotic string with $SO(32)$
gauge group is S-dual to the type I theory with $SO(32)$ group. The heterotic string effective action
with $SO(32)$ gauge group has the following form


Shet = j d^°x^f^[R - - je ^ ( 3 - 87 ) 

Here $F$ is the nonabelian field strength and $H = dB$. We work in the Einstein frame as it is
the most convenient frame to study S-duality properties, since this metric remains invariant under
S-duality. This action is obtained after rescaling the backgrounds and the slope parameters. The
type I string has a graviton and dilaton coming from the closed string NS sector, and the closed
string RR sector gives the antisymmetric tensor. The gauge fields come from the NS sector of the open
string and they have to be in the adjoint representation of $SO(32)$. Again, with appropriate
scalings, the effective action can be brought to the following form

Si = j d^°^g{R - (3.88) 

Here all the fields of the type I theory are written with a 'bar' to distinguish them from those of
the heterotic string theory, and the metric is in the Einstein frame. Now, the comparison between the
two actions shows that they will be identical if

(p Ap = Ap (3.89) 





Thus, if we compare the two actions, (3.87) and (3.88), we see that the two theories are related
to each other by strong-weak duality in 10 dimensions, since $g_{het} = 1/g_{I}$. There are a host of
duality relations among various string theories in diverse dimensions; we refer the interested reader
to the large number of review articles in this area.


4 M-theory and Unified String Dynamics 


We have briefly introduced some of the essential features of string theory and their symmetry
properties. There are five perturbatively consistent string theories, and one of their most attractive
attributes is that they describe quantum gravity which is perturbatively finite and unitary. The
dualities are powerful symmetry properties which provide important information about intimate
connections between string theories. We have seen that one string theory, in a given spacetime
dimension, is related to another string theory either through T-duality or through S-duality. When
two theories are S-dual to each other, we can study the strong coupling regime of one theory by going
over to the weak, perturbative domain of its dual theory. Therefore, the nonperturbative aspects of
some of the string theories can be investigated with these powerful tools. However, we still have five
string theories. Therefore, the natural goal is to search for a theory which will provide a unified
description of all five string theories. The zero slope limits of the string theories yield all the
known 10-dimensional supergravity theories. However, there is also the $D = 11$ supergravity theory,
consisting of a graviton and a 3-form potential, with a total of 128 bosonic degrees of freedom,
and 128 fermionic degrees of freedom. It was shown several years ago [77] that compactification
of the 11-dimensional theory on a circle gives rise to $N = 2$ supergravity theory in 10 dimensions.
It was not possible to establish any relation between the 11-dimensional theory and any string
theory for a long time. The connection of $N = 2$, 10-dimensional supergravity with string theory
is rather transparent since the supergravity actions can be obtained in the zero slope limit of the
corresponding type II string theories. There was no string theory that could be related in some
such limit to 11-dimensional supergravity. Therefore, if 11-dimensional supergravity were to have
any connection with one of the string theories, then only the nonperturbative regime of that theory
would show the inter-relation. Moreover, when viewed from the 11-dimensional perspective, the
supergravity theory does not have any small parameter, like the coupling constant in string theory,
which can be chosen to take a small value as an expansion parameter.
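
The matching of 128 bosonic and 128 fermionic degrees of freedom can be reproduced by counting
on-shell states as representations of the little group SO(D-2). The counting formulas below are the
standard ones and are an addition made here, not taken from the text; the function names are
illustrative, and the value 16 for the on-shell spinor states in eleven dimensions (a 32-component
Majorana spinor halved on shell) is an input assumed for the sketch.

```python
from math import comb


def graviton_dof(dim):
    """On-shell states of a graviton: symmetric traceless tensor of SO(dim - 2)."""
    n = dim - 2
    return n * (n + 1) // 2 - 1


def form_dof(rank, dim):
    """On-shell states of an antisymmetric potential of given rank: a rank-form of SO(dim - 2)."""
    return comb(dim - 2, rank)


def gravitino_dof(dim, spinor_states):
    """On-shell states of a gravitino: gamma-traceless vector-spinor of SO(dim - 2)."""
    return (dim - 3) * spinor_states


# Eleven dimensions: graviton plus 3-form potential versus the gravitino.
bosons_11 = graviton_dof(11) + form_dof(3, 11)          # 44 + 84
fermions_11 = gravitino_dof(11, spinor_states=16)       # 8 * 16

# Cross-check in ten dimensions with the type IIA bosonic content:
# graviton, 2-form, dilaton (NS sector) and 1-form, 3-form (RR sector).
bosons_iia = (graviton_dof(10) + form_dof(2, 10) + 1
              + form_dof(1, 10) + form_dof(3, 10))      # 35 + 28 + 1 + 8 + 56
```

Both counts give 128, consistent with the circle compactification of the eleven dimensional theory
preserving the number of bosonic states.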

The connection between type IIA string theory and 11-dimensional supergravity was recognised
by Witten [73] and Townsend [78] following the developments in string dualities. The massless
bosonic sector of the type IIA theory, we might recall from our discussions of Section II, consists
of the dilaton $\phi$, graviton $G_{\mu\nu}$ and antisymmetric tensor $B_{\mu\nu}$, and the gauge
field $A_\mu$ and antisymmetric tensor $C_{\mu\nu\rho}$, coming from the NS and Ramond sectors
respectively. The effective action of type IIA theory is

$S_{IIA} = \frac{1}{2\kappa_{10}^2}\int d^{10}x\,\sqrt{-G}\left[e^{-2\phi}\left(R + 4(\partial\phi)^2 - \frac{1}{12}H^2\right) - \left(\frac{1}{4}F^2 + \frac{1}{48}F_4'^2\right)\right] - \frac{1}{4\kappa_{10}^2}\int F_4\wedge F_4\wedge B$ (4.1)


We have suppressed the Lorentz indices of the field strengths and we shall define them now: $R$ is
the scalar curvature, $H_{\mu\nu\rho}$ is the field strength of $B_{\mu\nu}$ from the NS sector,
$F_{\mu\nu}$ is the field strength of the RR gauge potential $A_\mu$ and, in form notation, the 4-form
field strength is $F_4' = dC_3 + A\wedge dB$, $C_3$ being the 3-form potential coming from the RR
sector and $B$ the 2-form potential whose field strength is $H$. The last term in (4.1) is the
Chern-Simons term, where $F_4 = dC_3$ is the antisymmetric 4-form field strength of the potential
$C_3$. A few remarks are in order at this point: the metric used in the action (4.1) is the string
frame metric. Note that the factor $e^{-2\phi}$ multiplies only the $R$, $(\partial\phi)^2$ and $H^2$
pieces, i.e. the fields coming from the NS sector. The reason is that, in the worldsheet
supersymmetric formulation of NSR type II theories, the R-R sector fields do not couple through local
worldsheet interactions (in the NS sector the worldsheet fields couple to potentials); they couple
via bilinears of spin fields (in fact through their field strengths). As a consequence, there are
cuts and the usual argument that the tree level term starts with $e^{-2\phi}$ does not go through.
Thus we see this mismatch of dilaton factors between NS and RR fields in the effective action.





Now, it is easy to see that this theory admits a D0-brane and a D2-brane, whose duals are the
D6-brane and D4-brane from the RR sector, and a string and its dual five-brane from the NS sector.
Let us consider the bosonic part of the eleven dimensional supergravity action

$\tilde S = \frac{1}{2\kappa_{11}^2}\int d^{11}x\,\sqrt{-\tilde g}\left[\tilde R - \frac{1}{48}\tilde F_4^2\right] - \frac{1}{12\kappa_{11}^2}\int\tilde F_4\wedge\tilde F_4\wedge\tilde C_3$ (4.2)

Here the fields with a tilde belong to the bosonic components of 11-dimensional supergravity. Let us
compactify one of the spatial dimensions on $S^1$, following the procedure outlined in the last
section. There will be a gauge field and a scalar field when the metric is decomposed in terms
of the metric of the 10-dimensional theory. The 3-form potential will decompose into a 3-form
potential, but with an additional piece according to the procedure of [48, 49], and a two-form
potential will appear as well. It is most convenient to express the 11-dimensional metric in the
following form

$\tilde g_{MN} = e^{-\frac{2}{3}\phi}\begin{pmatrix} G_{\mu\nu} + e^{2\phi}A_\mu A_\nu & e^{2\phi}A_\mu \\ e^{2\phi}A_\nu & e^{2\phi}\end{pmatrix}$ (4.3)

The dimensional reduction of (4.2) goes over exactly to the type IIA action (4.1). Note that if
we had not adopted this form of the decomposition of the 11-dimensional metric, with the overall
factor of $e^{-\frac{2}{3}\phi}$ and the factors of $e^{2\phi}$ in various places inside the matrix,
but had compactified on a circle of radius, say, $\mathcal{R}$, we would have obtained a reduced
action with the 10-dimensional metric, the modulus $\mathcal{R}$ and the antisymmetric tensor
potentials (2-form and 3-form) with appropriately modified C-S terms. The resulting action in ten
dimensions would need some field redefinitions to match the type IIA action. Let us see how the
radius of compactification $R_{11}$ is related to the type IIA string coupling constant $g_{st}^{A}$.
Note from (4.3) that $R_{11} = (e^{\frac{4}{3}\phi})^{\frac{1}{2}}$ and by definition
$g_{st}^{A} = e^{\phi}$. Therefore, we conclude that

$R_{11} = (g_{st}^{A})^{\frac{2}{3}}$ (4.4)

Therefore, in the perturbative regime of the type IIA theory, the radius of compactification of the
11-dimensional theory is very small. When we want to go over to the decompactification regime,
i.e. the large radius limit of the 11-dimensional theory, we cannot reach that domain perturbatively,
since it is the strong coupling phase of the type IIA theory, and perturbation theory does not
provide any clue for the existence of the 11th dimension in the ten dimensional theory. The
correspondence established between type IIA theory and the 11-dimensional theory is at the level of
the effective action. The 11-dimensional supergravity has a 3-form potential in the bosonic sector
and the natural extended object is a membrane. The 10-dimensional theory admits a string as a
fundamental object and the supergravity action is the zero slope limit of the string theory. How can
one establish the relation between the membrane and the string? The idea of double dimensional
reduction provides an important clue. One can envisage a situation where we start from a membrane in
eleven dimensions and compactify the 11th dimension on a circle. Then, according to the prescription
of double dimensional reduction [79], the membrane wraps around the compact direction so that the end
result is the ten dimensional string.
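
The relation between the compactification radius and the string coupling can be made concrete with a
short numerical sketch. It assumes, as in the discussion above, that the 11-11 component of the
metric is $e^{\frac{4}{3}\phi}$ (the overall factor $e^{-\frac{2}{3}\phi}$ times $e^{2\phi}$) and that
the coupling is $e^{\phi}$; the radius is then measured in units fixed by the eleven dimensional
metric. The function names are hypothetical.

```python
import math


def metric_component_11_11(phi):
    """11-11 component of the eleven dimensional metric: e^{-2 phi/3} * e^{2 phi}."""
    return math.exp(-2.0 * phi / 3.0) * math.exp(2.0 * phi)


def radius_from_coupling(g):
    """Radius of the eleventh dimension for string coupling g = e^phi."""
    phi = math.log(g)
    return math.sqrt(metric_component_11_11(phi))


weak = radius_from_coupling(0.001)    # perturbative IIA regime: tiny circle
strong = radius_from_coupling(8.0)    # strong coupling: the circle opens up
```

For g = 0.001 the radius is 0.01, while g = 8 gives a radius of 4, which illustrates why the
eleventh dimension is invisible in string perturbation theory.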

We have described in the previous section how one can establish connections among the five string
theories in various dimensions through duality transformations, although there are five distinct ten
dimensional theories when viewed in the perturbative framework. The 11-dimensional theory is also
recognised to play an important role in string dynamics. It is believed that there is an underlying
fundamental theory, yet to be discovered, such that the manifestations of the theory in its various
phases are realized through the string theories. It is postulated that in the low energy limit, we
should derive the 11-dimensional supergravity action as an effective theory. The unknown fundamental
theory is named U-theory. Since the 11-dimensional theory naturally admits the membrane as a
fundamental extended solution, it has been argued that the underlying fundamental theory is a theory
of membranes. The M-theory is taken to be the underlying theory. We shall illustrate, with a few
examples, how, starting with an eleven dimensional theory with a membrane, one can obtain a host of
relations about the structure of branes in





various string theories. 

Since the BPS states do not get any quantum corrections, it is interesting to look for BPS states
and then propose tests for the theory. When we compactify M-theory on a circle, the momenta
in that direction are quantized and we get towers of massive KK states. These states
fall into representations of the 11-dimensional supergravity. In fact they are BPS states. In
the KK reduction, the charge of a state (in the lower dimensions) is related to the momentum
along the compact direction (thus automatically quantized), and in some suitable units the charge
is proportional to m, m being an integer. This is the charge associated with the gauge field that
arises as a result of compactification (4.3). From the type IIA point of view this charge is that of
the gauge field coming from the RR sector, and the whole tower should exist as BPS states. We know
already that elementary string states are RR-charge neutral, whereas these massive towers belong to
the RR sector. We can identify the state with unit charge, m = 1, as a D0-brane of type IIA theory.
Open string ends can get attached to the D0-brane and act as the collective coordinates to give
excitations. One can show that IIA theory has such BPS states belonging to the ultra-short
multiplets, and these agree with the counting of states done from the M-theory side. Therefore, the
duality between type IIA theory and M-theory is established for such states. In the case of m > 1,
the test is not so simple. One of the properties of BPS states is that the binding energy of
composite BPS states is zero. That means that if we have a single BPS state with m units of charge,
we cannot distinguish it from a collection of m BPS particles each carrying unit charge. Thus a test
for the general case is rather difficult.

The relation between M-theory and type II theories can be established by exploiting the duality
relations. Note that type IIA and type IIB theories are T-dual to each other when one of the
directions is compactified. Since M-theory with one compact direction, $S^1$, is related to type IIA,
M-theory with two compact dimensions, compactified on $T^2$, is expected to be intimately
connected [80] to type IIB with one direction compactified on $S^1$. We shall see that one needs to
exploit the SL(2, Z) S-duality symmetry of type IIB theory in this context [81]. The type IIB
theory has the graviton, a 2-form antisymmetric potential $B^{(1)}_{\mu\nu}$ and the dilaton $\phi$ in the
NS sector, and a 2-form potential $B^{(2)}_{\mu\nu}$, the axion $\chi$ and a 4-form potential
$D_{\mu\nu\rho\lambda}$ in the RR sector; the field strength of the D-field is self-dual. For our
purpose, it suffices to drop the D-field from consideration at present.
The action is


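$$S_{str} = \frac{1}{2\kappa^2}\int d^{10}x \sqrt{-g}\,\Big[ e^{-2\phi}\Big(R + 4(\partial\phi)^2 - \frac{1}{12}\big(H^{(1)}\big)^2\Big) - \frac{1}{2}(\partial\chi)^2 - \frac{1}{12}\big(H^{(2)} + \chi H^{(1)}\big)^2 \Big] \qquad (4.5)$$

Here $H^{(1)}$ and $H^{(2)}$ are the field strengths of the NS and RR 2-form potentials, respectively.
This action is written in the string frame metric. It is useful to go over to the Einstein frame
by a conformal transformation. Furthermore, to write the Einstein frame action in a manifestly
SL(2, Z) invariant form, let us define

$$M = e^{\phi}\begin{pmatrix} \chi^2 + e^{-2\phi} & \chi \\ \chi & 1 \end{pmatrix}, \qquad H = \begin{pmatrix} H^{(1)} \\ H^{(2)} \end{pmatrix} \qquad (4.6)$$

Then the action,

$$S_E = \frac{1}{2\kappa^2}\int d^{10}x \sqrt{-g_E}\,\Big[ R_E + \frac{1}{4}\mathrm{Tr}\big(\partial_\mu M\,\partial^\mu M^{-1}\big) - \frac{1}{12} H^T M H \Big] \qquad (4.7)$$

This action is invariant under the transformations

$$M \to \Lambda M \Lambda^T, \qquad H \to (\Lambda^T)^{-1} H \quad \text{and} \quad g_{E\,\mu\nu} \to g_{E\,\mu\nu} \qquad (4.8)$$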
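As a numerical sanity check of the invariance statement, the following sketch assumes the parametrization $M = e^{\phi}\begin{pmatrix} \chi^2 + e^{-2\phi} & \chi \\ \chi & 1 \end{pmatrix}$ with $\tau = \chi + i e^{-\phi}$ (the form used in [81]). It verifies that $M \to \Lambda M \Lambda^T$ with an integer matrix $\Lambda$ of unit determinant acts on $\tau$ as the fractional linear map $\tau \to (a\tau + b)/(c\tau + d)$ and keeps $\det M = 1$. The function names and numerical values are illustrative only and do not come from the text.

```python
import math

def moduli_matrix(chi, phi):
    """M = e^phi * [[chi^2 + e^(-2 phi), chi], [chi, 1]] (assumed parametrization)."""
    e = math.exp(phi)
    return [[e * (chi * chi + math.exp(-2 * phi)), e * chi],
            [e * chi, e]]

def transform(M, L):
    """Return L M L^T for 2x2 matrices stored as nested lists."""
    LM = [[sum(L[i][k] * M[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(LM[i][k] * L[j][k] for k in range(2)) for j in range(2)] for i in range(2)]

def tau_from_matrix(M):
    """Read tau = chi + i e^(-phi) off M, using M[1][1] = e^phi and M[0][1] = chi e^phi."""
    return complex(M[0][1] / M[1][1], 1.0 / M[1][1])

chi, phi = 0.3, -0.7                 # arbitrary sample values of axion and dilaton
tau = complex(chi, math.exp(-phi))
Lam = [[2, 1], [3, 2]]               # integer entries, determinant 2*2 - 1*3 = 1

M_new = transform(moduli_matrix(chi, phi), Lam)
tau_new = tau_from_matrix(M_new)
(a, b), (c, d) = Lam
tau_mobius = (a * tau + b) / (c * tau + d)
det_new = M_new[0][0] * M_new[1][1] - M_new[0][1] * M_new[1][0]
print(tau_new, tau_mobius, det_new)
```

The two values of $\tau$ printed should coincide up to rounding, which is the statement that the matrix action on M and the modular action on $\tau$ are the same transformation.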


If one looks for a string solution in this theory, then the solutions will be of three kinds: strings
carrying NS charge, strings with RR charge and ones with both NS and RR charge. The procedure
adopted in [81] is as follows: first look for a string solution with NS charge such that the asymptotic
value of the axion is $\chi_0 = 0$ and that of the dilaton is $\phi_0 = 0$. In the language of the complex
modulus introduced earlier, the asymptotic value is $\tau_0 = i$. Moreover, one starts with the RR 2-form
set to zero, since one is looking for a string carrying NS charge only. Next introduce a specific
SL(2, Z) transformation such that the
resulting string carries both types of charges; the relevant matrix is



Although the string carries both types of charge, NS and RR, the modulus still preserves the
asymptotic value, $\tau_0 = i$. Finally, introduce a general SL(2, Z) transformation so that $\tau_0$ takes an
arbitrary value as a result of the duality transformation. The matrix is

$$\Lambda = \begin{pmatrix} e^{-\phi_0/2} & \chi_0\, e^{\phi_0/2} \\ 0 & e^{\phi_0/2} \end{pmatrix}$$

As a consequence of the SL(2, Z) transformation, not only do we have strings which carry charges
$(q_1, q_2)$, but also the tensions of the strings depend on these charges; after all, these are BPS
strings. The formula for the tension of a string with $(q_1, q_2)$ charges is

$$T_q = \left[ e^{\phi_0}(q_1 - q_2\chi_0)^2 + e^{-\phi_0} q_2^2 \right]^{1/2} T \qquad (4.10)$$

where T is the tension of the NS string one started with, for which $\phi_0 = \chi_0 = 0$, i.e. $\tau_0 = i$. Since
we consider SL(2, Z) transformations, $q_1$ and $q_2$ should be integers. For stable strings, $(q_1, q_2)$
should be relatively prime; otherwise the string will decay into multiple strings. If we compactify
this theory on $S^1$, more interesting results follow. The spectrum of the nine-dimensional theory is
governed by the mass formula

$$M_B^2 = \left(\frac{m}{R_B}\right)^2 + (2\pi R_B T_q n)^2 + 4\pi T_q (N_L + N_R) \qquad (4.11)$$

Here we have explicitly kept the tension term, showing that when the string winds n times it
stretches by its perimeter, and the energy is obtained by multiplying by the tension $T_q$. The last term
is the sum of contributions from left and right oscillators. The level matching condition tells us
$N_L - N_R = mn$. The BPS saturating multiplets have either $N_L = 0$ or $N_R = 0$; ultrashort
corresponds to both being zero. If we choose $N_R = 0$, then the mass formula is (the level matching
relation is also used)

$$M_B^2 = \left( 2\pi n R_B T_q + \frac{m}{R_B} \right)^2 \qquad (4.12)$$
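The step from the general mass formula to the BPS one can be checked numerically. The sketch below assumes the forms $T_q = T[e^{\phi_0}(q_1 - q_2\chi_0)^2 + e^{-\phi_0}q_2^2]^{1/2}$ and $M_B^2 = (m/R_B)^2 + (2\pi R_B T_q n)^2 + 4\pi T_q(N_L + N_R)$. It confirms that with $N_R = 0$ and $N_L = mn$ the mass squared becomes a perfect square, and it uses the greatest common divisor as the stability test for a $(q_1, q_2)$ string. The helper names and sample numbers are hypothetical.

```python
import math

def tension(q1, q2, T=1.0, chi0=0.0, phi0=0.0):
    """Tension of a (q1, q2) string; assumed form of the tension formula."""
    return T * math.sqrt(math.exp(phi0) * (q1 - q2 * chi0) ** 2
                         + math.exp(-phi0) * q2 ** 2)

def is_stable(q1, q2):
    """A (q1, q2) string is stable when the two charges are relatively prime."""
    return math.gcd(q1, q2) == 1

def mass_squared(m, n, NL, NR, R, Tq):
    """Nine-dimensional mass formula: momentum, winding and oscillator terms."""
    return (m / R) ** 2 + (2 * math.pi * R * n * Tq) ** 2 + 4 * math.pi * Tq * (NL + NR)

Tq = tension(2, 3, chi0=0.25, phi0=-0.5)
m, n, R = 3, 2, 1.7
# BPS choice: N_R = 0, and level matching then fixes N_L = m * n.
lhs = mass_squared(m, n, NL=m * n, NR=0, R=R, Tq=Tq)
rhs = (2 * math.pi * R * n * Tq + m / R) ** 2
print(is_stable(2, 3), is_stable(2, 4), lhs, rhs)
```

For charges with a common factor, such as (2, 4), the tension equals twice that of the (1, 2) string, so nothing prevents the decay into two such strings; for relatively prime charges the tension is strictly below the sum of the tensions of any would-be decay products.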

We have a rich spectrum, and these masses should remain protected from any quantum corrections.
We can describe the same phenomena by compactifying M-theory on $T^2$. There is a membrane
in M-theory with tension $T_{11}$, and if it wraps m times on a torus of area $A_{11}$, then the contribution
to the mass will be of the form $m A_{11} T_{11}$. The area is $A_{11} = (2\pi R_{11})^2\rho_2$, $\rho_2$ being the
imaginary part of the modular parameter of the torus, $\rho = \rho_1 + i\rho_2$; the area is computed using
the 11-dimensional metric. Since we are considering compactification of M-theory on $T^2$, the wave
function of the two-dimensional Laplacian (corresponding to the two coordinates on the torus) must
satisfy the periodicity property appropriate to the torus, and the mass formula should be suitably
generalised with respect to the mass formula for a string compactified on a circle.

$$M_{11}^2 = \left[ m (2\pi R_{11})^2 \rho_2 T_{11} \right]^2 + \frac{1}{R_{11}^2}\left[ l_2^2 + \frac{1}{\rho_2^2}(l_1 - l_2\rho_1)^2 \right] \qquad (4.13)$$

Here $l_1$ and $l_2$ are integers which enter the mass formula as the contribution of the KK part, since
the second term is the eigenvalue of the two-dimensional Laplacian $-\partial_x^2 - \partial_y^2$ acting on the
wave function

$$\psi_{l_1 l_2}(x, y) = \exp\left\{ \frac{i}{R_{11}}\left[ x l_2 + \frac{1}{\rho_2}\, y\, (l_1 - l_2\rho_1) \right] \right\} \qquad (4.14)$$

It is easy to see the periodicity property of the wave function by defining $z = (x + iy)/2\pi R_{11}$, since
the invariance now translates to $z \to z + 1$ and $z \to z + \rho$. In order to compare the above mass
formula, obtained from M-theory with $T^2$ compactification, with the corresponding one (4.12) from
type IIB in 9 dimensions, we should recognize that (4.13) is derived using the 11-dimensional metric.
Therefore they could differ from each other by a multiplicative constant: $M_{11} = C M_B$. Now the
exact matching of the mass formulas implies that the modular parameter of $T^2$, denoted as
$\rho$, should be identical to the modular parameter $\tau_0$ of SL(2, Z). Thus the modular group appearing
in the compactification of M-theory, SL(2, Z), is identical to the duality group, SL(2, Z), of type IIB
theory. The following relations should be satisfied for matching of (4.12) and (4.13):

$$R_B^{-2} = T\, T_{11}\, A_{11}^{3/2} \quad \text{and} \quad C^2 = \frac{2\pi R_{11}\, e^{-\phi_0/2}\, T_{11}}{T} \qquad (4.15)$$
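The matching can be verified term by term with a few lines of arithmetic. The sketch below assumes the identifications $\rho_1 = \chi_0$, $\rho_2 = e^{-\phi_0}$ and $(l_1, l_2) = n(q_1, q_2)$, together with the two relations $R_B^{-2} = T\,T_{11}A_{11}^{3/2}$ and $C^2 = 2\pi R_{11}e^{-\phi_0/2}T_{11}/T$. It then checks that the KK term on the torus equals $C^2$ times the type IIB winding term, and that the wrapped-membrane term equals $C^2$ times the type IIB momentum term. All numerical values and function names are illustrative and are not taken from the text.

```python
import math

def iib_terms(m, n, q1, q2, R_B, T, chi0, phi0):
    """Winding and momentum contributions to M_B^2 for a (q1, q2) string wound n times."""
    Tq = T * math.sqrt(math.exp(phi0) * (q1 - q2 * chi0) ** 2
                       + math.exp(-phi0) * q2 ** 2)
    return (2 * math.pi * R_B * n * Tq) ** 2, (m / R_B) ** 2

def m_theory_terms(m, l1, l2, R11, T11, rho1, rho2):
    """KK and wrapped-membrane contributions to M_11^2 on a torus with modulus rho."""
    area = (2 * math.pi * R11) ** 2 * rho2
    kk = (l2 ** 2 + (l1 - l2 * rho1) ** 2 / rho2 ** 2) / R11 ** 2
    return kk, (m * area * T11) ** 2

# Arbitrary sample values, chosen only to exercise the formulas.
T, T11, R11, chi0, phi0 = 1.3, 0.7, 0.9, 0.2, -0.4
rho1, rho2 = chi0, math.exp(-phi0)                 # identify rho with tau_0
A11 = (2 * math.pi * R11) ** 2 * rho2
R_B = (T * T11 * A11 ** 1.5) ** -0.5               # first matching relation
C2 = 2 * math.pi * R11 * math.exp(-phi0 / 2) * T11 / T   # second matching relation

m, n, q1, q2 = 2, 3, 1, 2
wind_B, mom_B = iib_terms(m, n, q1, q2, R_B, T, chi0, phi0)
kk_M, wrap_M = m_theory_terms(m, n * q1, n * q2, R11, T11, rho1, rho2)
print(kk_M / wind_B, wrap_M / mom_B, C2)
```

All three printed numbers should agree, which shows that winding modes of the type IIB string map to KK modes on the torus while KK modes on the circle map to wrapped membranes.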


Since type IIA theory, in 9 dimensions, is related to IIB theory by T-duality, we can also get some
insight into D0-branes in IIA theory. The mass spectrum of these (point) particles can be viewed from
two perspectives. One way is to identify the winding modes of the family of type IIB strings on the
circle with the KK modes of the torus; the other is to identify the KK modes of the
circle with wrapping of the membrane on the torus. Again the mass formula matching relations
can be used to relate parameters on both sides.

There is also a five-brane, the solitonic counterpart of the fundamental membrane, in M-theory,
and these are the only two extended objects in the 11-dimensional theory. Therefore, one expects
that M-theory should be able to give a description of NS branes and Dp-branes in the
lower-dimensional string theories. Moreover, the membrane tension of the 11-dimensional theory,
$T_2^{(M)}$, is the only parameter, since the 5-brane tension, $T_5^{(M)}$, is determined from the Dirac
quantization relation in terms of $T_2^{(M)}$. In order to study the branes in the 9-dimensional theory,
we should see how different branes arise from M-theory and from type IIB theory. The simplest is the
2-brane. In the case of M-theory, the membrane remains a membrane; but type IIB in 10 dimensions has
no membrane (due to the absence of an even-rank RR field strength), therefore the D3-brane of the
ten-dimensional IIB theory wraps around $S^1$ to produce a membrane. But the type IIB string tension
is also related to $T_2^{(M)}$, since the membrane wraps around the torus to produce the string. So the
simplest result is that the D3-brane tension and the string tension of type IIB are related; this
result involves only IIB theory tensions, derived through the M-theory route. The 9-dimensional IIB
theory will have D3-branes too. They will arise, from the M-theory viewpoint, as wrappings of the
5-brane around $T^2$. There are 4-branes in the 9-dimensional IIB theory. Since the 10-dimensional IIB
theory has an SL(2, Z) family of strings, their solitonic partners, the 5-branes, will come in the same
multiplets too. These 5-branes, compactified on $S^1$, will give the 4-branes in 9 dimensions.
Similarly, one can discuss type IIA theory from both perspectives in ten dimensions. If we define
$L = 2\pi R_{11}$ as the perimeter of the compact circle, then the tension of the IIA string gets related
to $T_2^{(M)}$, since the IIA string, in 10 dimensions, arises from wrapping of the membrane around the
circle. The relation is $T_1^{(A)} = L\,T_2^{(M)}$. For the type IIA membrane we have
$T_2^{(A)} = T_2^{(M)}$. The 4-brane in IIA theory comes from wrapping of the M-theory 5-brane around the
circle, and the relation becomes $T_4^{(A)} = L\,T_5^{(M)}$; we must also have $T_5^{(A)} = T_5^{(M)}$.
Combining the relations among these tensions, one can obtain further relations among the type IIA
tensions.

The purpose of the above examples was to illustrate how one can derive a large number of relations
using M-theory. In lower spacetime dimensions, the theory provides a rich basis to understand
branes coming from various theories. We would like to record one important fact for our future
considerations. Note that the formula (4.10) for a general $(q_1, q_2)$ string is for a string carrying NS
and RR charges. For a string with only NS charge, the (1, 0) string, the tension scales as
$T \sim g^{1/2}$, and for the one carrying only unit RR charge it is $T \sim g^{-1/2}$, where $g = e^{\phi_0}$ is
the string coupling. In the string frame, after rescaling the
metric, we find that a string with one unit of NS charge has tension of order 1 and the string with
one unit of RR charge has tension $g^{-1}$. Therefore, the mass density also has the same dependence on
the coupling constant.

There are duality relations which relate compactified M-theory to other string theories. One of the
interesting cases is the $E_8 \times E_8$ heterotic string in ten dimensions [84]. There is no other string
theory which can be related to this one. So it is expected that $E_8 \times E_8$ is connected to the
11-dimensional theory. But it cannot be $S^1$-compactified M-theory, because that compactified theory
is type IIA, as we have seen. Moreover, the 11-dimensional theory as such is free from anomalies.
However, if one considers compactification of M-theory on $S^1/Z_2$, it gets related to the
$E_8 \times E_8$ ten-dimensional theory. The orientation of $S^1$ is reversed under $Z_2$, and it flips the
sign of the 3-form potential C. As a consequence of this projection, we are left with the metric in
ten dimensions, the dilaton and the 2-form potential. The gauge boson and the 3-form potential C are
projected out. The surviving fermions are a Majorana-Weyl gravitino and a Majorana-Weyl fermion.
This is the supergravity in the bulk. Actually, $S^1/Z_2$ is a line segment with fixed points at the
boundary. These are two copies of 10-dimensional flat space. The states from the twisted sector
should be localised on these planes. It was shown that half of the anomalous variation is localized
on one plane and the other half on the other plane. The possible gauge groups that can cancel the
anomaly are $E_8 \times E_8$ or $SO(32)$. It is
obvious that the string theory to be identified is the $E_8 \times E_8$ heterotic string. There are other duality
conjectures [85, 86, 87] between M-theory and other string theories in lower dimensions: M-theory
on $K3$ $\leftrightarrow$ heterotic/Type I on $T^3$. Compactification of M-theory on $T^5/Z_2$ is dual to type IIB
on $K3$. The lower-dimensional compactifications on $T^8/Z_2$ and $T^9/Z_2$ are related to Type I/heterotic
on $T^7$ and type IIB on $T^8/Z_2$ respectively.

There are attempts to construct supersymmetric gauge theories by choosing suitable combinations
of intersecting branes, and to establish the Seiberg-Witten dualities of SUSY Yang-Mills gauge
theories from this M-theory point of view. Undoubtedly, the dualities together with the proposal
for M-theory have brought us nearer to the goal of a unified description of string theories. However,
the underlying fundamental theory is yet to be discovered, although we have seen many facets of
that theory.


5 Black holes and String Theory 


The physics of black holes has many fascinating aspects. The classical black hole is the final
stage of a collapsing heavy star. As the name suggests, matter falls into it and nothing comes
out; there is an event horizon. However, deeper investigations revealed, almost a quarter of a
century ago, that there are strong similarities between thermodynamics and black hole mechanics
[88, 89]. If M is the mass of the black hole,

$$dM = \frac{\kappa}{8\pi G}\, dA, \qquad \delta A \geq 0 \qquad (5.1)$$

Here G is Newton's constant, A is the area of the event horizon and $\kappa$ is the surface gravity.
This is to be compared with the thermodynamical relation,

$$dE = T\, dS, \qquad \delta S \geq 0 \qquad (5.2)$$

Hawking's startling discovery [90] that black holes radiate with a black body spectrum of
temperature $T = \hbar\kappa/2\pi$, when quantum effects are accounted for, raised several important
issues in black hole physics. One can also associate an entropy with a black hole,

$$S_{BH} = \frac{A}{4G\hbar} \qquad (5.3)$$
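The consistency of the three relations above can be illustrated on the simplest example. The sketch below uses the four-dimensional Schwarzschild black hole, for which the horizon radius is $2GM$ and the surface gravity is $1/(4GM)$ in units with $c = k_B = 1$; these standard expressions are an assumption brought in for the illustration and are not quoted in the text. It checks by a finite difference that $\kappa\, dA / 8\pi G$ and $T\, dS$ both reproduce $dM$ when $T = \hbar\kappa/2\pi$ and $S = A/4G\hbar$.

```python
import math

G, hbar = 1.0, 1.0   # units with c = k_B = 1; the numerical values are arbitrary

def area(M):
    """Horizon area of a 4D Schwarzschild black hole with horizon radius 2GM."""
    return 4 * math.pi * (2 * G * M) ** 2

def surface_gravity(M):
    """Surface gravity kappa = 1 / (4 G M) of the Schwarzschild horizon."""
    return 1.0 / (4 * G * M)

def temperature(M):
    """Hawking temperature T = hbar * kappa / (2 pi)."""
    return hbar * surface_gravity(M) / (2 * math.pi)

def entropy(M):
    """Black hole entropy S = A / (4 G hbar)."""
    return area(M) / (4 * G * hbar)

M, dM = 3.0, 1e-6
# Central differences over the interval [M - dM, M + dM], whose width is 2 dM.
dA = area(M + dM) - area(M - dM)
dS = entropy(M + dM) - entropy(M - dM)
first_law_area = surface_gravity(M) * dA / (8 * math.pi * G)
first_law_entropy = temperature(M) * dS
print(first_law_area, first_law_entropy, 2 * dM)
```

Both combinations equal the mass change across the interval, so the identification of $\kappa/2\pi$ with temperature fixes the factor 1/4 in the entropy.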


The thermodynamical relations used to describe macroscopic phenomena can be derived from
statistical mechanics starting with microscopic fundamental laws of physics. Since $\hbar$ appears in
the black hole entropy formula, it is expected that the microscopic derivation of black hole entropy
requires quantum gravity calculations. Moreover, the entropy of a system, when interpreted from
the statistical mechanical point of view, counts the total number of degrees of freedom in the system.
How do we count the number of degrees of freedom in a black hole and obtain the expression for
entropy? There are more fundamental issues related to quantum mechanics when we carefully
examine the implications of Hawking radiation. We can think of allowing some matter to go into
the black hole, preparing the initial state, the incident wave, as a pure quantum state. However,
the emitted Hawking radiation has a black body distribution and is thus in a mixed state.
Therefore, the S-matrix that describes the above process will lose its unitarity property.





In the perturbative regime, string theory can provide reliable results for computations of processes
involving gravitons. The resulting S-matrix elements respect the required unitarity and analyticity
properties. Thus, it is pertinent to ask what string theory has to offer in resolving the issues
alluded to earlier. Recently, one of the important achievements of string theory has been the
microscopic derivation of the black hole entropy for a special class of black holes that arise in
string theory. We shall, initially, not set G = 1, to bring out a few salient points in discussions
of stringy black holes, and sometimes we shall display the presence of $\hbar$ in formulas. Recall that
Newton's constant is related to the string coupling and tension as $G \sim g_{str}^2/T$ in four spacetime
dimensions. If we have a massive string state, the gravitational field is $G M_s$, where $M_s$ is the mass
of a string state measured in units of T; sometimes we shall also denote it as M. Thus, the field
increases as the string coupling increases. String states are given by the mass formula $M_s^2 = NT$,
and it is well known that at a given mass there are a lot of states and the degeneracy [92] grows
exponentially with mass, i.e. as $e^{M}$. Thus one might think that the excited states, if treated as
black holes, will reproduce the entropy formula; however, this simple argument is not adequate, since
the black hole entropy grows like $M^2$, whereas the naive argument will give $S_{BH} \sim M$. There have
been attempts to explain this discrepancy by saying that the mass that would appear in the microscopic
derivation of $S_{BH}$ is not the same as the one appearing in the Bekenstein-Hawking formula, and there
might be renormalization effects to be accounted for [91]. The perturbative string states appear at
infinitely many levels, and thus, for high enough mass, the massive elementary string state will lie
inside the Schwarzschild radius associated with it. Consequently, such states will require black hole
descriptions. One of the ways to derive the black hole entropy microscopically is to consider BPS
states, so that when the string coupling gets strong, the state is unchanged. In this approach [93],
the first step is to pick an appropriate BPS state and compute the microscopic entropy. Next, compute
the Bekenstein-Hawking entropy of the BPS state, which is also an extremal black hole, and verify
whether the two ways of calculating the entropy are in agreement. This was the first clue that string
theory might explain black hole entropy in a microscopic way. However, the black holes constructed
from the elementary string states had some shortcomings in the computation of the entropy. The area
of the event horizon, for such black holes, tends to zero as one approaches the extremal limit;
moreover, the dilaton also diverges at the horizon in this limit. This problem was encountered for
string states in the NS sector.
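The degeneracy argument in the paragraph above can be made concrete with a toy count. The sketch below is not the full string spectrum: it assumes a single tower of bosonic oscillators, for which the number of states at level N is the number of integer partitions p(N). Since the level is proportional to the mass squared, quadrupling N doubles the mass. The logarithm of the degeneracy then roughly doubles, in line with an entropy linear in M and unlike the $M^2$ growth of the black hole entropy. The comparison value $\pi\sqrt{2N/3}$ is the leading Hardy-Ramanujan exponent.

```python
import math

def partitions_up_to(n_max):
    """Return the list p[0..n_max] of partition numbers via dynamic programming."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p

p = partitions_up_to(400)
entropy_100 = math.log(p[100])       # log of the degeneracy at level N = 100
entropy_400 = math.log(p[400])       # level N = 400, i.e. twice the mass
hardy_ramanujan_400 = math.pi * math.sqrt(2 * 400 / 3)
ratio = entropy_400 / entropy_100
print(p[100], ratio, entropy_400 / hardy_ramanujan_400)
```

The ratio of the two entropies comes out much closer to 2 than to 4; the residual difference from 2 is due to power-law prefactors in the degeneracy, which become unimportant at large level.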

The D-branes in the RR sector can come as elementary states, and there are corresponding solitonic
states contained in the full spectrum. We had argued in the context of type IIB SL(2, Z) strings
that in the string frame metric, NS states have tensions of order 1, whereas D-strings have mass
density of the order of $1/g_{str}$. For the solitons of the NS sector the mass goes as $1/g_{str}^2$; but the
solitons of the RR sector still have mass of order $1/g_{str}$. In the weak coupling regime both NS
solitons and RR ones are heavy. We should account for the gravitational fields they produce, which
is GM. In view of the above discussion, (i) NS elementary states produce a very low field and (ii) RR
states also produce a low field in the weak coupling limit; the field tends to 0 as $g_{str} \to 0$. We may
argue that in this regime, flat spacetime is a good description of the geometry. Since we are dealing
with BPS states, as the string coupling increases the mass remains unchanged, but the gravitational
field keeps increasing, and after some critical coupling the spacetime is not flat any more; we must
employ the general theory of relativity. If these states describe black holes, then we should be able
to compute the degrees of freedom associated with them. It is possible to construct black hole
configurations such that neither does the area of the horizon vanish nor does the dilaton diverge at
the horizon when we take the extremal limit. For five-dimensional black holes, we need at least three
charges to have nonzero area for the horizon together with a constant value for the dilaton at the
horizon. The four-dimensional black hole needs four charges in order to satisfy the requirement of
nonzero horizon area and finite value of the dilaton (at the horizon).

The black holes which we shall consider now have some special characteristics. They can be
thought of as composites of many D-branes carrying Ramond charges. We have mentioned before
that the BPS states have the property that the mass of a composite BPS state is the sum of the masses
of the constituents. One starts in the weak string coupling phase with such D-branes and proceeds
towards the strong coupling domain, where gravity becomes strong. In the weak coupling regime, the
degeneracy of the level can be estimated reliably and the microscopic entropy can be computed. In
the strong coupling domain, the D-brane is inside the horizon, and one can treat this like a black
hole and compute the ratio $A/G$, which is independent of the string coupling $g_{str}$ since both the
area and Newton's constant grow like $g_{str}^2$.

Let us discuss how the five-dimensional black hole configuration is constructed with D-branes [94].
We start with type IIB theory in 10 dimensions. We know that it admits the D1-string and the
D5-brane. We want to make the composite object heavy; therefore, we put $Q_5$ D5-branes
and $Q_1$ D1-strings together. Let us compactify this theory on $T^5$ such that the
$Q_5$ D5-branes are wrapped around the $T^5$, and the $Q_1$ D1-strings wrap along one of the directions
of the torus. Then put some momentum along the direction in which the D-string is wrapped;
this momentum will be quantized in units of the inverse radius of $S^1$. The aim is to evaluate the
microscopic entropy by counting the number of degrees of freedom for this system, and it involves some
detailed technical steps [95, 96, 97, 98]; but we shall outline only the essential points. We expect to
have a $U(Q_5)$ supersymmetric Yang-Mills theory on the D5-brane world volume. This will be a gauge
theory in 5 + 1 dimensions which is derived by dimensional reduction of $N = 1$ supersymmetric
Yang-Mills theory from ten dimensions [95]. The D-string is inside this pack of D5-branes ($Q_5$
of them). The D-string can be viewed as an instanton in this six-dimensional spacetime, since
an instanton in the 6-dimensional theory with no time dependence and extension in one direction is a
string. There are $Q_1$ such strings in the D5-brane configuration. Their low energy dynamics is
described by a two-dimensional supersymmetric sigma model on a $4Q_1Q_5$-dimensional hyper-Kähler
manifold. Every boson contributes 1 and every fermion contributes $\frac{1}{2}$ to the central charge,
as we noted in Sec. 11. Thus, the total central charge is

$$c = 6 Q_1 Q_5 \qquad (5.4)$$

Since we are dealing with BPS states, for these states $\bar{L}_0 = 0$ and the momentum given along $S^1$
is related to the difference $L_0 - \bar{L}_0$. If we take the momentum to be large, i.e. n large, then
using Cardy's result (relating the degeneracy to the central charge), one gets

$$d(Q_1, Q_5, n) = \exp\left(2\pi\sqrt{Q_1 Q_5 n}\right) \qquad (5.5)$$

The black hole entropy computed from the microscopic viewpoint is given by

$$S_{microscopic} = 2\pi\sqrt{Q_1 Q_5 n} \qquad (5.6)$$
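The arithmetic behind the last three formulas is short enough to spell out. The sketch below counts $4Q_1Q_5$ bosons and $4Q_1Q_5$ fermions, adds their contributions of 1 and 1/2 to the central charge, and inserts the result into Cardy's asymptotic formula in the form $S = 2\pi\sqrt{cn/6}$; that form of Cardy's formula is the standard one and is assumed here. The function names are illustrative.

```python
import math

def central_charge(Q1, Q5):
    """Central charge of 4 Q1 Q5 bosons (1 each) and 4 Q1 Q5 fermions (1/2 each)."""
    n_bosons = 4 * Q1 * Q5
    n_fermions = 4 * Q1 * Q5
    return n_bosons * 1.0 + n_fermions * 0.5

def cardy_entropy(c, n):
    """Leading large-n entropy at level n from Cardy's formula, S = 2 pi sqrt(c n / 6)."""
    return 2 * math.pi * math.sqrt(c * n / 6.0)

Q1, Q5, n = 2, 3, 50
c = central_charge(Q1, Q5)
S_micro = cardy_entropy(c, n)
print(c, S_micro, 2 * math.pi * math.sqrt(Q1 * Q5 * n))
```

With $c = 6Q_1Q_5$ the factor of 6 cancels, leaving $2\pi\sqrt{Q_1Q_5n}$.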

In order to derive the black hole entropy, $S_{BH}$, from the Bekenstein-Hawking formula, we have to
specify the metric and the charges, and then compute the area of the event horizon in the extremal
limit.

There is a way to visualise the physical processes that lead to the microscopic [99] derivation of the
entropy formula. The D-string is inside the D5-brane, and the low-level excitations are the lowest
lying modes of the open strings attached to it. If we think of the physical degrees of freedom,
these are 8 transverse vectors and their superpartners. Since these have to satisfy the Dirichlet
boundary condition, they are constrained to move along the D-string. We are dealing with a BPS
state; therefore, these move only in one direction (say left). Since the D-string is wrapped around
one circle of $T^5$, which we choose to be $x_1$, its length is the winding number times the circumference
of the circle. But the momenta of the individual open strings moving on this unidirectional path on
the circle are quantized. Moreover, the sum of their momenta is constrained too, by the total momentum
we have put in that direction. Therefore, this is analogous to solving the statistical mechanics of a
one-dimensional system on a circle where the total energy (momenta are the same as energy) is fixed.
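The one-dimensional gas described above can be counted explicitly for small charges. The sketch below assumes $N_B = N_F = 4Q_1Q_5$ species of left-moving bosons and fermions with integer momenta $k = 1, 2, \ldots$, and computes the number of ways to distribute a total momentum n among them as the coefficient of $q^n$ in $\prod_k \left[(1 + q^k)/(1 - q^k)\right]^{4Q_1Q_5}$. The case $Q_1Q_5 = 1$ is used for illustration, and the function name is hypothetical. The logarithm of the count is compared with the leading estimate $2\pi\sqrt{Q_1Q_5n}$.

```python
import math

def degeneracies(n_species, n_max):
    """Coefficients of prod_k ((1 + q^k) / (1 - q^k))^n_species up to order q^n_max."""
    d = [0] * (n_max + 1)
    d[0] = 1
    for k in range(1, n_max + 1):
        for _ in range(n_species):
            # Multiply by 1 / (1 - q^k): a bosonic mode of momentum k, any occupation.
            for j in range(k, n_max + 1):
                d[j] += d[j - k]
            # Multiply by (1 + q^k): a fermionic mode of momentum k, occupation 0 or 1.
            for j in range(n_max, k - 1, -1):
                d[j] += d[j - k]
    return d

n = 200
d = degeneracies(4, n)               # 4 bosons and 4 fermions, i.e. Q1 * Q5 = 1
exact_entropy = math.log(d[n])
leading_estimate = 2 * math.pi * math.sqrt(1 * n)
ratio = exact_entropy / leading_estimate
print(d[1], d[2], exact_entropy, leading_estimate, ratio)
```

At level 1 there are 8 states (one quantum of any of the 4 bosons or 4 fermions) and at level 2 there are 40. The exact entropy stays below the leading estimate because of power-law prefactors in the degeneracy, and the ratio approaches 1 slowly as n grows.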

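The next step is to define the metric for the above configuration of the branes and to obtain the
harmonic functions that are necessary to satisfy the equations of motion for the brane configurations
[100, 101].

$$ds^2 = H_1^{-1/2}H_5^{-1/2}\left[ -K^{-1}dt^2 + K\big(dx_1 - (K^{-1} - 1)\,dt\big)^2 \right] + H_1^{1/2}H_5^{-1/2}\left(dx_2^2 + \cdots + dx_5^2\right) + H_1^{1/2}H_5^{1/2}\left(dx_6^2 + \cdots + dx_9^2\right) \qquad (5.7)$$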





We specify the compact directions as follows: the $Q_5$ D5-branes are wrapped in the $x_1, \ldots, x_5$
directions, the D-string is wrapped in $x_1$, and the momentum is along $x_1$ too. Since we toroidally
compactify to five dimensions, $x_i$, $i = 1, \ldots, 5$, are periodic, and the radius of compactification
is $R_i$ along the $i$th direction, and

$$e^{-2\phi} = H_5 H_1^{-1}, \qquad B_{01} = H_1^{-1} - 1 \qquad (5.8)$$

$$H_{ijk} = \frac{1}{2}\,\epsilon_{ijkl}\,\partial_l H_5, \qquad i, j, k, l = 6, \ldots, 9 \qquad (5.9)$$

$$r^2 = x_6^2 + \cdots + x_9^2 \qquad (5.10)$$

The harmonic functions are equal to

$$H_1 = 1 + \frac{C_1 Q_1}{r^2}, \qquad C_1 = \frac{g_{str}\,\alpha'^3}{V} \qquad (5.11)$$

$$H_5 = 1 + \frac{C_5 Q_5}{r^2}, \qquad C_5 = g_{str}\,\alpha' \qquad (5.12)$$

$$K = 1 + \frac{C_K Q_K}{r^2}, \qquad C_K = \frac{g_{str}^2\,\alpha'^4}{R_1^2 V} \qquad (5.13)$$

where $V = R_2R_3R_4R_5$. We have displayed the $\alpha'$ dependence to show how the dimensionality of the
charges appears, but from now on we set the slope to unity as usual. Let us briefly note how the
charges arise in this black hole. There is the electric charge $Q_1$ coming from $B_{01}$, which is a gauge
field now, after compactification of the $x_1$ coordinate. The charge $Q_5$ is of magnetic type,
originally attributed to the D5-brane in 10 dimensions. After compactification, the Poincaré dual of
that 3-form RR field strength is a two-form field strength, and it becomes an electric charge counting
the D-brane charges. Of course, the third charge comes from the momentum given along the $x_1$
direction and is quantized. When any one of these charges vanishes, the area of the event horizon
vanishes too. The dimensional reduction [49] over the periodic coordinates $x_1, \ldots, x_5$ yields the
5-dimensional effective action. The metric in the five-dimensional space takes the following form

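$$ds_5^2 = -\lambda^{-2/3}\,dt^2 + \lambda^{1/3}\left(dr^2 + r^2\, d\Omega_3^2\right) \qquad (5.14)$$

where

$$\lambda = H_1 H_5 K = \left(1 + \frac{C_1Q_1}{r^2}\right)\left(1 + \frac{C_5Q_5}{r^2}\right)\left(1 + \frac{C_KQ_K}{r^2}\right) \qquad (5.15)$$

This corresponds to an extremal charged black hole, and the horizon is located at r = 0. However,
the area of the horizon is nonzero and it is proportional to the square root of the product of the
charges. The expression for the area, with $Q_K = n$, is

$$A_5 = 2\pi^2\left(r^2\lambda^{1/3}\right)^{3/2}\Big|_{r=0} = 2\pi^2\left(C_1Q_1\,C_5Q_5\,C_KQ_K\right)^{1/2} = \frac{2\pi^2 g_{str}^2\sqrt{Q_1Q_5n}}{R_1V} \qquad (5.16)$$

The Newton's constant in five dimensions gets related to the ten-dimensional Newton's constant
after we compactify on $T^5$, and the relation is

$$G_5 = \frac{G_{10}}{(2\pi)^5 R_1 V} = \frac{\pi g_{str}^2}{4 R_1 V} \qquad (5.17)$$

Therefore, the entropy is equal to

$$S_{BH} = \frac{A_5}{4G_5} = 2\pi\sqrt{Q_1Q_5n} \qquad (5.18)$$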
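The cancellation of the coupling and of the compactification radii in the entropy can be checked numerically. The sketch below assumes the coefficients $C_1 = g_{str}\alpha'^3/V$, $C_5 = g_{str}\alpha'$, $C_K = g_{str}^2\alpha'^4/(R_1^2V)$ and the normalization $G_{10} = 8\pi^6 g_{str}^2\alpha'^4$ for the ten-dimensional Newton's constant; the latter is a common convention and is an assumption here, as are the default parameter values.

```python
import math

def bh_entropy(Q1, Q5, n, g=0.1, R1=1.3, V=2.0, alpha_prime=1.0):
    """Area law entropy A / (4 G_5) of the extremal five-dimensional black hole."""
    C1 = g * alpha_prime ** 3 / V
    C5 = g * alpha_prime
    CK = g ** 2 * alpha_prime ** 4 / (R1 ** 2 * V)
    # Horizon area: that of a three-sphere of radius (C1 Q1 C5 Q5 CK n)^(1/6).
    area = 2 * math.pi ** 2 * math.sqrt(C1 * Q1 * C5 * Q5 * CK * n)
    G10 = 8 * math.pi ** 6 * g ** 2 * alpha_prime ** 4     # assumed normalization
    G5 = G10 / ((2 * math.pi) ** 5 * R1 * V)
    return area / (4 * G5)

S_weak = bh_entropy(3, 5, 100, g=0.1)
S_strong = bh_entropy(3, 5, 100, g=0.5, R1=0.8, V=3.5)
S_micro = 2 * math.pi * math.sqrt(3 * 5 * 100)
print(S_weak, S_strong, S_micro)
```

The result depends only on the integers $Q_1$, $Q_5$ and n, which is why a degeneracy counted at weak coupling can be compared with a horizon area computed at strong coupling.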

This expression exactly agrees with the expression for $S_{microscopic}$. A few comments are in order
to discuss the constraints on the parameters for the above relation to be valid. The string effective
action adopted to obtain the brane solutions is valid when string loop corrections and $\alpha'$ corrections
are nonleading. The string loop corrections are small when $g_{str} \to 0$ with the values of the charges
held fixed. The charges correspond to characteristic scales of the system. If we want to ignore $\alpha'$
correction terms, then the charges should be larger than the string scale, i.e. $C_1Q_1$, $C_5Q_5$ and
$C_Kn$ are much larger than $\alpha'$. If the compactification radii of the tori are taken to be of the order
of the string length scale, then we should have $g_{str}Q_1,\ g_{str}Q_5,\ g_{str}^2 n \gg 1$. This tells us that
$n \gg Q_1 \sim Q_5 \gg 1$.

The entropy of nonextremal black holes can be considered in a similar manner; however, we must
keep several points in mind. First of all, the extremal black holes are BPS states and they get no
quantum corrections. Therefore, whereas the microscopic entropy is computed in the weak coupling
phase, the Bekenstein-Hawking entropy is obtained after we go over to the strong coupling domain,
so that the composite D-brane configuration lies inside the horizon. In the case of nonextremal black
holes, we have no theorem against quantum corrections, and therefore passage to the strong coupling
limit is not so simple. It is argued that a black hole which is slightly away from extremality might
allow a smooth increase of the coupling constant as one starts from the weak coupling limit. This type
of black hole configuration can be achieved by allowing some low-level right-moving oscillators
compared to the high left-moving levels (note that for the extremal case $N_R = 0$). We shall not discuss
the properties of these black holes in detail here.

The BPS extremal black holes are stable and they have zero temperature; therefore, they will not
emit Hawking radiation. If we intend to understand the Hawking radiation from black holes in
string theory, we have to look for those which are excited states and can decay into lower
energy states. The starting point is to consider a nonextremal black hole. Since there will be left
and right movers, the open string states will be going in opposite directions on the D-string. Again,
it is a one-dimensional problem where one can imagine that two oppositely moving open string
states collide to give a closed string state. If we were to calculate the S-matrix element for such a
process, we would consider an initial state, a final state and a suitable interaction Hamiltonian for
our computational purpose. In order to get the emission rate, one takes the modulus squared of this
amplitude, averages over initial states, sums over final states and divides by the usual phase space
factor. The state of the initial nonextremal black hole is given by the occupation numbers $N_L$ and
$N_R$ and the amounts of momentum we give on the compact circle, which go in opposite directions.
The momenta are quantized in either direction, and thus the momentum carried by the closed string
state is quantized as well. As we have seen, there are $4Q_1Q_5$ bosonic and fermionic oscillators. The
string theory calculation gives the amplitude for emission of a closed string state from these initial
states [100]. The sum over final states and averaging over initial states leads to a factor
$\rho_L\rho_R$, where for
example

$$\rho_R = \frac{1}{N_i}\sum_i \langle i |\, N_R\, | i \rangle \qquad (5.19)$$

where $N_i$ is the total number of initial states and $N_R$ is the number operator of right movers. We
might carry out the averaging over all possible initial states with a given value of $N_R$ by adopting
the statistical mechanical prescription. The problem actually maps to the case of a one-dimensional
gas, and the microcanonical ensemble can be used since we are holding $N_R$ fixed; the energy is held
constant. The configuration of the black hole is such that $N_L \gg N_R > 1$. If $k_0$ is the
momentum of the outgoing massless closed string, the final calculation gives the decay rate as

$$d\Gamma \sim (\text{Area})\,\frac{e^{-k_0/T_H}}{1 - e^{-k_0/T_H}}\,d^4k \qquad (5.20)$$

with $T_H$ the Hawking temperature.


A more careful calculation [102] reveals the surprising result that not only is the form of the thermal
distribution recovered, but the numerical coefficients also match the semi-classical results of
Hawking. The result has been derived for four-dimensional black holes as well [103]. It is an
interesting question to ask whether one can calculate the absorption cross section of an extremal
black hole for a massless closed string scalar and then relate that cross section to the decay rate of
a nonextremal black hole by using the principle of detailed balance in quantum mechanics, taking
into account all the subtleties. Indeed, explicit verification shows that such a check yields the
correct result [104].





6 M-theory and the M(atrix) model 

Our present understanding of string dynamics together with duality symmetries strengthens the
belief that there is a fundamental theory and that the five perturbatively consistent theories are
different phases of that underlying theory. However, we do not know what this theory is, except for
the conjecture that the low energy limit of this theory is the 11-dimensional supergravity action.
There are deep questions about the structure of this theory. We shall call it the M-theory. We recall
that the strong coupling limit of the type IIA theory is identified with 11-dimensional supergravity.
When viewed from the type IIA perspective, the existence of D0-branes as nonperturbative RR
point-like objects is quite important for our discussion. They are BPS states and their mass is of the
order $1/g_{str}$, scaled by the 10-dimensional length scale. These being BPS states, one could assume
that there are threshold bound states of many, say N, D0-branes which satisfy the properties of
bound BPS states. Now if we take the strong coupling limit, then it is found that the low energy
spectrum is the same as the spectrum of the 11-dimensional supergravity. This is an important piece
of evidence. Furthermore, the 11-dimensional theory is known to admit a membrane and a five-brane, and
we have argued how one can study properties of various brane configurations in string theories
after compactifying the M-theory. The M(atrix) model [105, 32, 33] can describe perturbation
expansions of various string theories. There is a limit in which the theory provides a connection with
11-dimensional supergravity theory. However, one would like to seek answers to several questions
from this theory. For example, the general prescription for the compactification of the theory is
not known. Similarly, the complete set of degrees of freedom of this theory is yet to be obtained. The
M(atrix) theory, nevertheless, provides insight into the nonperturbative definition of string theory,
and it also exhibits string dualities [106]. One can also go over to various string theories by adopting
different limiting prescriptions.

The model resorts to the infinite momentum frame (IMF) technique, boosted along a compact direction.
The momentum along the compact direction is quantized; one starts with N units of this momentum
and then the $N \to \infty$ limit is taken. Since one is working in the light-cone frame while constructing
M(atrix) theory, the theory is not manifestly Lorentz invariant. Lorentz invariance might
be recovered in the large N limit. In the M(atrix) model formulations one encounters parameters
which have the interpretation of being expectation values of scalars when viewed from the string
theory side. But in the M(atrix) model, when we have the IMF formulation, these constant modes have
infinite frequency and they are frozen into a fixed configuration. The theory in its present formulation
is not background independent. Moreover, one encounters problems while compactifying the
theory on an arbitrary d-dimensional torus. We may remind the reader that the M(atrix) theory
provides a rich structure to study various aspects of string theory from the M-theory standpoint.

The infinite momentum frame (IMF) technique played a very useful role in current algebra [107]. In 
field theoretic calculations it simplifies perturbation theory calculations [108, 109]. When we have 
to deal with a collection of particles, we can define IMF to be a frame where the total momentum 
is taken to be very large. If we designate particles by index I, J... then 

$$ P_I = \eta_I P + P_{TI} \qquad (6.1) $$

where T stands for 'transverse', $P \cdot P_{TI} = 0$, $\sum_I P_{TI} = 0$ and $\eta_I < 1$. For a highly boosted
coordinate system we could have all $\eta_I$ positive. Particularly, for the case at hand, we deal with
massive particles and we can choose an appropriate frame to satisfy our requirement. The energy of
any particle satisfies the relativistic relation

$$ E_I = \sqrt{P_I^2 + m_I^2} = \eta_I P + \frac{P_{TI}^2 + m_I^2}{2\eta_I P} + \ldots \qquad (6.2) $$

where it is understood that the dots denote terms of higher order in $1/P$. The expression for the
energy is similar to that of a nonrelativistic particle in a lower dimension, with the mass term taking a
modified form. When we use a light-cone (LC) frame, a spatial direction is identified and designated
as longitudinal. The longitudinal momentum is $P_{LI} = \eta_I P$ and one defines $P_{\pm I} = E_I \pm P_{LI} =
E_I \pm \eta_I P$. The mass shell condition translates to $P_{+I}P_{-I} - P_{TI}^2 = m_I^2$ and we can rewrite this
relation as

$$ P_{-I} = \frac{P_{TI}^2 + m_I^2}{P_{+I}} \qquad (6.3) $$

In the limit of large P, we have $\eta_I P$ large and therefore $E_I \to \eta_I P$ with $P_{+I} \sim 2\eta_I P$. When
M-theory is envisaged in IMF, let us designate the momenta as $p_0$, $p_i$, $i = 1, \ldots, 9$ and $p_{11}$. One
compactifies the 11th direction with radius $R_{11}$ and this is also boosted; therefore $\{p_i\}$ are collectively denoted as
$P_T$. Thus for a collection of the D0 particles



(6.4) 
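The large-P expansion of the single-particle energy quoted above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values of $\eta$, $P_T^2$ and $m^2$ are illustrative assumptions. It compares the exact relativistic energy with the two leading IMF terms, $\eta P + (P_T^2 + m^2)/(2\eta P)$.

```python
import math

def exact_energy(eta, P, pT2, m2):
    # Exact relativistic energy with longitudinal momentum eta * P,
    # transverse momentum squared pT2 and mass squared m2.
    return math.sqrt((eta * P) ** 2 + pT2 + m2)

def imf_energy(eta, P, pT2, m2):
    # First two terms of the large-P expansion:
    # eta * P plus a nonrelativistic-looking transverse term.
    return eta * P + (pT2 + m2) / (2.0 * eta * P)

def truncation_error(eta, P, pT2, m2):
    # Size of the neglected higher-order terms in 1/P.
    return abs(exact_energy(eta, P, pT2, m2) - imf_energy(eta, P, pT2, m2))

if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    for P in (1e1, 1e2, 1e3):
        print(P, truncation_error(0.3, P, 2.0, 1.0))
```

The printed error shrinks as P grows, which is the sense in which the dots in the expansion can be dropped in the infinite momentum frame.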


We note that there are 32 real supercharges in the theory. When one adopts the IMF description, it
is convenient to split them into two groups of 16 each. The charges in every group
transform as spinors of SO(9). Let us denote the charges as $Q_\alpha$, $\alpha = 1, \ldots, 16$, and $q_A$, $A = 1, \ldots, 16$. The
algebra of these charges is

$$ \{Q_\alpha, Q_\beta\} = \delta_{\alpha\beta} H, \qquad \{q_A, q_B\} = \delta_{AB} P_{11} \qquad (6.5) $$

$$ \{Q_\alpha, q_A\} = \gamma^i_{A\alpha} P_i \qquad (6.6) $$

Here H is the Hamiltonian operator, the P's are the corresponding momentum operators and $\gamma^i$ are 16
dimensional gamma matrices.

We have discussed earlier how the D0-brane has a natural interpretation from the 11-dimensional
theory with a compact coordinate, and the RR charge is related to the quantized momenta along this
direction. The relation between mass and charge is satisfied since these are BPS states. There
exists a sector with N units of D0-brane charge, carrying Kaluza-Klein momentum $p_{11} = N/R_{11}$. If we hold
N fixed and take the limit $R_{11} \to 0$, we go over to the weak coupling phase of string theory; however,
in the passage to this limit, the string scale is not held fixed. The aim is to study the phenomena
in the 11-dimensional theory and thus $l_{11}$ is to be kept fixed. We recall that

$$ R_{11} = g_{str}^{2/3}\, l_{11} \qquad (6.7) $$

and the string length scale is given by $l_s^2 = l_{11}^3/R_{11}$. Thus as the compactification radius tends to zero the string scale
diverges. We have also seen earlier that, as the radius shrinks, the mass of the D0-brane tends to infinity
when measured in 11-dimensional Planck units in ten dimensions. In other words the mass of the
D0-brane is

$$ \frac{1}{g_{str}\, l_s} = \frac{1}{R_{11}} \qquad (6.8) $$


and therefore, it is appropriate to identify them as the KK modes. Thus, when we consider the mass
of these particles in 10 dimensions, in scales of the eleven dimensional theory, the particles become
very heavy and a nonrelativistic description is quite adequate. If we were to describe M-theory in
terms of type IIA zero branes, then we have a scenario where M-theory is equivalent to the $N \to \infty$
limit of the nonrelativistic quantum mechanics of N D0-branes which are in the weak coupling phase of
type IIA theory. Furthermore, as Witten has argued [95], the physics of N coincident D0-branes is
described by dimensionally reducing ten dimensional U(N) supersymmetric Yang-Mills theory to
0+1 dimensions [110]. Let us consider the supersymmetric quantum mechanics of a single D0 particle.
The starting point is the action


/ ~ ^ ^ (6.9) 

This is the action obtained from 10-dimensional super Yang-Mills theory reduced to one dimension.
$X^i$, $i = 1, \ldots, 9$, stand for the transverse directions and $\theta$ are real spinors with 16 components. Since
$X^i$ and $\theta$ come from the gauge multiplet, they are in the adjoint representation of U(N). Since they
carry only time dependence, these are $N \times N$ matrices. $D_0 = \partial_t + [A_0, \cdot\,]$ is the covariant derivative
and this can be converted to an ordinary derivative with the gauge choice $A_0 = 0$. The mass of the
D0-brane is of order $1/(g_{str} l_s)$; thus the first term in (6.9) can be written as
$\frac{1}{2 g_{str} l_s}\int dt\, \mathrm{Tr}\,(D_0 X^i)^2$. Note that the
action (6.9) contains parameters of type IIA theory. It is convenient to scale $X^i = g_{str}^{1/3} Y^i$, which
amounts to rescaling of the metric to that of the 11-dimensional theory. Moreover, one scales the time
variable as $t = g_{str}^{1/3}\tau$ and denotes the $\tau$ derivative by a dot. The action is rewritten as

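$$ S = \int d\tau\, \mathrm{Tr}\left( \frac{1}{2R_{11}}\dot{Y}^i\dot{Y}^i - i\theta^T\dot{\theta} + \frac{R_{11}}{4}[Y^i, Y^j]^2 + R_{11}\,\theta^T\gamma_i[Y^i, \theta] \right) \qquad (6.10) $$

If $\Pi_i = \dot{Y}_i/R_{11}$ and $\pi = -i\theta^T$ are the conjugate momenta of $Y^i$ and $\theta$ respectively, the corresponding
Hamiltonian is given by

$$ H = R_{11}\,\mathrm{Tr}\left( \frac{1}{2}\Pi_i\Pi_i - \frac{1}{4}[Y^i, Y^j]^2 - \theta^T\gamma_i[Y^i, \theta] \right) \qquad (6.11) $$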


One can define $H = R_{11}\tilde{H}$ for convenience, factoring out the overall $R_{11}$. Notice also that the potential
energy term $-R_{11}\mathrm{Tr}([Y^i, Y^j])^2$ is non-negative. When $R_{11} \to \infty$, we are in the decompactification
phase of M-theory. Thus, the finite energy states of $H$ are those for which the Hamiltonian $\tilde{H}$
has vanishing eigenvalues. One seeks those states for which $\tilde{H}|\psi\rangle = \frac{\epsilon}{N}|\psi\rangle$, which is equivalent to
seeking a solution $H|\psi\rangle = \frac{R_{11}\epsilon}{N}|\psi\rangle$, where $\epsilon$ is finite. We know that, for a collection of N
D0-branes, the total momentum is $p_{11} = N/R_{11}$ and therefore the energy is given by $E = \epsilon/p_{11}$. We
have to identify $\epsilon$ with $p_T^2/2$ if we recall (6.4). The $N \times N$ matrices $X^i$ can be interpreted as the
locations of the N D0-branes. When we consider the potential term in the $Y^i$ variables (6.10), we notice
that there are flat directions when $[Y^i, Y^j] = 0$. Here we deal with a quantum mechanical system
and the $Y^i$ are the collective coordinates. In such a situation, with mutually commuting $Y^i$, we can
diagonalize $Y^i = \mathrm{diag}(y^i_1, \ldots, y^i_N)$. Thus $y^i_n$ is the ith coordinate of the nth D0-brane. It is easy
to see that there is invariance under the Galilean translation $Y^i \to Y^i + d^i\,\mathbf{1}$ and the Galilean boost
$Y^i \to Y^i + v^i t\,\mathbf{1}$, as is expected of a nonrelativistic system; here $\mathbf{1}$ is the unit matrix. The boost
will affect the center of mass momentum; but neither the relative momenta nor the interaction term
are affected by these transformations.
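The statements about flat directions and the non-negativity of the potential can be illustrated with small explicit matrices. The sketch below is an illustration only: the helper names, the 2 x 2 example with diagonal separation `r` and off-diagonal entry `eps`, and the dropped overall positive constant are all assumptions made here, not taken from the text. It evaluates $-\sum_{i<j}\mathrm{Tr}([Y^i, Y^j])^2$ for Hermitian (here real symmetric) matrices.

```python
def matmul(a, b):
    # Plain nested-list matrix product for small square matrices.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    n = len(a)
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def potential(ys):
    # V = - sum_{i<j} Tr([Y^i, Y^j]^2); the overall positive constant
    # (proportional to R_11) is dropped in this sketch.
    total = 0.0
    for i in range(len(ys)):
        for j in range(i + 1, len(ys)):
            c = commutator(ys[i], ys[j])
            total -= trace(matmul(c, c))
    return total

def two_particle_config(r, eps):
    # Y^1 is diagonal: two D0-branes separated by r along direction 1.
    # Y^2 carries only an off-diagonal element eps (hypothetical example).
    y1 = [[r / 2.0, 0.0], [0.0, -r / 2.0]]
    y2 = [[0.0, eps], [eps, 0.0]]
    return [y1, y2]

if __name__ == "__main__":
    print(potential(two_particle_config(3.0, 0.0)))   # commuting matrices
    print(potential(two_particle_config(3.0, 0.1)))   # off-diagonal element switched on
```

In this example the commuting (diagonal) configuration costs no energy, while a nonzero off-diagonal element gives a positive potential equal to $2 r^2 \epsilon^2$, i.e. quadratic in the separation.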

We can consider two clusters separated from one another. This is familiar in the composite model
of hadrons, where quarks are the basic constituents. In the parton picture, the proton is made of a
large number of partons with very small binding energy and one could describe photon-hadron
deep inelastic scattering in IMF [111]. In this case we can think of configurations where the
$N \times N$ matrices $Y^i$ can be decomposed into block diagonal form with, say, n blocks of sizes $N_1 \times N_1$,
$N_2 \times N_2$, ..., $N_n \times N_n$ such that $\sum_m N_m = N$. This decomposition can be interpreted as if we have n
separated clusters of D0-branes, where the clusters have $N_1, N_2, \ldots, N_n$ particles respectively.
The distance between two clusters can be defined as


$$ r_{ab} = \left| \frac{1}{N_a}\mathrm{Tr}\,Y_a - \frac{1}{N_b}\mathrm{Tr}\,Y_b \right| \qquad (6.12) $$


where a and b are the two clusters. Now we can visualize how the potential will arise. It comes
from $\mathrm{Tr}([Y^i, Y^j])^2$ and this goes like the modulus squared of the off diagonal block elements multiplied
by the minimum of the $r_{ab}^2$ and an appropriate numerical constant. Thus, if we consider well
separated clusters of D particles, the off diagonal elements are required to be small; otherwise, the
potential will grow like $r_{ab}^2$. We should keep in mind that the system is supersymmetric and having
a harmonic oscillator type potential does not imply that the ground state energy is that of the oscillator.
The supersymmetric quantum mechanical system has a very rich structure. This becomes transparent
if we consider a single D0-brane, i.e. N = 1.




$$ E = \frac{R_{11}}{2}\,p_T^2 = \frac{p_T^2}{2p_{11}} \qquad (6.13) $$


When we look at this equation from the 11-dimensional point of view, it corresponds to the relation
between energy and momentum of a massless particle in IMF. When we take into account the
16 component fermions $\theta$, we eventually get a supermultiplet with 256 total degrees of freedom,
and this agrees with the massless degrees of freedom of $\mathcal{N} = 1$ supergravity in eleven dimensions.
In fact the bosonic components number 128, equal to the fermionic degrees of freedom. As is well known,
there are 44 components from the graviton and 84 from the antisymmetric tensor field in 11 dimensions.
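The counting quoted here can be reproduced with a few lines of arithmetic. The sketch below is an illustration under standard assumptions that are not spelled out in the text: physical massless states are classified by the little group SO(9), the graviton is a symmetric traceless tensor, the antisymmetric tensor is a 3-form, the gravitino is a gamma-traceless vector-spinor, and the 16 real fermionic zero modes pair up into 8 creation operators. The function names are hypothetical.

```python
from math import comb

def graviton_states(D):
    # Symmetric traceless rank-2 tensor of the little group SO(D-2).
    n = D - 2
    return n * (n + 1) // 2 - 1

def three_form_states(D):
    # Antisymmetric rank-3 tensor of SO(D-2).
    return comb(D - 2, 3)

def gravitino_states(D, spinor_dim):
    # Vector-spinor of SO(D-2) with its gamma-trace part removed.
    return (D - 2) * spinor_dim - spinor_dim

D = 11
SPINOR_DIM = 16  # real spinor of SO(9)

bosons = graviton_states(D) + three_form_states(D)
fermions = gravitino_states(D, SPINOR_DIM)
# 16 real fermionic zero modes act as 8 creation/annihilation pairs.
multiplet_size = 2 ** (SPINOR_DIM // 2)

if __name__ == "__main__":
    print(graviton_states(D), three_form_states(D), bosons, fermions, multiplet_size)
```

The bosonic and fermionic counts agree with each other and add up to the size of the multiplet generated by the fermionic zero modes.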
When we have N > 1, it is necessary to separate the center of mass motion and define the relative
coordinates; the decomposition is as follows:

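$$ Y^i = Y^i_{rel} + Y^i_{cm}\,\mathbf{1}, \qquad Y^i_{cm} = \frac{1}{N}\mathrm{Tr}\,Y^i \qquad (6.14) $$

$$ \Pi_i = \Pi_{rel\,i} + \frac{1}{N}P_{cm\,i}\,\mathbf{1}, \qquad P_{cm\,i} = \mathrm{Tr}\,\Pi_i \qquad (6.15) $$

and $\mathrm{Tr}\,Y^i_{rel} = \mathrm{Tr}\,\Pi_{rel\,i} = 0$. Now the total Hamiltonian will be written as a sum of two terms

$$ H = H_{cm} + H_{rel} \qquad (6.16) $$

with

$$ H_{cm} = \frac{R_{11}}{2N}P_{cm}^2 = \frac{1}{2p_{11}}P_{cm}^2 \qquad (6.17) $$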


Note the appearance of the factor $1/N$, as is expected. We have defined the center of mass coordinate,
canonical momentum and the Hamiltonian by taking the trace over U(N) matrices. Therefore,
the relative Hamiltonian is a function of $\Pi_{rel}$ and $Y_{rel}$. Thus $H_{rel}$ is quite similar to the original
Hamiltonian; however, all the variables are SU(N) matrices: they are traceless since the trace part
is separated out. It has been shown that the relative Hamiltonian has zero energy bound states
due to the presence of supersymmetry [95, 112, 113]. The total energy is then due to the center of
mass energy, $E = E_{cm} = P_{cm}^2/(2p_{11})$, and the states form the supergravity multiplet, which
has 256 states. Therefore, for any N, we see that the spectrum contains supergravitons. Suppose
we decompose the matrices into various blocks which describe clusters of D-particles. In the simplest case, if
the submatrices are exactly block diagonal so that the off diagonal elements are zero, then the total
Hamiltonian will be given by the sum of n separate Hamiltonians without any interactions amongst
them. If we let the off diagonal elements appear (give them small values), that will amount to
switching on interactions between the clusters. The physical picture is that we have several clusters,
and each cluster will have its supergraviton in the spectrum. There could be an arbitrary number of
them and therefore we let N go to infinity. Thus the matrix model contains the full Fock space of
supergravitons. The interaction among the supergravitons is described by the presence of off
diagonal elements and one should be able to describe various processes involving supergravitons in
this picture.

In order to compute S-matrix element for scattering of two supergravitons when their transverse 
velocities are small, we have to determine potential between them. One starts by considering the 
classical configurations and the fluctuations over them to compute the effective action [114]. Sup- 
pose we give transverse velocity v and define the impact parameter as b and expand the coordinates 
around their backgrounds as follows: 


= ^b<j3 + ^5X^ , X^ = ^vtas-\-^SX^ ( 6 . 18 ) 

X^ = yf^5X\ * 5 ^ 8,9 ( 6 . 19 ) 

Here $\delta X$ denotes the fluctuations and $\sigma_3$ is the Pauli matrix. When we have vanishing fluctuation,
the classical configuration is such that the total transverse center of mass momentum and position
vanish. The $2 \times 2$ matrices are block diagonal, which describes two clusters of D0-branes, and
we have $N_1 = N_2 = 1$. Now the separation between the two particles is given by
$r = \sqrt{v^2t^2 + b^2}$. The effective action can be computed using the standard techniques and the
order $\hbar$ term will contain the determinant of (basically) propagators when we restrict to one loop
level. Thus


Seff^So + J c!rFe//(r(r)) = J drVeff ( -h ) 


( 6 . 20 ) 





For large impact parameter, the long range part of the potential in the leading order is given by
[114]

$$ V_{eff}(r) = -\frac{15}{16}\frac{v^4}{r^7} + \text{higher orders} \qquad (6.21) $$

The result is striking in the sense that this form of the potential can be derived from the supergravity
action at the tree level, i.e. considering graviton exchange. Thus starting from a simple
M(atrix) model description, one could extract a result of 11-dimensional supergravity.

The 11-dimensional supergravity admits a supermembrane. It is worthwhile to ask how much the
M(atrix) model can tell us about the underlying membrane theory. The membrane is an extended
object in two spatial directions, as the name suggests. Moreover, the dimension of spacetime in
which the supermembrane can exist is quite restricted [115, 25]. The reason for such constraints lies
in the fact that the action contains a Wess-Zumino-Witten term and the supersymmetry invariance
of the full action restricts the spacetime dimensions to 4, 5, 7 or 11. The membrane is described by
$Z^i(\sigma, \rho, \tau)$, where $\sigma$, $\rho$ and $\tau$ are the worldvolume coordinates. When one adopts a Hamiltonian
formalism, a fixed $\tau$-slice is chosen and thus the explicit $\tau$ dependence in $Z^i$ does not appear,
and the derivatives with respect to worldvolume time are traded for canonical momenta $P^i$. The
light-cone gauge is a convenient description to see the physical degrees of freedom and in this gauge
the membrane Hamiltonian takes the following form [116]

~ 2^ / fermionic terms (6.22) 

where the brackets appearing in the second term are defined as

$$ \{A, B\} = \partial_\sigma A\,\partial_\rho B - \partial_\rho A\,\partial_\sigma B \qquad (6.23) $$

and $T_2$ is the membrane tension. Let us assume that the worldvolume of the membrane can be
written as $\Sigma \times R$, where $\Sigma$ has the topology of a torus. For this topology, $Z^i(\sigma, \rho)$ is a doubly periodic
function and we can expand $Z^i$ in a double Fourier series with $Z^i_{mn}$ as the Fourier coefficients. Thus
we have nine $\infty \times \infty$ matrices, and the same would be the case if we had considered the nine $Y^i$'s in the
$N \to \infty$ limit. In order to establish the relation with the membrane Hamiltonian (6.22), we have to show
how the commutator $[Y^i, Y^j]$ goes over to the bracket $\{Z^i, Z^j\}$. For arbitrary finite N, introduce
two $N \times N$ matrices U and V, satisfying the properties

$$ U^N = V^N = 1, \quad \text{and} \quad UV = e^{2\pi i/N}\,VU \qquad (6.24) $$

This can be realized if U and V have the following special form: $U_{j,j+1} = 1$ (with the index understood mod N) and $V_{jj} = e^{2\pi i j/N}$,
with all other matrix elements set to zero. A more abstract, 't Hooft, representation is

$$ U = e^{ip}, \qquad V = e^{iq}, \qquad [q, p] = \frac{2\pi i}{N} \qquad (6.25) $$

This is the canonical commutation relation between position and momentum when the space is
taken to be compact and discrete. It is worthwhile to point out that the above commutation
relation will not hold good for finite dimensional matrices. However, acting on states with low
wave number, the error on the r.h.s. of the commutator [p, q] is further down by a power of N and
therefore $1/N$ is the leading term. Thus as N assumes higher and higher values, the error gets smaller
and smaller. From $U^N = V^N = 1$ we can conclude that p and q take eigenvalues $2\pi m/N$, where m
takes values $0, 1, 2, \ldots, (N-1)$. Moreover $\mathrm{Tr}\,U^nV^m = N\delta_{n,0}\delta_{m,0}$, where 0 in both the Kronecker deltas
is to be understood as mod N. Now we can expand any $N \times N$ matrix in terms of Fourier modes.

$$ A = \sum_{n,m=1-N/2}^{N/2} A_{nm}\,U^nV^m = \sum_{n,m=1-N/2}^{N/2} A_{nm}\,e^{inp}e^{imq} \qquad (6.26) $$
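The defining relations of U and V, and the trace identity used for the Fourier expansion, can be verified for a small N. The sketch below is an illustration only: it assumes one standard explicit choice, a cyclic shift matrix for U and a diagonal "clock" matrix of N-th roots of unity for V, and all helper names are hypothetical.

```python
import cmath

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, k):
    # Repeated multiplication; adequate for the small powers used here.
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result

def scale(c, a):
    return [[c * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def clock_and_shift(n):
    # Assumed explicit realization: U is the cyclic shift, V the clock matrix.
    omega = cmath.exp(2j * cmath.pi / n)
    u = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
    v = [[omega ** i if i == j else 0.0 for j in range(n)] for i in range(n)]
    return u, v, omega

if __name__ == "__main__":
    N = 5
    U, V, OMEGA = clock_and_shift(N)
    print(close(matpow(U, N), identity(N)), close(matpow(V, N), identity(N)))
    print(close(matmul(U, V), scale(OMEGA, matmul(V, U))))
    print(abs(trace(matmul(matpow(U, 2), matpow(V, 3)))))
```

With this choice the products $U^nV^m$ for n, m = 0, ..., N-1 are orthogonal under the trace, which is what makes the expansion of an arbitrary $N \times N$ matrix in these modes possible.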

Since the commutator of p and q is of order $1/N$, in the $N \to \infty$ limit they will commute. The eigenvalues
of these two operators will fill the interval $[0, 2\pi]$, and 0 is to be identified with $2\pi$ since we have
toroidal geometry. The double Fourier expansion (6.26) takes the form

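$$ A(p, q) = \sum_{n,m=-\infty}^{\infty} A_{nm}\,e^{inp}e^{imq} \qquad (6.27) $$

and the Fourier coefficients with the double index are defined as

$$ A_{nm} = \frac{1}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{2\pi} dp\,dq\,A(p, q)\,e^{-inp}e^{-imq} \qquad (6.28) $$

Also $\mathrm{Tr}\,A = NA_{00}$; when we take the $N \to \infty$ limit, $\mathrm{Tr}\,A \to \frac{N}{(2\pi)^2}\int dp\,dq\,A(p, q)$. One can
show with some algebra that the commutator of two matrices in the infinite N limit goes over
to the bracket $\{\,,\}$. Finally the bosonic part of the M(atrix) model Lagrangian goes over to the form (identify
$p = \sigma$ and $q = \rho$)

$$ L = \frac{p_{11}}{2}\int d\sigma\,d\rho\,\dot{Z}^i(\sigma, \rho)^2 - \frac{1}{p_{11}}\int d\sigma\,d\rho\,\{Z^i(\sigma, \rho), Z^j(\sigma, \rho)\}^2 \qquad (6.29) $$

Note that $N/R_{11} = p_{11}$; therefore the conjugate momentum of $Z^i$ is $p_{11}\dot{Z}^i$. Thus passage to the
Hamiltonian (in light-cone gauge) gives the membrane Hamiltonian (6.22). It is indeed a remarkable
result that a simple supersymmetric quantum mechanical system encodes the dynamics of the
supermembrane.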

It is natural to ask whether one can obtain a string starting from the M(atrix) model. First, one
compactifies the theory to ten dimensions. When the compactification radius is small, the theory
contains the Fock space of the type IIA string. As the radius tends to zero the string becomes
free [117] and the correct leading order string interactions can be reproduced. In order to carry out
the compactification, we replace the matrices by infinite dimensional operators. The compact coordinate
is represented by a covariant-derivative operator involving $A_\sigma(\sigma)$, where $A$ is a U(N) gauge potential. The rest
of the variables are taken to be matrix valued functions of $\sigma$. If we use this ansatz, the resulting
Hamiltonian is that of maximally supersymmetric 1+1 dimensional Yang-Mills theory. In the limit
when the radius goes to zero and N is taken to infinity, the moduli space of this model coincides
with the Fock space of type IIA theory.

Indeed, the M(atrix) model has opened up new avenues to study dualities between the model
compactified on a torus and Yang-Mills theory on the dual space. Moreover, there are applications of the
M(atrix) model to the study of black holes; we refer the interested reader to the review on the subject
[33]. Another interesting development has been to understand type IIB theory and its dualities
from a matrix model formulation. In this approach one adopts the procedure of Eguchi and Kawai
to consider reduced 10-dimensional super Yang-Mills theory; it is a theory of $N \times N$ matrices
which carry no time dependence at all [118]. We refer the reader to the review article of Makeenko
[119].

7 Anti-de Sitter Space and Boundary Field Theory Correspondence

Recently, attention has been focused on constructing supersymmetric gauge theories by considering
various configurations of branes in string theories as well as in M-theory. When we have N
coincident Dp-branes, a supersymmetric U(N) gauge theory lives in the worldvolume of the branes.
The expansion proposed by 't Hooft [121] revealed several aspects of SU(N) Yang-Mills theory.
According to 't Hooft, one should consider the large N limit of the theory keeping $g_{YM}^2N$ fixed, $g_{YM}$
being the gauge coupling constant. Then a Feynman diagram is designated by the topological factor
$N^\chi$, $\chi$ being the Euler characteristic of the Feynman diagram. When we consider an expansion
in $1/N$ rather than in the coupling constant, each order in $1/N$ contains diagrams to all orders in the coupling
constant and the leading order corresponds to the planar diagrams. Maldacena [120] has made a
remarkable conjecture regarding large N conformal gauge theories. The proposal states that the large
N limit of a conformally invariant theory in d dimensions is determined by supergravity theory on
d + 1 dimensional Anti-de Sitter space times a compact space (for a sphere it is maximally
supersymmetric). The AdS/CFT connection has led to the generalization of the holography principle
in this context [123, 124], which was first introduced in black hole physics [125, 126] in order to
understand the Bekenstein entropy bound and the area law for black hole entropy. Thus the
conjectures of Maldacena have revealed deeper connections between string theory and superconformal
gauge theories.

We have emphasized earlier that gravity is an integral part of string theory, since the graviton is a
part of the spectrum. Moreover, gauge fields also invariably appear in string theories. Let us
recapitulate a few points in order to get a perspective of AdS/CFT connections. We have seen that
the heterotic strings, through their constructions, contain nonabelian gauge groups and the graviton
in their massless spectrum. The type II theories have the graviton, coming from the NS sector, in their
perturbative spectrum. However, with the discovery of Dp-branes, we know that supersymmetric
gauge theories can arise if we consider coincident Dp-branes in type II theories. Type I string
theory admits nonabelian gauge fields, since Chan-Paton factors can be attached to the end points as
was discussed earlier. Furthermore, consistency of the theory requires that we incorporate the
closed string sector in order to account for nonplanar loop corrections; therefore there is gravity
coming from the closed string spectrum. For this theory, when we take the $\alpha' \to 0$ limit, Yang-Mills
theory appears automatically, and since consistency requires inclusion of closed string states, gravity
also will appear in the zero slope limit. In view of the preceding remarks, one might conclude that,
in string theory, gravity and gauge theory invariably appear simultaneously. Thus the important
question to answer is how string theory can describe the strong interaction among quarks
and gluons. The recent developments [120, 122, 123, 127] have provided connections between string
theory and gauge theories.

The configuration under consideration is N coincident Dp-branes, and open strings can end on
these hypersurfaces. When we look into the dynamics in the worldvolume, we have a collection of
these open strings and their excitations. Moreover, the worldvolume fields have their interactions
and there also exists an interaction with the bulk. An interesting limit to consider is when the dilaton
remains at a fixed value and the slope parameter tends to zero. Then, at low energies,
gravity decouples; but to keep the interactions in the worldvolume intact, we should have the gauge
coupling finite for the U(N) gauge theory. In fact, if we ignore the center of mass part, then we
need to consider the SU(N) gauge theory. It is necessary to go near the horizon, $r \to 0$, to see the
connection between AdS and CFT. In the near horizon limit (recall eq.(5.11) and eq.(5.12)),
the factor 1 appearing in the definition of the harmonic function of the Dp-brane can be neglected.
To be specific, let us first consider the metric in the case of N coincident branes.


$$ ds^2 = H_p^{-1/2}(r)\,\eta_{\mu\nu}\,dx^\mu dx^\nu + H_p^{1/2}(r)\,dy^i dy^i \qquad (7.1) $$


where $\{y^i\}$ are the transverse coordinates and $r = \sqrt{y^iy^i}$. The indices $\mu, \nu, \ldots$ are for tensors on
the worldvolume. The dilaton and the (p+1)-form potential, coming from the RR sector, are
given by


^ Hp{r)^ , and A=[i?p(r)]"^ 


(7.2) 


and 


ifp(r) = l + 


CpN 


(27r\/^)’^ 

{7-p)ns-: 


(7.3) 


Here we have suppressed the indices of the (p+1)-form gauge potential and $g_{str} = e^{\phi_\infty}$
is the asymptotic constant value of the dilaton. When we have N coincident D-branes, the
worldvolume action is the generalised Born-Infeld action proposed by Tseytlin [128]


Sbi = -4°^ j dP+^^e"^STr^-det[G^^ +27rQ:'f'^„] 


(7.4) 





Here $G_{\mu\nu}$ is the pullback of the metric $G_{MN}$ to the world volume and $F_{\mu\nu}$ is the gauge field strength
on the brane. The tension of the brane is


_ Tp 

2'Ka'gstr Ss 


(7.5) 


and $g_{str}$ is the string coupling constant. The expression under the square root in the action (7.4) can be expanded
and, keeping the second order term in the gauge field strength, one can write the action in the more familiar
form


■^gauge 






(7.6) 


where Tr is taken over the gauge group matrices and the gauge coupling constant is identified as
$g_{YM}^2 = 2g_{str}(2\pi)^{p-2}(\alpha')^{(p-3)/2}$. We know from the solutions discussed in the previous section (recall
eq.(5.12) and eq.(5.13)) that, in the limit $r \to \infty$, the metric is flat. Here one is looking
for the behaviour of the solution in the $r \to 0$ limit and one chooses a brane for which the dilaton
is constant at the horizon. If we consider D3-branes, then we find that not only is the dilaton
independent of r, but also the Yang-Mills coupling constant is dimensionless. As mentioned above,
one examines the configuration of N coincident branes in the following limit

$$ r \to 0, \qquad \alpha' \to 0, \qquad \text{and} \qquad U = \frac{r}{\alpha'} = \text{fixed} \qquad (7.7) $$


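Therefore, we can neglect the 1 appearing in the harmonic function, and the D3-brane metric goes
over to

$$ \frac{ds^2}{\alpha'} = \frac{U^2}{\sqrt{4\pi Ng_{str}}}\,\eta_{\mu\nu}\,dx^\mu dx^\nu + \sqrt{4\pi Ng_{str}}\,\frac{dU^2}{U^2} + \sqrt{4\pi Ng_{str}}\,d\Omega_5^2 \qquad (7.8) $$

The last term is the line element of the five sphere and the metric describes the manifold $AdS_5 \times S^5$.
The radius of $AdS_5$ is the same as that of $S^5$ and is given by $R_{AdS} = (4\pi Ng_{str})^{1/4}\sqrt{\alpha'}$.
Since the Yang-Mills coupling constant satisfies the relation $g_{YM}^2 = 4\pi g_{str}$, the radius of the AdS
gets related to the Yang-Mills coupling constant as

$$ \frac{R_{AdS}^2}{\alpha'} = \sqrt{g_{YM}^2N} \qquad (7.9) $$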
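The relation between the AdS radius and the Yang-Mills coupling can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the coupling formula $g_{YM}^2 = 2g_{str}(2\pi)^{p-2}(\alpha')^{(p-3)/2}$ quoted in the text, assumes the standard D3-brane result $R_{AdS}^2 = \alpha'\sqrt{4\pi Ng_{str}}$, and uses hypothetical function names and sample parameter values.

```python
import math

def g_ym_squared(g_str, p=3, alpha_prime=1.0):
    # Gauge coupling on a Dp-brane as quoted in the text;
    # for p = 3 it is dimensionless and equals 4 * pi * g_str.
    return 2.0 * g_str * (2.0 * math.pi) ** (p - 2) * alpha_prime ** ((p - 3) / 2.0)

def ads_radius_squared(n_branes, g_str, alpha_prime=1.0):
    # Assumed near-horizon D3-brane result: R^2 = alpha' * sqrt(4 pi N g_str).
    return alpha_prime * math.sqrt(4.0 * math.pi * n_branes * g_str)

def t_hooft_coupling(n_branes, g_str):
    # 't Hooft coupling g_YM^2 N for the D3-brane worldvolume theory.
    return g_ym_squared(g_str) * n_branes

if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    n_branes, g_str = 100, 0.1
    lam = t_hooft_coupling(n_branes, g_str)
    print(lam, ads_radius_squared(n_branes, g_str), math.sqrt(lam))
```

The radius in string units depends only on the combination $g_{YM}^2N$, so a large radius (weakly curved supergravity background) corresponds to a large 't Hooft coupling on the gauge theory side.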


We know that the worldvolume theory of N coincident Dp-branes is supersymmetric Yang-Mills
theory in p + 1 dimensions and therefore, in this case, the $\mathcal{N} = 4$ SUSY gauge theory will appear.
This is known to be a conformally invariant theory. From the supergravity side, we could
describe the theory even for large radius; but that will amount to taking $Ng_{YM}^2$ large.
Maldacena's conjecture states that strongly coupled $\mathcal{N} = 4$ super Yang-Mills theory is equivalent
to 10-dimensional supergravity compactified on $AdS_5 \times S^5$. However, the consistency of the
supergravity theory requires string theory at a deeper level. Thus supersymmetric four dimensional
Yang-Mills theory is equivalent to type IIB theory compactified on $AdS_5 \times S^5$. The relations among
the parameters are

Sym = ;^ = 47rp,t,., and (7-10) 

Let us very briefly recall some essential features of the Anti-de Sitter space. The Einstein-Hilbert
action in the presence of a cosmological constant term is

$$ S = \int d^Dx\,\sqrt{|g|}\,[R + \Lambda] \qquad (7.11) $$

We consider D-dimensional spacetime with Minkowski signature. The field equations are

$$ R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = \frac{1}{2}\Lambda g_{\mu\nu} \qquad (7.12) $$

Taking the trace of this equation, we can determine the curvature scalar R in terms of $\Lambda$, $R = D\Lambda/(2 - D)$, and then
derive the relation

$$ R_{\mu\nu} = \frac{\Lambda}{2 - D}\,g_{\mu\nu} \qquad (7.13) $$





In this case the Ricci tensor is proportional to the metric and these are Einstein spaces. This is
also a maximally symmetric space [129] with the property that

$$ R_{\mu\nu\rho\lambda} = \frac{R}{D(D-1)}\,(g_{\nu\lambda}g_{\mu\rho} - g_{\nu\rho}g_{\mu\lambda}) \qquad (7.14) $$

Examples of such spaces, with nonzero curvature, are de Sitter, Anti-de Sitter and D-spheres.
In this sign convention, AdS space has positive cosmological constant. The AdS space is best
described by an embedding. We start with a D + 1 dimensional pseudo-Euclidean embedding space
with coordinates $\{y^a\} = (y^0, y^1, \ldots, y^{D-1}, y^D)$ and metric $\eta = \mathrm{diag}(+, -, \ldots, -, +)$, and the distance
squared is

$$ y^2 = (y^0)^2 + (y^D)^2 - \sum_{n=1}^{D-1}(y^n)^2 \qquad (7.15) $$

Note the appearance of two time coordinates from the form of the metric. The length remains
invariant under SO(D-1,2) global transformations

$$ y^a \to y'^a = L^a{}_b\,y^b \qquad (7.16) $$

where $L^a{}_b$ is an SO(D-1,2) matrix. The locus of

$$ y^2 = \text{constant} \qquad (7.17) $$

defines $AdS_D$. It is worth noting that the invariance group for theories defined on $AdS_D$
has the same number of generators as that of the D-dimensional flat space, that is D generators corresponding to translations
and $\frac{1}{2}D(D-1)$ generators from Lorentz rotations.

Next, let us consider the conformal group in D-dimensional Euclidean space $E^D$. In this
case the Poincaré group has altogether $\frac{1}{2}D(D+1)$ generators (D translations and the rest from the Lorentz
group). Then we have the following extra generators:

$$ x \to \lambda x \qquad (7.18) $$

this is dilation and $\lambda$ is a real number. Furthermore, there is the special conformal transformation

$$ \frac{x^\mu}{x^2} \to \frac{x'^\mu}{x'^2} = \frac{x^\mu}{x^2} + a^\mu \qquad (7.19) $$

This transformation involves D parameters $a^\mu$. The transformation (7.19) can be rewritten as

$$ x^\mu \to x'^\mu = \frac{x^\mu + a^\mu x^2}{1 + 2x^\nu a_\nu + a^2x^2} \qquad (7.20) $$

Thus we see that the total number of generators is $\frac{1}{2}D(D+1) + 1 + D = \frac{1}{2}(D+1)(D+2)$.
This is the same as the number of isometry generators of $AdS_{D+1}$ space. Indeed, in view of the recent
developments, one can establish the connection that the isometry group of $AdS_{D+1}$, SO(2,D), acts
on the boundary as the conformal group acting on Minkowski/Euclidean space. We list below the
generators of the conformal group and their algebra


Px] = i{9vxPp - 9pxPu) (7.21) 

-^Ap] “ QpX^vp d" 9vX^ pp 9yp^^p^^ (7.22) 

Kx] = iigyxK^ ~ 9pxK^ (7.23) 

$[D, P_\mu] = iP_\mu$ (7.24)

$[D, K_\mu] = -iK_\mu$ (7.25)

[Pp,K.] = 2i{g^u -b (7.26) 





The generators of conformal transformation have the following representations, when we choose 
Cartesian coordinate system and consider transformation properties of a real scalar field: = 

D = = x^P^ - corresponding to translation, 

Lorentz transformation, dilation and special conformal transformations respectively. 
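The generator counting used above to match the conformal group in D dimensions with the isometry group of $AdS_{D+1}$ is simple enough to check mechanically. The sketch below is an illustration only, with hypothetical function names; it uses the fact that a rotation group acting on an n-dimensional space with any signature has $n(n-1)/2$ generators.

```python
def conformal_generators(D):
    # Translations, rotations (or Lorentz transformations), one dilation
    # and D special conformal transformations.
    translations = D
    rotations = D * (D - 1) // 2
    dilation = 1
    special_conformal = D
    return translations + rotations + dilation + special_conformal

def so_dimension(n):
    # Number of generators of SO(p, q) with p + q = n.
    return n * (n - 1) // 2

if __name__ == "__main__":
    for D in range(1, 7):
        print(D, conformal_generators(D), so_dimension(D + 2))
```

For D = 4 both counts give 15, the dimension of SO(4,2), which is the case relevant for the four dimensional gauge theory discussed next.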

Let us discuss the evidence in support of Maldacena's conjecture. When we consider a collection
of D3-branes of the type IIB theory, we note that D3-branes couple to the 5-form field strength,
and N units of this flux pass through the five-sphere of the $AdS_5 \times S^5$ manifold. The isometry
group of $S^5$ is $SO(6)$, and $AdS_5$ is endowed with the isometry group $SO(4,2)$, as we have just
mentioned. The IIB theory has fermions, and therefore it is more relevant to consider the covering
groups $SU(4)$ and $SU(2,2)$ of $SO(6)$ and $SO(4,2)$ respectively. We also know that type IIB theory
has 32 Majorana supercharges. These supersymmetries are preserved by the background under
consideration. The invariance group is the super Lie group $SU(2,2|4)$ for this theory. On the super
Yang-Mills side, one has to examine how the above symmetry appears in the boundary theory.
We have mentioned how the conformal group, for the case at hand, is to be identified as $SO(4,2)$
or $SU(2,2)$. It is well known that $N = 4$ super Yang-Mills theory is conformally invariant in four
dimensions, since the theory has vanishing $\beta$-function [130], and thus the origin of the conformal
group is well understood. Let us now focus our attention on the other symmetries present in
type IIB theory. The ten dimensional super Yang-Mills theory has gauge bosons $A^a_\mu$, $\mu = 0, 1, \ldots, 9$,
$a$ being the $U(N)$ group index, and thus there are 8 physical states corresponding to each gauge field.
The superpartners are Majorana-Weyl gauginos in matching numbers. The theory has 16
Majorana supercharges in $D = 10$. When we consider the 4-dimensional action, dimensionally
reduced from ten dimensions [48, 49], the physical degrees of freedom of each ten dimensional
gauge field decompose into 2 (corresponding to the physical degrees of freedom of a gauge field in
$D = 4$) and six scalars $\phi^i$, $i = 1, 2, \ldots, 6$; the group index $a$ is suppressed from now on. The
gauginos are given by the Weyl spinors $\lambda^A_\alpha$, $A = 1, 2, 3, 4$, $\alpha = 1, 2$. One of these fermions,
together with the gauge field, can be grouped to define a vector superfield. The remaining three
spinors can be grouped with the scalars (which appeared after dimensional reduction) to define 3
chiral superfields. The 16 supercharges can be grouped into 4 sets of complex Majorana charges
$Q^A_\alpha$, $\bar{Q}_{\dot\alpha A}$, $A = 1, 2, 3, 4$ and $\alpha = 1, 2$. These two sets of supercharges transform as {4} and {$\bar{4}$} of the
R-symmetry group $SU(4)$. The scalars $\phi^i$ transform as {6} of $SO(6)$; since we deal with the
covering group $SU(4)$, the scalars transform in the antisymmetric rank 2 representation of
$SU(4)$. We see that type IIB theory has 32 supercharges, but the super Yang-Mills theory has only 16 of
them. We know from the discussion in Sec. IV that in the presence of coincident D3-branes, half
of the supersymmetries are preserved. When we consider the superconformal algebra, the rest also
appear as the extension of the superconformal group [131].
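The counting in this paragraph can be retraced with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the function names are illustrative, and the statement that a massless vector in $D$ dimensions has $D - 2$ on-shell polarisations is the standard assumption behind the "8 physical states" quoted above.

```python
def vector_states(dim):
    """On-shell polarisations of a massless vector field in `dim` dimensions."""
    return dim - 2

def reduce_vector(dim_from, dim_to):
    """Split a higher-dimensional vector into (vector states, scalars) after reduction."""
    return vector_states(dim_to), dim_from - dim_to

def majorana_weyl_states_10d():
    """On-shell states of one Majorana-Weyl spinor in D = 10."""
    real_components = 2 * 2 ** (10 // 2)  # Dirac spinor: 32 complex = 64 real components
    real_components //= 2                 # Majorana condition
    real_components //= 2                 # Weyl condition, leaving 16 real components
    return real_components // 2           # the equation of motion halves the count on shell

vector, scalars = reduce_vector(10, 4)
bosons = vector + scalars                 # 2 gauge field states + 6 scalars
fermions = 4 * 2                          # four Weyl gauginos, two on-shell states each
supercharges = 4 * 2 * 2                  # Q^A_alpha: 4 x 2 complex = 16 real components
print(bosons, fermions, majorana_weyl_states_10d(), supercharges)
```

The bosonic and fermionic counts agree before and after the reduction, which is the matching the text refers to.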

Another important nonperturbative symmetry of type IIB theory is the $SL(2, Z)$ symmetry, where
the dilaton and axion define the moduli. In the Yang-Mills sector the S-duality symmetry is robust
and is known to be, again, $SL(2, Z)$. In this case, the modular parameter is
$\tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g_{YM}^2}$, whereas in the former case it is $\tau = \chi + i e^{-\phi}$.

The preceding discussion was meant to show that the symmetry properties of the type IIB
theory and those of $N = 4$ super Yang-Mills are the same. It is important to investigate which
physical properties are common to both theories. Indeed, if the two theories are equivalent,
it should be possible to identify a physical field $\phi$ in the bulk theory and find the corresponding
object in the boundary theory. Then one of the tests will be to compute the correlators involving
the relevant objects in each of the theories and check their consistency. Thus it is important to identify
the physical quantities (operators) in both theories. In the case of the boundary theory, one
obvious criterion is to choose gauge invariant operators while computing the correlators.
One could formally express the equivalence between the theories through the relation among the
generating functionals,

$$ e^{-S_{II}[\phi(x_0, x_\mu, u)]} = \int \mathcal{D}A \; e^{-\left(S_{YM}[A] + \mathcal{O}[A]\,J\right)} \qquad (7.27) $$

The l.h.s. of the above equation is to be identified as the generating function for the supergravity
theory (rather, the low energy limit of IIB theory). The action $S_{II}$ is determined in terms of the
massless states of the supergravity and the Kaluza-Klein towers, and these are collectively denoted
as $\phi(x_0, x_\mu, u)$. Here the coordinates $x_0$ and $x_\mu$, with $\mu$ taking values 0, 1, 2, 3, are to be identified as
the AdS coordinates and $u$ is the coordinate on the five-sphere. Moreover, the presence of $J(x)$
implies that it also depends on the boundary data of the bulk fields. The r.h.s. defines the
generating function for $N = 4$ super Yang-Mills theory; however, one only computes the correlation
functions of gauge invariant composite operators, denoted by $\mathcal{O}(A)$, with couplings to $J(x)$. In this
general setting [122, 123, 132], one is able to compute the correlation functions in both
theories and establish the correspondence between them. Let us consider a simple
example as an illustration: a minimally coupled scalar in the bulk theory, which can be
identified with the dilaton. The action in the bulk for the dilaton on $AdS_5 \times S^5$ is

$$ S = \frac{\pi^3 R^8}{4 G_{10}} \int d^5x \, \sqrt{g}\; g^{\mu\nu}\, \partial_\mu \phi \, \partial_\nu \phi \qquad (7.28) $$

The factor $\pi^3$ comes from the volume of $S^5$, through the implicit assumption that $\phi$ has no
dependence on the coordinates of the five-sphere. The metric $g_{\mu\nu} = \frac{1}{x_0^2}\delta_{\mu\nu}$ is the metric on $AdS_5$, now in the
Poincaré coordinates. For large $\lambda \gg 1$, classical supergravity can be taken to be a good
approximation (7.10). The dilaton equation of motion is given by

$$ \partial_\mu \left( \sqrt{g}\; g^{\mu\nu}\, \partial_\nu \phi \right) = 0 \qquad (7.29) $$

Of course, this equation can be solved by the standard Green's function method. The purpose is
to determine the generating function with the value of the dilaton computed on the boundary; call it $\phi_0$,
which is the value of $\phi$ as $x_0 \to 0$. Thus we can write


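$$ \phi(x_0, \vec{x}) = \int d^4z \, K(x_0, \vec{x}, \vec{z}) \, \phi_0(\vec{z}) \qquad (7.30) $$

Here the vectors refer to four dimensional vectors on the boundary space and the Green's function is
defined as

$$ K(x_0, \vec{x}, \vec{z}) \propto \frac{x_0^4}{\left( x_0^2 + |\vec{x} - \vec{z}|^2 \right)^4} \qquad (7.31) $$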
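That a kernel of this type solves the dilaton equation of motion (7.29) can be checked numerically. The sketch below assumes the Poincaré metric $g_{\mu\nu} = \delta_{\mu\nu}/x_0^2$ in five dimensions, for which (7.29) reduces to $\partial_0(x_0^{-3}\partial_0\phi) + x_0^{-3}\nabla^2\phi = 0$, and the standard unnormalised kernel $x_0^\Delta/(x_0^2 + r^2)^\Delta$ with $r = |\vec{x} - \vec{z}|$ and $\Delta = 4$. Both the kernel normalisation and the finite-difference scheme are illustrative assumptions.

```python
def kernel(x0, r, delta=4):
    """Unnormalised kernel x0^delta / (x0^2 + r^2)^delta, with r = |x - z|."""
    return x0 ** delta / (x0 ** 2 + r ** 2) ** delta

def wave_operator(f, x0, r, h=1e-4):
    """Central-difference value of d_0(x0^-3 d_0 f) + x0^-3 (f_rr + 3 f_r / r).

    This is d_mu(sqrt(g) g^{mu nu} d_nu f) for g_{mu nu} = delta_{mu nu} / x0^2,
    applied to a function that depends on the boundary point only through r.
    """
    def flux(y):
        # x0^-3 d_0 f evaluated at x0 = y
        return (f(y + h / 2, r) - f(y - h / 2, r)) / h / y ** 3

    d0_part = (flux(x0 + h / 2) - flux(x0 - h / 2)) / h
    f_r = (f(x0, r + h) - f(x0, r - h)) / (2 * h)
    f_rr = (f(x0, r + h) - 2 * f(x0, r) + f(x0, r - h)) / h ** 2
    return d0_part + (f_rr + 3 * f_r / r) / x0 ** 3   # 3 / r: radial Laplacian in 4 dimensions

x0, r = 1.0, 0.7
massless = wave_operator(kernel, x0, r)
other = wave_operator(lambda a, b: kernel(a, b, delta=2), x0, r)
print(abs(massless) < 1e-3)   # the delta = 4 kernel satisfies the massless equation
print(abs(other) > 1.0)       # a delta = 2 kernel does not
```

For general $\Delta$ the same operator returns $\Delta(\Delta - 4)\, x_0^{\Delta - 5}/(x_0^2 + r^2)^\Delta$, which is why only $\Delta = 4$ (besides the trivial $\Delta = 0$) corresponds to a massless field.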


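Now, one can insert the solution for $\phi$ into the action to determine it at the classical value of
the dilaton,

$$ S = \frac{\pi^3 R^8}{4 G_{10}} \int d^4x \int_\epsilon^\infty \frac{dx_0}{x_0^3} \left[ (\partial_0 \phi)^2 + (\partial_i \phi)^2 \right]; $$

$\epsilon$ is the cut off for the lower limit of integration. Once the expression for $\phi$ is inserted into the action,
it is possible to take the cut off to zero and everything is finite. The action is given by

$$ S = \frac{\pi^3 R^8}{4 G_{10}} \int d^4x \left[ \frac{\phi \, \partial_0 \phi}{x_0^3} \right]_\epsilon^\infty \qquad (7.32) $$

$$ S \propto \frac{\pi^3 R^8}{4 G_{10}} \int d^4x \, d^4z \, \frac{\phi_0(\vec{x}) \, \phi_0(\vec{z})}{|\vec{x} - \vec{z}|^8} \qquad (7.33) $$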


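Then the generating function can be obtained by exponentiating this action. On the super Yang-Mills
side, since it is a conformal field theory in four dimensions, the quadratic of the Yang-Mills field
strength has dimension 4 and the product of two such terms behaves as

$$ \langle \mathrm{Tr} F^2(\vec{x}) \; \mathrm{Tr} F^2(\vec{z}) \rangle \sim \frac{1}{|\vec{x} - \vec{z}|^8} \qquad (7.34) $$

If we want to determine the dilaton correlation function on the boundary, we compute

$$ \frac{\delta^2 S}{\delta \phi_0(\vec{x}) \, \delta \phi_0(\vec{z})} \sim \frac{1}{|\vec{x} - \vec{z}|^8} \qquad (7.35) $$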




Now comparing (7.34) and (7.35) we find that they are in agreement. If one considers a metric
perturbation of the form $g_{\mu\nu} = \bar{g}_{\mu\nu} + h_{\mu\nu}$ and computes the two point correlation of this
perturbation on the brane, taking the boundary limit, this correlation is identical to the correlation
of stress energy momentum tensors (a product of a pair of them, just as we took the correlation of two
terms while identifying the dilaton two point functions).

Let us recall that the 't Hooft coupling $\lambda = g_{YM}^2 N$ and the length parameter $R^4 = 4\pi l_s^4 N g_{str}$
are related. If we hold $\lambda$ fixed and let $N \to \infty$, then the string coupling tends to zero. Therefore,
string perturbation theory can give reliable results in this limit. Thus, one can get a full quantum
theoretic description of the Yang-Mills theory in the $N \to \infty$ limit. If, instead of holding $\lambda$ fixed, we
allow it to take large values, then in the domain where the AdS radius is kept constant, the relevant
limit is $\alpha' \to 0$. We know that in the zero slope limit the string theory goes over to supergravity
theory. We saw the matching of AdS/CFT in this limit. But the consequence of the Maldacena
conjecture is very interesting in this regime: it tells us how the superconformal gauge theory in
the $N \to \infty$ limit behaves in the strong coupling domain. Of course, the example we have been
considering is one where the $\beta$-function of the theory vanishes identically and therefore it is
not a realistic theory if we want to establish a connection with supersymmetric gauge theories which
have running coupling constants leading to asymptotic freedom. There are attempts to construct
field theories which will have broken SUSY and conformal invariance (for example, classical SQCD
is scale invariant, but in the quantized theory scale invariance is broken). Witten [133] has proposed
that one should consider $AdS_7 \times S^4$. The resulting boundary theory corresponds to a 6-dimensional
theory whose action is yet to be explicitly constructed. Then one compactifies the theory on $T^2$
and requires that the fermions satisfy anti-periodic boundary conditions around a cycle of the two-torus.
Then the boundary theory is a four dimensional one. Conformal invariance and supersymmetry
are broken in this 4-dimensional theory and we have a pure gauge theory with large N.
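The two limits described above can be made concrete with a short numerical sketch. It uses the relation $R^4 = 4\pi l_s^4 N g_{str}$ from the text and, as an additional assumption not stated in this passage, the identification $g_{YM}^2 = 4\pi g_{str}$, so that $R^4 / l_s^4 = \lambda$. The function names are illustrative.

```python
import math

def string_coupling(lam, n_colors):
    """g_str at fixed 't Hooft coupling lam, assuming g_YM^2 = 4 pi g_str."""
    return lam / (4 * math.pi * n_colors)

def radius_over_string_length(g_str, n_colors):
    """R / l_s obtained from R^4 = 4 pi l_s^4 N g_str."""
    return (4 * math.pi * n_colors * g_str) ** 0.25

lam = 100.0
for n_colors in (10, 1000, 100000):
    g_str = string_coupling(lam, n_colors)
    # g_str shrinks like 1 / N while R / l_s stays fixed at lam ** 0.25
    print(n_colors, g_str, radius_over_string_length(g_str, n_colors))
```

Under these assumptions, large $N$ at fixed $\lambda$ suppresses string loops, while large $\lambda$ makes the AdS radius large in string units, which is the supergravity regime $\alpha' \to 0$ at fixed $R$.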

There have been rapid developments in studying the interconnection between supergravity (rather
type IIB) theory on AdS space and boundary gauge theory. Several important issues pertaining
to string theory and gauge theories have been addressed in this context. We refer to some of the
recent review articles on this subject [134, 135, 136].


8 Summary and Conclusion

We have made some efforts to convey to the reader some of the interesting and important devel- 
opments in string theory through this article. It is not possible to include all developments in 
the field in diverse directions in an article of this nature. A global perspective of string theory is 
contained in the article of John Schwarz [137] in this volume. We may recall that the research in 
string theory has stimulated progress in other fields such as mathematics, quantum field theory 
and statistical mechanics of lower dimensional systems to mention a few areas. We have seen that 
string theory has made very important contributions to our understanding of the physics of the 
black holes. As we have mentioned, for a special class of black holes, the Bekenstein-Hawking
entropy formula could be derived from an underlying microscopic theory. Similarly, the nature
of the Hawking radiation from a stringy black hole, slightly away from extremality, could be derived
from the theory.

We have noted that there are intimate connections between the five string theories. Some of them
are interrelated through dualities in ten dimensions and some are related in lower dimensions.
Thus it is recognized that dualities have a special role in our understanding of string dynamics.
Moreover, there is increasing evidence that there is a unique, fundamental theory and the five
perturbatively consistent string theories are various phases of the fundamental theory. It is argued
that M-theory might be that theory and the low energy effective action of M-theory is to be iden- 
tified with the eleven dimensional supergravity theory. In this context, we discussed the M(atrix) 
model proposal to show that the model captures many important features of M-theory. 

Recently, the conjecture due to Maldacena has attracted considerable attention since it provides
an important connection between supergravity in the bulk and the supersymmetric gauge theories
living on the boundary. The connection between type IIB theory on $AdS_5 \times S^5$ and $N = 4$
supersymmetric gauge theory on the boundary has been at the center of attention. Furthermore, there
are interesting developments in the study of theories on $AdS_3$ and the corresponding two dimensional
conformal field theories.

One of the most important achievements of string theories has been to address important issues 
in quantum gravity and provide answers to some of the puzzles. However, the theory is yet to 
provide a satisfactory answer to the cosmological constant problem. The cosmological constant is 
a parameter in physics which is measured to be closest to zero. It plays a dual role. When we look
at it from the point of view of macroscopic physics, the smallness of the constant conveys to us that 
the Universe is very large and it is flat. On the other hand, it is expected that, the cosmological 
constant, like other parameters in Nature, should be explained from a microscopic theory and the 
short distance physics, i.e. quantum gravity, will explain the smallness of the cosmological con- 
stant. Therefore, one expects that string theory will be able to resolve this outstanding problem 
[138, 139]. The author along with his collaborators had made an attempt in this direction [140]. It 
is expected that string theory will provide us clues to understand the creation of the Universe and 
the evolution of the Universe in early epochs. Indeed, string cosmology has attracted considerable 
attention in recent years; however, we have not included discussions on this topic in this article
due to limitations of space. Indeed, string cosmology makes several predictions which might be
subjected to experimental tests in the next few years [141].

Acknowledgments 


I would like to thank P. Majumdar, S. Panda, B. Sathiapalan and J. H. Schwarz for their
suggestions and advice. I would like to thank the Yukawa Institute for Theoretical Physics, where most
of this article was written, and Professors Maskawa, Ninomiya and Sasaki for their very warm
hospitality.


References 

1. G. Veneziano, Nuovo Cimento, 57A, 190(1968) 

2. E. Donini and S. Sciuto, Ann. Phys. 58, 388(1970).

3. D. B. Fairlie and H. B. Nielsen, Nucl. Phys. B20, 637(1970); C. S. Hsue, B. Sakita and M.
A. Virasoro, Phys. Rev. D2, 2857(1970).

4. Y. Nambu, ‘Quark Model and Factorization of Veneziano Amplitude’, in Symmetries and
quark model, Ed. R. Chand (Gordon and Breach), 1970; H. B. Nielsen, ‘An almost physical
interpretation of the integrand of the n-point Veneziano amplitude’, submitted to the 15th
International Conference on High Energy Physics, (Kiev); L. Susskind, Nuovo Cim. 69A,
457(1970).

5. M. A. Virasoro, Phys. Rev. 177, 177(1969).

6. J. Shapiro, Phys. Lett. 33B, 361(1970). 

7. J. Scherk and J. H. Schwarz, Nucl. Phys. B81, 118(1974). 

8. M. B. Green and J. H. Schwarz, Phys. Lett. B149, 117(1984). 

9. E. Witten, Phys. Lett. B149, 35(1984). 

10. D. J. Gross, J. A. Harvey, E. Martinec and R. Rohm, Phys. Rev. Lett. 54, 502(1985); Nucl.
Phys. B256, 253(1985); Nucl. Phys. B267, 75(1986).

11. M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory, Vol I and Vol II, Cambridge 
University Press, 1987. 





12. J. Polchinski, String Theory, Vol I and Vol II, Cambridge University Press, 1998. 

13. V. Alessandrini, D. Amati, M. Le Bellac and D. I. Olive, Phys. Rep. 1C, 269(1971).

14. J. H. Schwarz, Phys. Rep. 8C, 269(1973). 

15. G. Veneziano, Phys. Rep. 12C, 1(1974). 

16. C. Rebbi, Phys. Rep. 12C, 259(1974). 

17. J. Scherk, Rev. Mod. Phys. 47, 123(1975). 

18. J. H. Schwarz, Phys. Rep. 89C, 223(1982). 

19. Int. J. Mod. Phys.A9, 3707(1994). 

20. A. Giveon, M. Porrati and E. Rabinovici, Phys. Rep. C244, 77(1994). 

21. M. Duff, R. Khuri, and J. Lu, Phys. Rep. 259C, 213(1995). 

22. S. Chaudhury, C. Johnson and J. Polchinski, hep-th/9602052. 

23. J. H. Schwarz, Nucl. Phys. Suppl. B55, 1(1997). 

24. J. Polchinski, Rev. Mod. Phys. 68, 1245(1996). 

25. M. J. Duff, hep-th/9611203. 

26. P. K. Townsend, hep-th/9612121. 

27. M. Douglas, hep-th/9610041. 

28. P. K. Townsend, gr-qc/9707012; hep-th/9712004. 

29. C. Vafa, hep-th/9702201. 

30. E. Kiritsis, hep-th/9708130. 

31. D. Youm, hep-th/9710046.

32. T. Banks, hep-th/9710231. 

33. D. Bigatti and L. Susskind, hep-th/9712072. 

34. Y. Nambu, ‘Duality and Hydrodynamics’, Lecture at the Copenhagen Symposium, 1970; T.
Goto, Prog. Th. Phys. 46, 1560(1971); Y. Hara, Prog. Th. Phys. 46, 1549(1971).

35. A. M. Polyakov, Phys. Lett. 103B, 207(1981). 

36. S. Deser and B. Zumino, Phys. Lett. 62B, 369(1976); L. Brink, P. Di Vecchia and P. Howe, 
Phys. Lett. 65B, 471(1976); A. M. Polyakov, Phys. Lett. 103B, 211(1981). 

37. S. Fubini, J. Maharana, M. Roncadelli and G. Veneziano, Nucl. Phys. B316, 36(1989).

38. A. Neveu and J. H. Schwarz, Phys. Rev. D4, 1109(1971); Nucl. Phys. B31, 86(1971).

39. P. Ramond, Phys. Rev. D3, 2415(1971).

40. F. Gliozzi, J. Scherk and D. Olive, Phys. Lett. 65, 282(1976); Nucl. Phys. B122, 253(1977).

41. C. Bachas, Lectures on D-branes, hep-th/9806199. 

42. A. Sen, Int. J. Mod. Phys. A9, 3707(1994). 

43. S. Fubini and G. Veneziano, Nuovo Cimento, 67A, 29(1970). 





44. L. Alvarez Gaume, D. Z. Freedman and S. Mukhi, Ann. Phys. 134, 85(1981). 

45. M. B. Green, J. H. Schwarz, L. Brink, Nucl. Phys. B198, 474(1982); K. Kikkawa and M.
Yamazaki, Phys. Lett. 149B, 357(1984); N. Sakai and I. Senda, Prog. Th. Phys. 75,
692(1984); V. P. Nair, A. Shapere, A. Strominger and F. Wilczek, Nucl. Phys. B287,
402(1987).

46. A. Shapere and F. Wilczek, Nucl. Phys. B320, 669(1989); A. Giveon, E. Rabinovici and G.
Veneziano, Nucl. Phys. B322, 167(1989); A. Giveon, N. Malkin, E. Rabinovici, Phys. Lett.
220B, 551(1989).

47. K. S. Narain, Phys. Lett. 169B, 41(1986) 

48. J. Scherk and J. H. Schwarz, Nucl. Phys. B153, 61(1979). 

49. J. Maharana and J. H. Schwarz, Nucl. Phys. B390, 3(1993). 

50. S. F. Hassan and A. Sen, Nucl. Phys. B375, 103(1992). 

51. G. Veneziano, Phys. Lett. 265B, 287(1991). 

52. K. Meissner and G. Veneziano, Phys. Lett. 267B, 33(1991). 

53. K. Meissner and G. Veneziano, Mod. Phys. Lett.A6, 3397(1991) 

54. M. Gasperini, J. Maharana and G. Veneziano, Phys. Lett. 272B, 277(1992); Phys. Lett. 
296B , 51(1993). 

55. A. Sen, Phys. Lett. 271B, 295(1991); Phys. Lett. 272B, 34(1992); Phys. Rev. Lett. 69 , 
1006(1992). 

56. G. ’t Hooft, Nucl. Phys. B79, 276(1974).

57. A. M. Polyakov, JETP Lett. 20, 194(1974).

58. C. Montonen and D. Olive, Phys. Lett. 72B, 117(1977).

59. E. Witten, Phys. Lett. 86B, 283(1979). 

60. E. Witten and D. Olive, Phys. Lett. 78B, 97(1978). 

61. A. Font, L. Ibanez, D. Lust and F. Quevedo, Phys. Lett. B249, 35(1990); A. Shapere, S.
Trivedi and F. Wilczek, Mod. Phys. Lett. A6, 2677(1991).

62. S. J. Rey, Phys. Rev. D43, 526(1991). 

63. J. H. Schwarz and A. Sen, Phys. Lett. B312, 105(1993); Nucl. Phys. B411, 35(1994). 

64. J. H. Schwarz, Dilaton-Axion Symmetry, Talk at the International Workshop on String The- 
ory, Quantum Gravity and Unification of Fundamental Interactions, Rome, September 1992; 
hep-th/9209125. 

65. A. Sen, Phys. Lett. 329B, 217(1994). 

66. N. Seiberg and E. Witten, Nucl. Phys. B426, 19(1995). 

67. A. Dabholkar, G. Gibbons, J. A. Harvey and F. Ruiz Ruiz, Nucl. Phys. B340, 33(1990); A.
Dabholkar and J. A. Harvey, Phys. Rev. Lett. 63, 478(1989).

68. G. T. Horowitz and A. Strominger, Nucl. Phys. B360, 197(1991).

69. J. Polchinski, Phys. Rev. Lett. 75, 4724(1995).





70. M. Duff, Nucl. Phys. B442, 47(1995). 

71. C. Hull and P. K. Townsend, Nucl. Phys. B438, 109(1995).

72. M. Duff, Nucl. Phys. B442, 47(1995); M. Duff and R. Khuri, Nucl. Phys. 411, 473(1994). 

73. E. Witten, Nucl. Phys. B443, 85(1995). 

74. A. Sen, Nucl. Phys.B450, 103(1995). 

75. J. A. Harvey and A. Strominger, Nucl. Phys. 449, 535(1995) 

76. M. Dine, P. Huet and N. Seiberg, Nucl. Phys. B322, 301(1989); J. Dai, R. G. Leigh and J. 
Polchinski, Mod. Phys. Lett. A4, 2073(1989). 

77. F. Giani and M. Pernici, Phys. Rev. D30, 325(1984); I. Campbell and P. West, Nucl. Phys.
B243, 112(1984); M. Huq and M. Namazie, Class. Quant. Grav. 2, 293(1985).

78. P. K. Townsend, Phys. Lett. 350B, 184(1995).

79. M. J. Duff, P. S. Howe, T. Inami and K. S. Stelle, Phys. Lett. 191B, 70(1987); M. J. Duff
and K. Stelle, Phys. Lett. 253B, 113(1991).

80. J. H. Schwarz, Phys. Lett. 367B, 97(1996). 

81. J. H. Schwarz, Phys. Lett. 360B, 13(1995). J. Polchinski and E. Witten, Nucl. Phys. B 
460, 525 (1996); J. Polchinski, Rev. Mod. Phys. 68, 1245(1996); M. J. Duff, Int. J. Mod. 
Phys. A 11, 5623 (1996). 

82. C. Hull and P. Townsend, Nucl. Phys. B 438, 109 (1995).

83. E. Witten, Nucl. Phys. B 443, 85 (1995). 

84. P. Horava and E. Witten, Nucl. Phys. B460, 506(1996); Nucl. Phys. B475, 94(1996). 

85. K. Dasgupta and S. Mukhi, Nucl. Phys. 465, 399(1996). 

86. E. Witten, Nucl. Phys. B463, 383(1996). 

87. A. Sen, Mod. Phys. Lett. A11, 1339(1996).

88. J. Bardeen, B. Carter and S. W. Hawking, Comm. Math. Phys. 31, 161(1973).

89. J. Bekenstein, Lett. Nuov. Cimento 4, 737(1972); Phys. Rev. D7, 2333(1973); Phys. Rev.
D9, 3292(1974).

90. S. W. Hawking, Nature 248, 30(1974); Commun. Math. Phys. 43, 199(1975).

91. L. Susskind and J. Uglum, Phys. Rev. D50, 2700(1994); J. Russo and L. Susskind, Nucl.
Phys. B437, 611(1997).

92. S. Fubini and G. Veneziano, Nuovo Cimento 64A, 811(1970).

93. A. Sen, Mod. Phys. Lett. A10, 2081(1995).

94. A. Strominger and C. Vafa, Phys. Lett. 379B, 99(1996). 

95. E. Witten, Nucl. Phys. B460, 335(1996). 

96. A. Sen, Phys. Rev. D54, 2964(1996); Phys. Rev. D53, 2874(1996). 

97. C. Vafa, Nucl. Phys. B463, 415(1996). 

98. C. Vafa, Nucl. Phys. B463, 435(1996). 





99. S. R. Das and S. D. Mathur, hep-th/9601152. 

100. C. G. Callan and J. M. Maldacena, Nucl. Phys. B472, 591(1996). 

101. J. M. Maldacena, Black holes in string theory, hep-th/9607235; this Princeton University 
thesis has a comprehensive presentation of black hole entropy and Hawking radiation derived 
from string theory. 

102. S. R. Das and S. D. Mathur, Nucl. Phys. 478, 561(1996); Nucl. Phys. B482, 153(1996). 

103. S. Gubser and I. Klebanov, Nucl. Phys. B482, 173(1996); Phys. Rev. Lett. 77, 4491(1996).

104. A. Dhar, G. Mandal and S. R. Wadia, Phys. Lett. 388B, 51(1996). 

105. T. Banks, W. Fischler, S. H. Shenker and L. Susskind, Phys. Rev. D55, 112(1997) 

106. A. Bilal, M(atrix) theory: a pedagogical introduction, hep-th/9710136.

107. S. Fubini and G. Furlan, Physics, 1, 229(1965); S. L. Adler, Phys. Rev. Lett. 14, 1051(1965).

108. S. Weinberg, Phys. Rev. 150, 1313(1966). 

109. J. Kogut and L. Susskind, Phys. Rep. 8C, 75(1973). 

110. U. H. Danielsson, G. Ferrari and B. Sundborg, Int. J. Mod. Phys. A11, 5463(1996); D. Kabat
and P. Pouliot, Phys. Rev. Lett. 77, 1004(1996).

111. R. P. Feynman, Photon Hadron Collisions, Benjamin, 1973. 

112. S. Sethi and M. Stern, Commun. Math. Phys. 194, 675(1998) 

113. M. Porrati and A. Rozenberg, Nucl. Phys. B515, 184(1998). 

114. K. Becker and M. Becker, Nucl. Phys. B506, 48(1997).

115. A. Achucarro, J. M. Evans, P. K. Townsend and D. L. Wiltshire, Phys. Lett. 198B,
441(1987).

116. B. de Wit, M. Lüscher and H. Nicolai, Nucl. Phys. B305 [FS23], 545(1988).

117. T. Banks and N. Seiberg, Nucl. Phys. B497, 41(1997); R. Dijkgraaf, E. Verlinde and H. 
Verlinde, Nucl. Phys. B500, 43(1997). 

118. N. Ishibashi, H. Kawai, Y. Kitazawa and A. Tsuchiya, Nucl. Phys. B498, 467(1997) 

119. Y. Makeenko, Three Introductory Lectures in Helsinki on Matrix Models of Superstrings,
hep-th/9704075.

120. J. Maldacena, Adv. Theor. Math. Phys. 2, 231(1998). 

121. G. ’t Hooft, Nucl. Phys. B72, 461(1974). 

122. S. S. Gubser, I. Klebanov and A. M. Polyakov, Phys. Lett. 428B, 105(1998). 

123. E. Witten, Adv. Theor. Math. Phys. 2, 253(1998).

124. E. Witten and L. Susskind, The Holographic Bound in Anti-de Sitter Space, hep-th/9805114. 

125. C. R. Stephens, G. ’t Hooft and B. F. Whiting, Class. Quant. Grav. 11, 621(1994); G. 't 
Hooft gr-qc/9310026. 

126. L. Susskind, J. Math. Phys. 36, 6377(1995). 

127. A. M. Polyakov, Nucl. Phys. B (Proc. Suppl.) 68, 1(1998).

128. A. Tseytlin, Nucl. Phys. B501, 41(1997).

129. S. Weinberg, Gravitation and Cosmology,

130. S. Mandelstam, Nucl. Phys. B213, 149(1983).

131. R. Haag, J. T. Lopuszanski and M. Sohnius, Nucl. Phys. B88, 257(1975).

132. S. Ferrara, C. Fronsdal and A. Zaffaroni, Nucl. Phys. B532, 153(1998).

133. E. Witten, Adv. Theor. Math. Phys. 2, 505(1998). 

134. J. L. Petersen, Introduction to the Maldacena Conjecture on AdS/CFT, hep-th/9902131. 

135. P. Di Vecchia, An Introduction to AdS/CFT equivalence, hep-th/9903007. 

136. M. B. Green, Interconnections between type II superstrings, M theory and N = 4
supersymmetric Yang-Mills, hep-th/9903124.

137. J. H. Schwarz, in this Volume.

138. E. Witten, Mod. Phys. Lett. A10, 2153(1995).

139. K. Becker, M. Becker and A. Strominger, Phys. Rev. D51, 6603(1995). 

140. S. Kar, J. Maharana and H. Singh, Phys. Lett. B374, 43(1996). 

141. G. Veneziano, CERN Preprint, CERN-TH/98-43, hep-th/9802057. 



25. Yang-Mills Theory and Matrix String Theory 

L. Bonora * 

International School for Advanced Studies (SISSA/ISAS) 

Via Beirut 2-4, 34014 Trieste, Italy, and INFN, Sezione di Trieste 


Abstract 

This is a review of some recent developments in the study of the relation between Yang-Mills
theory and strings. The relation we are concerned with here is based on classical Yang-Mills
theory solutions called Riemannian instantons. The latter are two-dimensional solutions
describing, in the strong coupling limit, Riemann surfaces. They therefore lend themselves to an
interpretation in terms of string theory interaction. This interpretation is worked out in detail
for the so-called Matrix String Theory, i.e. for the 2d Yang-Mills theory with $\mathcal{N} = (8, 8)$
supersymmetry obtained as reduction of the $\mathcal{N} = 1$ Yang-Mills theory with gauge group
U(N) in 10d. In fact it is argued that, in the strong coupling limit, the Matrix String Theory
describes type IIA superstring theory.


1 Introduction 

The rich structure of non-Abelian gauge theories is at the core of their successful employment as
theories of the elementary particles. It is this complicated structure that allows us to claim to
be able, at least potentially, to describe in a consistent way by the same theory both confinement
and asymptotic freedom, quarks, gluons, mesons and hadrons. Classical and quantum non-Abelian
gauge theories have been analyzed in countless papers in the past, but they do not cease to surprise
us by revealing from time to time new structure. This has happened also recently in connection
with the so-called second string revolution. Gauge theories have been naturally associated with
branes and this association has revealed new, previously unsuspected features. This review is
devoted to one such new development: the connection between Yang-Mills theories and strings via
classical configurations called Riemannian (stringy) instantons.

Actually the connection between Yang-Mills theories and strings is far from new and seems
to be multiform. That a link should exist was recognized long ago by 't Hooft [1]. In this case
Riemann surfaces appeared as an auxiliary structure underlying the relevant Feynman diagrams
and led to the 1/N expansion. This gave rise to a vast literature, see [3], especially in 2d, where
the connection with strings became more concrete: see for example QCD$_2$, analyzed for its
string-like properties [2]. More recently another connection was found between conformal SYM and IIB
supergravity/superstring theory at large N in the AdS geometrical framework [4]. Of a similar
nature is the link between non-supersymmetric Yang-Mills theories and type 0 strings [6], as well
as the non-critical string approach, pioneered by Polyakov [5]. All these are duality relations, i.e.
they relate a Yang-Mills theory in a given range of the coupling to a string theory in a related
range of the string coupling.

In this review we deal with a more direct link between string theory and non-Abelian Yang- 
Mills theory: not a duality relation between strings and Yang-Mills theories, but the emergence in 
the latter of classical solutions modeled over Riemann surfaces, which naturally lend themselves 
to a string interpretation. Such configurations exist in any non-Abelian Yang-Mills theory in 
dimensions 4 or higher, and even in 2d when adjoint matter is present. As will be seen, in some 
instances the string interpretation is clear. In other instances further analysis is needed. 


* Email: bonora@sissa





This string-like nature of Yang-Mills theories is something that could have been found long ago,
but in fact the first example was brought to light only after the proposal of Matrix Theory [9]. The
latter, in the large N limit, is expected to describe M theory. Therefore, upon compactifying it on a
circle, one should end up, in the appropriate limit, with type IIA superstring theory [10, 11, 12, 13].
Now, by compactifying Matrix Theory on a circle we obtain (see below) $\mathcal{N} = (8, 8)$ super-Yang-Mills
(SYM) on a cylindrical 2d space-time with gauge group U(N). Therefore the conjecture,
supported with various arguments [10, 11, 13], is that this theory represents in the strong Yang-Mills
coupling limit a theory of type II superstrings (see also [14, 15, 16]). Hereafter we refer to
this theory as Matrix String Theory (MST).

Let us briefly review how decisive evidence in favor of this identification has been found
recently. A first step in this direction was made in refs. [17, 18, 19], where it was pointed out
that MST contains BPS instanton solutions which interpolate between different initial and final
string configurations via suitable punctured Riemann surfaces. We often refer to them as stringy
or Riemannian instantons. Subsequently [20], it was shown that, in the strong coupling limit,
MST in the background of a given classical BPS instanton solution reduces to the Green-Schwarz
superstring theory plus a decoupled Maxwell theory, and that the leading term of the amplitude in
such a background is proportional to $g_s^{-\chi}$, where $g_s$ is the string coupling constant (i.e. the inverse
of the Yang-Mills coupling) and $\chi = 2 - 2h - n$ is the Euler characteristic of the Riemann surface
of genus $h$ with $n$ punctures, which characterizes the given classical solution. This is the result one
expects from perturbative string interaction theory. Needless to say, this is a strong confirmation of
the above-mentioned conjecture. Similar results have been seen to hold [22] also for the Heterotic
Matrix String Theory (HMST), i.e. for $\mathcal{N} = (8, 0)$ SYM in 2d with gauge group O(N): in the
strong Yang-Mills coupling limit one finds the heterotic superstring theory with suitably broken
gauge group.

This is not the end of the story, as far as the identification of strong coupling SYM and
superstring theories is concerned. One cannot limit oneself to a qualitative correspondence between
the two types of theories. In the path integral we have to integrate over all possible Riemannian
instantons (modulo symmetries), therefore the question of instanton moduli is of paramount
importance. In fact a few precise things can be said in this regard. In [21], the study of the moduli
space of Riemannian instantons was taken up. Any such instanton consists of two ingredients, a
group theoretical factor and a core. The latter corresponds to a branched covering of the cylinder.
The group theoretical factor contains fields that satisfy WZNW-like equations with delta-function
sources. Inside these instantons, Riemann surfaces appear as branched coverings of the base
cylinder in the form of plane curves, i.e. the zero locus of order N polynomials of two complex variables.
One can then set out to study the moduli space of such curves. This problem is of utmost
importance because it turns out to be connected with the discrete parameter N and, therefore, it affects
the large N limit. As we will see, for finite N, the instantons of MST reproduce exactly only tree
string amplitudes, while they cover only part of the moduli space of higher genus Riemann surfaces
with punctures. More precisely, in a process with $n$ external string states mediated by a Riemann
surface of genus $h$, one expects $3h - 3 + n$ complex moduli; at finite N, in MST, $h$ of them are
discrete. In [21] it was argued that, when $N \to \infty$, these discrete moduli become continuous and
only then can one recover the full moduli space of Riemann surfaces.
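The counting quoted in the last two paragraphs can be tabulated with a small helper. The sketch below only encodes the formulas stated in the text ($\chi = 2 - 2h - n$, $3h - 3 + n$ complex moduli, $h$ of them discrete at finite N); the function names are illustrative, and the weight $g_s^{-\chi}$ is assumed to be the leading coupling dependence of the amplitude.

```python
def euler_characteristic(genus, punctures):
    """chi = 2 - 2h - n for a genus-h Riemann surface with n punctures."""
    return 2 - 2 * genus - punctures

def complex_moduli(genus, punctures):
    """Number of complex moduli, 3h - 3 + n (meaningful when this is non-negative)."""
    return 3 * genus - 3 + punctures

def moduli_at_finite_n(genus, punctures):
    """(discrete, continuous) split of the complex moduli in MST at finite N."""
    total = complex_moduli(genus, punctures)
    return genus, total - genus

def coupling_weight(g_s, genus, punctures):
    """Leading weight g_s^(-chi) of the amplitude in the instanton background."""
    return g_s ** (-euler_characteristic(genus, punctures))

for genus, punctures in [(0, 4), (1, 2), (2, 4)]:
    print(genus, punctures,
          euler_characteristic(genus, punctures),
          moduli_at_finite_n(genus, punctures),
          coupling_weight(0.1, genus, punctures))
```

At tree level ($h = 0$) no modulus is discrete, which matches the statement that finite N instantons reproduce tree amplitudes exactly.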

After this historical reconstruction a clarification is in order. A thorough discussion of the 
relation between Yang-Mills theories and strings will be carried out here for SYM theories in 2d 
with a large amount of supersymmetry. However, Riemannian instantons exist also with less or no
supersymmetry at all in Yang-Mills theories in 4 or higher dimensions. Although in these cases 
the path integral impact of such configurations is not yet clear and, anyway, more complicated to 
evaluate, they represent an intriguing and not yet studied aspect of Yang-Mills theories. To stress 
this point we shall start in section 2 with the example of Riemannian instantons in 4d Yang-Mills 
theory. 

The purpose of this review is to provide a pedagogical introduction and synoptic view of the 
above problems and results, which otherwise can be found scattered in various papers. The paper is 
organized as follows. In the next section we discuss a simple but significant example of Riemannian 





instanton in Yang-Mills theory. Section 3 is a very sketchy introduction to type IIA theory, 
M theory and Matrix theory. Section 4 contains a formulation of MST. In section 5 it is explained 
how to construct the most general Riemannian instanton in MST. In section 6 it is shown 
that MST in the background of a given classical BPS instanton, characterized by a given Riemann 
surface, reduces to the Green-Schwarz superstring theory plus a decoupled Maxwell theory, and 
that the leading term of the amplitude in such background is proportional to $g_s^{-\chi}$, $\chi$ being the 
Euler characteristic of the Riemann surface in question. Section 7 is devoted to plane curves, with 
particular attention to singularities and their meaning. Finally in section 8 we concentrate on 
the moduli space of stringy instantons and describe the above-mentioned discretization. Two 
Appendices are devoted to the most technical aspects of this review. 


2 Riemannian instantons in Yang-Mills theories 


Let us consider, as an example, a pure Euclidean Yang-Mills theory in 4d with gauge group U(N). 
Let A be the gauge connection form with curvature F. We use the hermitean convention for the Lie 
algebra-valued matrices so that the covariant derivative is $D = d + igA$, where g is the Yang-Mills 
coupling. We concentrate on the self-duality condition for the Yang-Mills field strength 

$$F = *F .$$

Rewritten in complex coordinates $w = \frac{1}{2}(x^1 + ix^2)$ and $v$, the latter formed analogously from $x^3$ and $x^4$, 
together with their conjugates $\bar w$, $\bar v$, it becomes 

$$F_{w\bar w} + F_{v\bar v} = 0, \qquad F_{wv} = 0 = F_{\bar w\bar v} \qquad (1)$$

To conform with the convention of MST, which will be used in the following sections, we suppose 
that the coordinates $x^1, x^2$ span an infinite cylinder (precisely, $x^2$ is the periodic coordinate). We 
want to single out solutions of these equations which are independent of $v, \bar v$. Then, introducing 
the notation $X \equiv A_{\bar v}$, $\bar X \equiv A_v = X^\dagger$, (1) becomes 

$$F_{w\bar w} - ig[X,\bar X] = 0, \qquad D_{\bar w}X = 0, \quad D_w\bar X = 0 \qquad (2)$$


We refer to these equations as Riemannian instanton equations or, simply, instanton equations. 

As an introductory example let us consider the case in which the gauge group is U(2) (N = 2). 
We look for a couple (A, X) that satisfies eqs. (2). To this end we choose the following ansatz 

$$X = Y^{-1}MY, \qquad (3)$$

where Y is a suitable matrix $\in SL(2,C)$, and M is the following 2 x 2 matrix 

$$M = \begin{pmatrix} 0 & a \\ 1 & 0 \end{pmatrix} \qquad (4)$$

where a is a function of the point in the w cylinder. As a consequence of the equation $D_{\bar w}X = 0$, 
it follows that $\partial_{\bar w}a = 0$, i.e. a is holomorphic in w (except perhaps at infinity on the cylinder). 
Now, given such a holomorphic a we want to find Y so that (2) is satisfied. We parametrize Y as 
follows: 


Y = 






where u is a function to be determined and p — ^ + ^Zn|a|. Then using (3) we find 


( 5 ) 






Now it is easy to verify that the first equation in ( 2 ) implies 


(7) 


Inserting the explicit form of p and the change of variable in C> s.t. one can rewrite 

( 7 ) as 


TT 

di;d^u - 25^sinh u = -~S(a)(d(a)(d^a) . 


( 8 ) 


where the derivatives are understood in the sense of complex distribution theory. This is the 
sinh-Gordon equation with delta-function-like boundary conditions at the points where a vanishes. If 
u is a smooth solution of this equation, the couple (X, A) is a solution of (2) which is smooth 
everywhere except perhaps at infinity on the cylinder. As we will see later such solutions exist; 
therefore solutions of the instanton equations do exist. However, before we go into more detail on 
the existence of solutions, it is important to discuss their meaning. 

There are two distinct ingredients in ( 3 ); one is the matrix M (the core) and the other is the 
group theoretical element Y. We discuss them in turn. 


2.1 Branched coverings 

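The matrix M represents a branched covering of the cylinder spanned by the coordinate w. In 
order to see this we diagonalize it by means of a matrix $S \in SL(2,C)$: 

$$M = S\hat M S^{-1}, \qquad \hat M = \begin{pmatrix} \sqrt{a} & 0 \\ 0 & -\sqrt{a} \end{pmatrix}, \qquad S = \frac{1}{\sqrt{2}}\begin{pmatrix} a^{1/4} & -a^{1/4} \\ a^{-1/4} & a^{-1/4} \end{pmatrix} \qquad (9)$$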


It is convenient now to pass to a new coordinate $z = e^w$, which maps the cylinder into the complex 
z-plane with two punctures at $z = 0$ and $z = \infty$. The two eigenvalues of M (or, equivalently, of 
X), which are the two roots of the algebraic equation $X^2 = a$, can be thought of as the sheets of 
a double covering of the cylinder. Each sheet is a copy of the complex z-plane. Each point on 
each sheet projects to the corresponding point on the z-plane. Such projection will be denoted $\pi$. 
Suppose, for simplicity, that $a = z - z_0$. Then we have a branch point at $z = z_0$ and another at 
$z = \infty$. We can draw a cut between these two branch points (for definiteness the cut will lie in the 
region $|z| > |z_0|$). The two sheets are connected through the cut. What we mean by this, as is well 
known, is that if we consider going around the origin in the z-plane along a small circle of radius 
$< |z_0|$, we produce an inverse image (under $\pi$) on the covering formed by two small circles around 
the origin, one for each sheet. But if we do the same operation for a circle with radius $> |z_0|$, we 
are bound to cross the cut: crossing the cut multiplies the root by a phase so that $\sqrt{a}$ goes 
over to $-\sqrt{a}$ and vice versa, i.e. by crossing the cut we pass from one sheet to the other. This 
means that the counterimage of the circle on the covering this time is a long circle that extends 
over both sheets. 
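
The exchange of sheets can be checked numerically. The following is only an illustrative sketch: the function name `sheet_monodromy`, the sample value $z_0 = 1$ and the step count are assumptions made here. It follows the two roots of $y^2 = z - z_0$ continuously along a circle $|z| = r$ and records on which sheet each root ends after one full turn.

```python
import numpy as np


def sheet_monodromy(z0: complex, radius: float, steps: int = 2000) -> np.ndarray:
    """Follow y = +/- sqrt(z - z0) continuously along |z| = radius.

    Returns the permutation of the two sheets after one full turn:
    entry i is the starting sheet on which sheet i ends.
    """
    thetas = np.linspace(0.0, 2.0 * np.pi, steps + 1)
    root = np.sqrt(complex(radius) - z0)       # value on sheet 0 at theta = 0
    start = np.array([root, -root])
    current = start.copy()
    for theta in thetas[1:]:
        a = radius * np.exp(1j * theta) - z0
        r = np.sqrt(a)                         # principal root, sign fixed below
        # continuity: pick the sign closest to the previous value on sheet 0
        if abs(r - current[0]) > abs(-r - current[0]):
            r = -r
        current = np.array([r, -r])
    return np.array([int(np.argmin(np.abs(start - c))) for c in current])


small = sheet_monodromy(z0=1.0, radius=0.5)    # circle inside |z0|
large = sheet_monodromy(z0=1.0, radius=2.0)    # circle enclosing z0
print(small, large)
```

The small circle returns each root to its own sheet, while the large circle exchanges them, which is the numerical counterpart of the "long circle" extending over both sheets.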

This result can be interpreted in two ways: a geometrical and a string theoretical one (although 
the latter is a bit premature at this stage). Geometrically what we have just described is a 
Riemann surface represented by a branched covering of the complex z-plane. The Riemann surface 
in question has genus 0 and three punctures (see fig. 1; in all the figures of this paper Riemann 
surfaces are represented with finite boundaries in order to better suggest the string interpretation; 
however these boundaries are all asymptotic, they correspond to $z = 0$ or $z = \infty$, so it is more 
appropriate to think of them as punctures). 

The string interpretation is rather obvious. The coordinate $x^1$ is taken to be the Euclidean 
time. Therefore $z = 0$ corresponds to time $-\infty$ and $z = \infty$ to time $+\infty$. It is natural to interpret 
the counterimages of the small circle around the origin as two incoming strings. This configuration 
does not change if we enlarge the circle around $z = 0$ as long as its radius remains smaller than $|z_0|$. 
We say that the two strings propagate in time without interacting. If the radius of the circle in the 





Figure 1: Simple branch 


z-plane becomes larger than $|z_0|$, the counterimage becomes a unique closed curve extending over 
both sheets, and this configuration propagates unchanged as far as $\infty$. Therefore the instanton 
we are considering represents the joining of two strings to form one long string of double length. 
The joining interaction takes place at $z = z_0$. The inverse images under $\pi$ of $z = 0$ and $z = \infty$ 
correspond to the points of the Riemann surface where the incoming strings enter and the outgoing 
string exits, respectively. 

If there are more branch points besides the one considered above, it is not difficult to see how 
they may give rise to more complicated Riemann surfaces with handles (and more complicated 
string interactions). We will return later on to the problem of constructing more complicated 
Riemann surfaces and justifying the string interpretation outlined above. For the time being we 
would like to complete the description of our simple example by digging out from it any useful 
information. We notice that the passage through the cut can be described mathematically as the 
monodromy transformation 

^), A=(; J) (10) 

M is monodromy invariant since simultaneously S 

Another useful remark is the following: X can be rewritten as follows 

$$X = Y^{-1}MY = L^{-1}U\hat M U^{-1}L \qquad (11)$$

where 



It is a remarkable (and intentionally looked for) fact that U is a unitary matrix. U has the same 
monodromy as S: U 






Figure 2: Numerical solutions of eq. (8) for (a) small and (b) large g, respectively. 

2.2 The group theoretical factor 

Let us now pass to the analysis of the group theoretical factor Y = KL. We have just seen that 
K concurs with the diagonalizing matrix S to form a unitary matrix U. We notice however that 
K diverges at $z = z_0$. Let us analyze next the dressing factor L. This amounts to returning to 
the solutions of eq. (8). The problem is to find a solution u that satisfies the sinh-Gordon equation 
with the boundary condition 


u ^ 



a ^ 0 . 


(13) 


It is thanks to this logarithmic singularity that the factors L and K, contained in Y, compensate 
for the singularities of each other, so that the resulting solution (X, A) is smooth. Let us suppose 
again, for simplicity, that $a = z - z_0$. We want to single out, for this simple case, solutions of (8) 
with the right asymptotic behavior, that is behaving like (13) at $z = z_0$ and vanishing at $z = 0, \infty$. 
First we rescale $\zeta \to \sqrt{2}\,g\,\zeta$ so that the sinh-Gordon equation takes the standard form 

$$\partial_\zeta\partial_{\bar\zeta}u - \sinh u = 0 . \qquad (14)$$

We do not know an exact solution of this equation satisfying the boundary condition (13) and 
vanishing at the origin and at infinity of the z-plane. We can therefore proceed in two ways: 
find a numerical solution or an approximate analytical one. The latter, being rather technical, is 
discussed in Appendix A. The conclusion there is that the solution u, in the strong coupling limit, 
shrinks around the branch point $z = z_0$, that is, it is practically zero everywhere except in a small 
neighborhood of this point where it has a spike-like behaviour. 

This is confirmed by the numerical solutions shown in fig. 2. 

We can easily extend the previous analysis to the case in which a contains several distinct 
zeroes. In the following we will simply assume that this is always the case, namely that in the 
strong coupling limit the dressing factor L tends very rapidly to 1 outside the branch points of the 
covering surface. 


2.3 Summary 

The purpose of this section was twofold: on the one hand, to show that in 4d Yang-Mills theory 
we have Riemannian instantons; on the other hand to work out an explicit simple example of such 
instantons. As for the first point it will become clear that Riemannian instantons, characterized 
by eqs. (2), exist in any Yang-Mills theory in dimensions $\geq 4$ and also in 2d provided there is 
adjoint matter in the theory. The explicit solutions for the instantons are the same in all cases, but 
their role in the path integral may differ from case to case. From the example we have worked 
out above we learn how to proceed in order to find the general solution. A few things should be 
retained because they will turn out to hold in general: a Riemannian instanton is made of two 




ingredients, the core constituted by a matrix M which defines a branched covering representing a 
Riemann surface with punctures, and a group theoretical factor. The latter splits into two factors: 
a dressing factor which tends to 1 in the strong coupling limit and the remnant, which concurs to 
form a unitary matrix when applied to M. 

Our next step is to see the astonishing consequences of this fact at work in MST. Before going 
into that however, we have to prepare the ground with a short summary of purely pedagogical 
character. 

3 IIA Superstring Theory and Matrix Theory 

This section is a short review of type IIA superstring theory, M theory and Matrix theory, whose 
main purpose is to facilitate the comprehension of MST by inserting it in the historical framework 
in which it was first formulated. 

Type IIA superstring theory can be formulated either in the Neveu-Schwarz-Ramond or in the 
Green-Schwarz (GS) formalism. For comparison with MST, the latter, in the light-cone gauge 
[7, 8], is the relevant one. In the light-cone gauge, out of ten fields $x^M(\sigma,\tau)$ ($M = 0,\ldots,9$), two are 
eliminated: defining $x^\pm = \frac{1}{\sqrt{2}}(x^0 \pm x^9)$, one sets $x^+$ proportional to $\tau$ to completely fix the gauge, 
while $x^-$ turns out to be completely fixed in terms of the transverse degrees of freedom. So the true 
degrees of freedom of the theory are the transverse ones, $x^i$ with $i = 1,\ldots,8$. To make a 
supersymmetric theory we have to add the fermionic part, which consists of two sets of 2D 
Majorana-Weyl fermions: $\theta_s$ and $\theta_c$. They transform according to the $8_s$ and $8_c$ representations 
of SO(8), respectively. The GS light-cone action is 

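$$S = -\frac{1}{2\pi}\int d\sigma\,d\tau\,\left(\partial_\mu x^i\,\partial^\mu x^i - i\,\bar\theta\rho^\mu\partial_\mu\theta\right), \qquad (15)$$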

where the integration range is $-\infty < \tau < \infty$, $0 < \sigma < \pi$. In (15) $\mu,\nu = 0,1$, and the 2D flat 
Minkowski metric $\eta_{\mu\nu}$ is taken to have signature $(-,+)$. 

Summation over the i index is understood. Moreover we have assembled $\theta_s$ and $\theta_c$ in a doublet 
$\theta$ so that $\theta^T = (\theta_s,\theta_c)$ and $\bar\theta = \theta^T\rho^0$. Finally $\rho^\mu$ are the 2D gamma matrices. 

The GS action (15) is invariant under two sets of rigid supersymmetry transformations, each 
characterized by 8 parameters. The massless spectrum of the GS superstring is the same as the 
spectrum of IIA supergravity in 10D, which constitutes its low energy effective theory. The bosonic 
massless fields include the dilaton $\phi$, the graviton $g_{MN}$ and the antisymmetric tensor field $B_{MN}$ 
(the NSNS sector) together with the one-form $A_M$ and the three-form $C_{LMN}$ fields (the RR sector). 
Type IIA theory contains other brane-like objects, besides strings. They appear as macroscopic 
(soliton-like) solutions of IIA supergravity and their world-volume couples to one of the potentials 
in the IIA supergravity theory. Therefore we have D0-branes and D2-branes, which couple to A 
and C, respectively, and D6-branes and D4-branes, which couple to their dual potentials. There 
is also a NS five-brane which couples to the potential dual to B. Our attention in the following will 
concentrate on fundamental strings and D0-branes. 

Strings are the fundamental objects of type IIA theory, while, according to Matrix Theory, 
D0-branes are the fundamental constituents of M-theory. M-theory is closely related to IIA theory 
as follows. The type IIA string coupling $g_s$, i.e. the exponential of the vacuum expectation value 
of the dilaton, can be used as an expansion parameter when $g_s$ is small - this is the region we 
will be concerned with in the following sections. When $g_s$ becomes large the representation of IIA 
theory in terms of strings becomes unreliable. In fact as $g_s \to \infty$ a radical change takes place 
in the theory: a new dimension opens up. This dimension is wound on a circle whose radius R 
in suitable units is given by $g_s$; when $g_s$ is small the additional dimension is dormant, when $g_s$ 
is large it cannot be disregarded anymore. The new theory one lands on is eleven dimensional. 
This cannot be a string theory, because superstrings do not consistently propagate in 11D. It is an 
entirely new theory, M theory. 

The low energy effective field theory of M theory is eleven dimensional supergravity. Its modes 
are the metric G, a 3-form tensor A and the gravitino, altogether 128 bosonic and 128 fermionic 





degrees of freedom. M theory contains two types of soliton-like solutions, membranes and 5-branes. 
It is easy to see how the circle compactification maps the degrees of freedom of M theory into those 
of type IIA theory. For example, 5-branes of M theory generate NS 5-branes and D4-branes of 
IIA, membranes in M theory originate fundamental strings and D2-branes of IIA. In parallel, we 
can identify $G_{11,11}$, $G_{MN}$, $A_{MN,11}$ and $A_{LMN}$ with $\phi$, $g_{MN}$, $B_{MN}$ and $C_{LMN}$, respectively. Most 
important for us is however the origin of D0-branes and the potential $A_M$ coupled to them. The 
latter is identified with $G_{11,M}$ and the D0-branes with the massive Kaluza-Klein (KK) modes of 
mass $M = 1/g_s \sim 1/R$. Actually the KK modes of 11D supergravity have mass $n/g_s$ for any 
integer n. The case n = 0 is not a D0-brane, but just the direct massless descendants of 11D 
supergravity. When n > 1 the KK modes are interpreted as bound states at threshold of the 
elementary KK modes with n = 1. The integer n is identified with the D0-brane charge; negative 
n's label anti-D0-branes. Each such massive KK mode comes in a supermultiplet of 256 physical 
components. 

For M theory there is not yet a handy formulation, similar, for example, to the one IIA theory 
has in terms of strings. At present there is a very radical proposal, still at an early stage of 
development: Matrix Theory (see [24, 25, 26, 27]). The basic objects of Matrix Theory are supergravitons, 
i.e. supermultiplets with 256 degrees of freedom to be identified with the D0-branes of IIA theory. 
To understand the Matrix Theory proposal we need to introduce preliminarily two new issues. 

The first is the effective action for systems of N D-branes. D-branes are classically submanifolds 
of the target space where open strings can end (and therefore satisfy corresponding Dirichlet 
boundary conditions). The self-interaction and the mutual interactions of a system of D-branes 
are therefore representable in terms of string theory. The request of conformal invariance provides 
equations of motion for the D-brane fields that can be integrated and lead to a Born-Infeld type 
action. When the ambient metric is flat and the D-branes are almost flat (for D0-branes this means 
that their velocities are small compared to the velocity of light, which implies a non-relativistic 
framework) this action is well approximated by a super-Yang-Mills (SYM) action, obtained by 
simply reducing 10D SYM to the dimensions of the D-brane world-volume. In this non-relativistic 
approximation the effective action appropriate for a system of N D0-branes is therefore 10D SYM 
reduced to one (time) dimension: 

300 = j dTTr (J^iDoXP)^ - iB'^Dad + X^]^ + e'^P[Xp, ^]) (16) 

where $p = 1,\ldots,9$, $X^p$ are N x N Hermitean matrices, $\Gamma^p$ are 16 x 16 gamma matrices of SO(9), 
$\theta$ are N x N matrices whose entries are 1D spinors. Moreover $D_0 = \partial_0 - [A_0,\,\cdot\,]$. The quartic 
potential implies that, when D0-branes are far away from one another, the dominating configurations 
are represented by diagonal $X^p$, the eigenvalues of $X^p$ representing the distances among the 
various branes. When the D0-branes are nearby the non-diagonal terms become relevant and we 
are faced with a non-commutative space structure. 

The second issue is the infinite momentum frame (IMF). Imagine a system of point particles 
with momenta $p^a$, where a labels the particles, and suppose we boost this system to very high 
velocity in a given direction. A simple calculation shows that the energy of the system takes the 
approximate form 

$$E \approx \sum_a\left(p^a_\parallel + \frac{(p^a_\perp)^2 + m_a^2}{2\,p^a_\parallel}\right) \qquad (17)$$

where the labels $\parallel$ and $\perp$ denote the momentum components parallel and perpendicular to the boost 
direction, respectively. Eq. (17) has a non-relativistic form. 
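
The "simple calculation" is the large-$p_\parallel$ expansion of the relativistic energy of each particle. The sketch below compares the exact single-particle energy with the infinite momentum frame form; the function names and the sample momenta are illustrative assumptions, and units with c = 1 are used.

```python
import math


def exact_energy(p_par: float, p_perp: float, mass: float) -> float:
    """Relativistic energy sqrt(p_par^2 + p_perp^2 + m^2), with c = 1."""
    return math.sqrt(p_par**2 + p_perp**2 + mass**2)


def imf_energy(p_par: float, p_perp: float, mass: float) -> float:
    """Leading large-p_par expansion: p_par + (p_perp^2 + m^2) / (2 p_par)."""
    return p_par + (p_perp**2 + mass**2) / (2.0 * p_par)


for p_par in (10.0, 100.0, 1000.0):
    exact = exact_energy(p_par, p_perp=1.0, mass=1.0)
    approx = imf_energy(p_par, p_perp=1.0, mass=1.0)
    print(f"p_par={p_par:7.1f}  exact={exact:.8f}  approx={approx:.8f}  diff={approx - exact:.2e}")
```

The difference falls off like $1/p_\parallel^3$, so the transverse dynamics is governed by a kinetic term of non-relativistic form in which $p_\parallel$ plays the role of the mass.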

Now let us consider a system of D0-branes and, with reference to (17), identify the transverse 
directions of this system with the nine space directions discussed above, while the boost direction 
is identified with the positive 11th dimension of M theory (so $p_\parallel = p_{11}$ and so on). When this 
system is boosted in the 11th direction we reach an effective non-relativistic regime. The idea of [9] 
is to assume that the nonrelativistic action (16) is an adequate description for such a system of N 





D0-branes boosted to infinite momentum. This would seem at first to be a proposal with limited 
scope. In fact an infinite boost in the positive 11th direction would seem to exclude from the 
game the D0-branes with n < 0. However, relying on examples in field theory, the authors of [9] 
suggest that the information that seems to get lost in the infinite momentum frame can actually be 
retrieved from (16). The Matrix Theory proposal is therefore that the action (16) fully represents 
M theory in the large N limit. The missing dimension in the previous counting is to be retrieved 
from such a limit. In fact a system of N D0-branes, as the one described by (16), has $p_{11} = N/R$ 
and the decompactification limit $R \to \infty$ requires $N \to \infty$ in some appropriate way. 

That Matrix Theory represents M theory has been confirmed in a number of ways. Although 
there is still a long way to go to transform Matrix Theory into an efficient tool for computations, 
no contradiction has been found thus far with this hypothesis. MST can be thought of precisely as 
one of the most important confirmations of Matrix Theory. MST, first proposed in [13], is obtained 
in the following way. One first compactifies Matrix Theory along the 9th direction on a circle 
of radius $R_9$ and ends up with a SYM theory in 1+1 dimensions. This operation is technically 
rather complicated, see [12], and will not be repeated here. After this operation one is left with 
two compactification radii, $R = R_{11}$ along the 11th direction and $R_9$. One then exchanges the 9th 
and 11th direction. In the Matrix Theory limit $N \to \infty$, one recovers the appropriate setting for 
IIA theory: ten uncompactified dimensions plus one compactified along the circle of radius $R_9$, 
which has become the 11th direction of M theory [13, 16]; it is this radius which is expected to 
correspond to $g_s$. This is what is called MST. It is expected to represent a nonperturbative version 
of type IIA theory. 

4 Matrix String Theory 

4.1 Minkowski version 

The MST is defined by the following U(N) SYM model in a 1+1 Minkowski space, specified by 
the action 

5 = -+ y d<rdT TV (d^X^D'^X* + - ^[X\ X^f - iSp^^D^d - 6]) , 

^ (18) 
where g is the gauge coupling, $\sigma$ and $\tau$ are the world-sheet coordinates on the cylinder. $X^i$ with 
$i = 1,\ldots,8$ are hermitean N x N matrices and $D_\mu X^i = \partial_\mu X^i + i[A_\mu, X^i]$. $F_{\mu\nu}$ is the curvature 
of $A_\mu$. $\theta$ represents 2 N x N matrices whose entries are simultaneously 2D and SO(8) spinors. It 
can be written as $\theta^T = (\theta_s^-,\theta_c^+)$, where $\pm$ denotes the 2D chirality and $\theta_s$, $\theta_c$ are spinors in the $8_s$ 
and $8_c$ representations of SO(8), while $^T$ represents the 2D transposition. $\rho^\mu$ are again the 2D 
gamma matrices: $\{\rho^\mu,\rho^\nu\} = -2\eta^{\mu\nu}$, and $\bar\theta = \theta^T\rho^0$. The matrices $\Gamma_i$ are the 16 x 16 SO(8) gamma 
matrices. The remaining conventions are as in (15). 

For definiteness we will write the matrices $\rho^\mu$ and $\Gamma_i$ in the form 



and $\gamma_i$, $\tilde\gamma_i$ are the same as in Appendix 5B of [7]. 

The action (18) is invariant under the supersymmetric transformations 

<5x* = -ere 
9 

68 = - '-[X\X^]Tije - ^p^D^X^p^Tie 

zg ^ 9 

5A^j, = —iep^9 , 

where = {cs, Cc) are 8+8 transformation parameters. 


( 20 ) 





4.2 Euclidean version 

We make a Wick rotation and introduce the complex coordinates 


$$w = \frac{1}{2}(\tau + i\sigma), \qquad \bar w = \frac{1}{2}(\tau - i\sigma), \qquad A_w = A_0 - iA_1, \qquad A_{\bar w} = A_0 + iA_1 .$$


The action becomes 

+ i{BsDis,6, + BcD^Bc) + i99'^Ti[X\ 0 ]) , ( 21 ) 

where C is the infinite Euclidean cylinder spanned by w. The supersymmetry transformations take 
the form 

6X' = -{e-^% + €tfe,) 

9 

Sds = + \[X\X^iii)t- - \d^XH4 

2g^ 2 g 

sec = - ^DicX%e: 

SA^ = -2€jes, SAin = -2et6c , ( 22 ) 

where 

7ii = ^(7i7i - 7i7i), lij = ^(7i7j “ 7j7i) • 

A string interpretation, which is what we want to arrive at, is more natural after the coordinate 
transformation, already considered above, $w \to z = e^w$, i.e. after passing from the cylinder to the 
complex plane with the origin deleted, i.e. $C^*$, or the Riemann sphere $CP^1$ with two punctures. 


4.3 The string interpretation 

Let us see how a simple string interpretation arises if we take a naive strong coupling limit in (21). 
We rescale $A \to gA$ and imagine the fields represent small oscillations about a trivial background. 
The naive strong coupling limit ($g \to \infty$) in the action tells us that all fields commute, therefore 
they can be simultaneously diagonalized. The formal aspect of the resulting theory is that of the 
Green-Schwarz superstring theory in the light-cone approach, (15); however there is a significant 
variant. The fields are the eigenvalues of $X^i$ and $\theta$. Therefore for each field of the Green-Schwarz 
theory we have here N fields. We will see later on a completely satisfactory interpretation of 
this fact. For the time being let us notice that the theory becomes a free theory of the diagonal 
degrees of freedom. The gauge freedom can be completely fixed up to a residual gauge invariance 
which takes the form of the Weyl group, i.e. the permutation group of N elements. We are 
therefore allowed to fix the boundary conditions for the diagonal degrees of freedom up to this 
residual gauge transformation. On this basis these degrees of freedom lend themselves naturally 
to an interpretation as free strings of various lengths. For example, let the eigenvalues of $X^i$ be 
$X^i = \mathrm{Diag}(x^i_1,\ldots,x^i_N)$ and let us consider the effect on such configuration of the element 




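$$P = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix} \qquad (23)$$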





of the Weyl group. The boundary condition $X^i(2\pi) = PX^i(0)P^{-1}$ implies that $x^i_k(2\pi) = x^i_{k-1}(0)$ 
(with $x^i_0 \equiv x^i_N$), and so the $x^i_k$ form a unique long string of length $2\pi N$. Of course there are many 
other possible configurations, besides the one determined by (23), with strings of any length, each 
corresponding to a conjugacy class of the Weyl group [16]. 
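
The way a Weyl group element glues eigenvalues into strings can be illustrated with a few lines of NumPy. This is only a sketch: the helper names `cyclic_shift` and `string_lengths`, the choice N = 4 and the sample eigenvalues are assumptions made here. The cyclic shift has ones below the diagonal and in the top-right corner, as in (23), and the cycle lengths of a permutation are read as string lengths in units of $2\pi$.

```python
import numpy as np


def cyclic_shift(N: int) -> np.ndarray:
    """N x N permutation matrix with ones below the diagonal and in the top-right corner."""
    return np.roll(np.eye(N, dtype=int), 1, axis=0)


def string_lengths(perm_matrix: np.ndarray) -> list[int]:
    """Cycle lengths of a permutation matrix, interpreted as string lengths in units of 2*pi."""
    N = perm_matrix.shape[0]
    # image[k] is the row holding the 1 of column k, i.e. where index k is sent
    image = [int(np.argmax(perm_matrix[:, k])) for k in range(N)]
    seen, lengths = set(), []
    for start in range(N):
        if start in seen:
            continue
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = image[k]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


N = 4
P = cyclic_shift(N)
x0 = np.diag([10, 20, 30, 40])     # sample eigenvalues x_1..x_N at sigma = 0
x_2pi = P @ x0 @ P.T               # P^{-1} = P^T for a permutation matrix
Q = np.eye(N, dtype=int)[[1, 0, 2, 3]]   # a transposition: one string of length 2, two of length 1
print(np.diag(x_2pi), string_lengths(P), string_lengths(Q))
```

The diagonal of `x_2pi` is the cyclically shifted list of eigenvalues, so after N turns every eigenvalue returns to its starting value: a single string of length $2\pi N$.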

The length of a string, in the above sense, has a precise physical meaning. Recall that the 
correspondence with the Green-Schwarz theory is established in the light-cone gauge. In this 
framework the length of a string is identified with the $p^+ = \frac{1}{\sqrt{2}}(p^0 + p^9)$ component of the string 
center of mass momentum, where 0, 9 are the time and longitudinal direction, which do not appear 
explicitly in (18). 

The naive strong coupling limit of (18) is therefore a free theory of closed superstrings of various 
lengths. This will be confirmed by a more careful analysis later on. For the time being we note 
that, if the limit we have described is the true strong coupling limit, we can interpret it as the 
weak coupling limit of (type II) string theory, i.e. if $g_s$ is the string coupling, $g_s \sim g^{-1}$ [10, 13]. 

5 Riemannian Instantons and their construction 

The string interpretation of the previous subsection will be confirmed by an analysis of the theory 
in the background of its instantons. Let us, preliminarily, describe such classical (supersymmetric) 
solutions. In fact we will look for classical Euclidean supersymmetric configurations that preserve 
half supersymmetry. To this end we set $\theta = 0$ and look for solutions of the equations $\delta\theta = 0$, i.e. 
from (22), 


= 0 , = 0 

+ \[X\Xi]^ii)t- = 0, D^X%e; = 0 . . . (24) 

Solutions of these equations that preserve half supersymmetry are the following ones. Set $X^i = 0$ 
for all i except two, for definiteness $X^i \neq 0$ for i = 1, 2; remark that $\gamma_{12}$ is an antisymmetric 8 x 8 
matrix, and $\gamma_{12}^2 = -1$, and therefore its eigenvalues are $\pm i$ (moreover $\tilde\gamma_{12} = \gamma_{12}$). It is easy to 
show that there exist $\epsilon^+$ and $\epsilon^-$, each with four independent components, such that 

7i2€^ = ±ie^, 7i€'^ = -n2€^, 7ie~ = . 

Now it is convenient to introduce the complex notation $X = X^1 - iX^2$, $\bar X = X^1 + iX^2 = X^\dagger$. 
Then the conditions to be satisfied in order to preserve half supersymmetry are 

$$F_{w\bar w} - ig^2[X,\bar X] = 0 \qquad (25)$$

$$D_{\bar w}X = 0, \qquad D_w\bar X = 0. \qquad (26)$$

These are the same equations as (2), but they have been obtained in the context of MST. The 
solutions of these equations preserve half supersymmetry. For this reason they are also referred to 
as BPS instantons. 

It is easy to verify that, if non-trivial solutions to such equations exist, they satisfy the equations 
of motion of the action (21). The normalized$^1$ instanton action is the action with $\theta = 0$, $X^i = 0$ 
for $i = 3,\ldots,8$ 

Sinst = -^ I (-XD^,D^X - XD^D^X - + ^-[X, X]^) . (27) 

It is elementary to prove that Sinst vanishes in correspondence with smooth solutions of (25, 26). 

From a mathematical viewpoint, (25, 26) are easily seen to identify a Hitchin system [28, 29] 
on a sphere with two punctures. In such systems, F is the gauge curvature in reference to a gauge 


$^1$The action (27) is normalized so as to avoid boundary terms due to partial integration. 





vector bundle V, and X is the holomorphic section of the bundle $\mathrm{End}\,V \otimes K$, where K is the 
canonical line bundle over the base (which is trivial in our case). Hitchin systems have appeared 
both in the mathematical and the physical literature [30, 32, 31, 33]. 

One first remarkable property of the Hitchin systems is that they form an integrable system of 
equations. To see this let us show that they satisfy a zero curvature condition. Let us define the 
spectral connection 

$$\mathcal{A}_w = A_w + \lambda gX, \qquad \mathcal{A}_{\bar w} = A_{\bar w} - \frac{g}{\lambda}\bar X , \qquad (28)$$

where $\lambda$ is a spectral parameter. We can rewrite (2) as the zero curvature condition for the 
connection $\mathcal{A}$ 

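$$\mathcal{F}_{w\bar w} = \partial_w\mathcal{A}_{\bar w} - \partial_{\bar w}\mathcal{A}_w + i[\mathcal{A}_w,\mathcal{A}_{\bar w}] = \left(F_{w\bar w} - ig^2[X,\bar X]\right) - \lambda g\,D_{\bar w}X - \frac{g}{\lambda}\,D_w\bar X = 0 ,$$
for generic values of the spectral parameter. 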


5.1 General ansatz for Riemannian instantons 


In order to find solutions (A, X) of (25, 26), we follow and generalize the example N = 2 presented 
in section 2. We start from the analogous ansatz 

$$A_w = i\,\partial_wY^\dagger\,(Y^\dagger)^{-1}, \qquad X = Y^{-1}MY \qquad (29)$$

where Y is a generic element in the complex group SL(N, C) and M specifies a branched covering 
of the cylinder. As a consequence of (29) the equation $D_{\bar w}X = 0$ is equivalent to $\partial_{\bar w}M = 0$ or 

$$\partial_{\bar z}M = 0. \qquad (30)$$

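With this parametrization, the spectral connection becomes 

$$\mathcal{A}_w = i\,\partial_wY^\dagger\,(Y^\dagger)^{-1} + \lambda g\,Y^{-1}MY, \qquad \mathcal{A}_{\bar w} = -iY^{-1}\partial_{\bar w}Y - \frac{g}{\lambda}\,Y^\dagger M^\dagger (Y^\dagger)^{-1} \qquad (31)$$

from which it is easy to extract the zero curvature equation and see that it is written in terms of 
$YY^\dagger$ only. 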

This parametrization is defined therefore in terms of two factors Y and M. As above, Y will 
be referred to as the group theoretical factor, while M defines a general branched covering of the 
cylinder. The factor Y will be discussed below, while branched coverings will be discussed at 
length in section 7. For the time being let us give some essential information. Let us consider the 
polynomial 

$$P_X(y) = \det(y - X) = y^N + \sum_{i=0}^{N-1} y^i a_i ,$$

where y is a complex indeterminate. The equation 

$$P_X(y) = 0 \qquad (32)$$

can also be written as the matrix equation 

$$X^N + a_{N-1}X^{N-1} + \ldots + a_0 = 0 . \qquad (33)$$

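A diagonalizable matrix, which is a solution of eq. (33), can always be cast in the canonical form 

$$M = \begin{pmatrix} -a_{N-1} & -a_{N-2} & \cdots & \cdots & -a_0 \\ 1 & 0 & \cdots & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix} \qquad (34)$$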





The case considered in section 2 is evidently a particular case of the above formulas. 
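
As a quick numerical illustration of the relation between the canonical form and the characteristic polynomial, the sketch below builds the matrix of (34) for constant sample coefficients and checks that its characteristic polynomial and the matrix equation (33) come out as stated. The helper name `companion` and the sample cubic are assumptions made here; in the text the $a_i$ are functions on the cylinder, whereas constants are used in this check.

```python
import numpy as np


def companion(coeffs_low_to_high):
    """Matrix in the form of (34) for y^N + a_{N-1} y^{N-1} + ... + a_0.

    Input is [a_0, ..., a_{N-1}]; the first row is (-a_{N-1}, ..., -a_0)
    and the subdiagonal is filled with ones.
    """
    a = np.asarray(coeffs_low_to_high, dtype=complex)
    N = a.size
    M = np.zeros((N, N), dtype=complex)
    M[0, :] = -a[::-1]
    M[np.arange(1, N), np.arange(N - 1)] = 1.0
    return M


a = [6.0, -5.0, -2.0]              # y^3 - 2 y^2 - 5 y + 6 = (y - 1)(y + 2)(y - 3)
M = companion(a)
char_poly = np.poly(M)             # coefficients of det(y - M), highest power first
residual = np.linalg.matrix_power(M, 3) + a[2] * (M @ M) + a[1] * M + a[0] * np.eye(3)
sheets = np.sort(np.linalg.eigvals(M).real)
print(char_poly.real, sheets)
```

With N = 2, $a_1 = 0$ and $a_0 = -a$ the same construction gives a first row (0, a) and second row (1, 0), whose square is a times the identity, in line with the equation $X^2 = a$ of section 2.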

Due to (30), we have $\partial_{\bar z}a_i = 0$, which means that the functions $a_i$ are analytic in the 
complex plane, although they are allowed to have poles at $z = 0$ and $z = \infty$. The point is that, as 
we shall see in many examples, Eq. (32) identifies in the (z, y) space a Riemann surface $\Sigma$, which is 
an N-sheeted branched covering of the cylinder. Generalizing what has been said in section 2, the 
explicit form of the covering is given by the set of eigenvalues of X. Each 
eigenvalue spans a sheet. The projection map to the base cylinder C will be denoted $\pi : \Sigma \to C$. 
The points where two or more eigenvalues coincide are called branch points. The identification 
cuts in the sheets start or end at these points (which may include 0 and $\infty$). We stress that the 
covering is independent of the coupling g. 

The Riemann surface $\Sigma$ will be characterized by the genus h and by a certain number n of 
punctures. Each puncture will come associated with an integer, the length of the string entering or 
exiting at that puncture. Moreover each surface is characterized by certain numbers, the moduli. 
All this information is stored in the analytic functions $a_i$ and will be worked out in sections 7 and 
8. 

5.2 Construction of instanton solutions 

Each solution of (25), (26) consists of two parts: a branched covering of the cylinder via the relative 
X characteristic polynomial and a group theoretical factor. The aim of the present subsection is 
to generalize the example of section 2, by constructing the instanton solutions corresponding to 
the most general covering. 

Let us recall our ansatz (29). The group theoretical factor Y takes values in the complex group 
SL(N, C), while the matrix M determines the branched covering. The dependence on the Yang-Mills 
coupling constant g is contained in the Y factor, while M does not depend on g. In section 2 
we have shown an example in which Y = KL where L, the dressing factor, tends to 1 in the strong 
coupling limit outside the string interaction points, while K is a special matrix, independent of g, 
endowed with the property that $K^{-1}MK$ and its hermitean conjugate are simultaneously diagonalizable. 

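We will proceed in a parallel way also in the general case. It is well known [17] that the matrix
$M$ can be diagonalized,

M = S \hat M S^{-1}, \qquad \hat M = {\rm Diag}(\lambda_1, \ldots, \lambda_N) (35)

by means of the following matrix $S \in SL(N,\mathbb{C})$:

S = \Delta^{-1/N} \begin{pmatrix}
\lambda_1^{N-1} & \lambda_2^{N-1} & \cdots & \lambda_N^{N-1} \\
\lambda_1^{N-2} & \lambda_2^{N-2} & \cdots & \lambda_N^{N-2} \\
\vdots & \vdots & & \vdots \\
1 & 1 & \cdots & 1
\end{pmatrix} (36)

where

\Delta = \prod_{1 \le i < j \le N} (\lambda_i - \lambda_j) (37)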

We notice first that the role of $a$ in section 2 is now played by $\Delta$, which vanishes whenever two
eigenvalues coincide. Two coinciding eigenvalues define a branch point of the covering. We will
return to this issue later on. For the time being let us notice that, as in section 2, going around
a branch point in the complex $z$-plane produces a reshuffling of the eigenvalues that can be
represented via a monodromy matrix $\Lambda$: $\hat M \to \Lambda \hat M \Lambda^{-1}$, where $\hat M$ is the diagonal matrix of
eigenvalues in (35). Correspondingly we have $S \to S\Lambda^{-1}$, so
that the single-valuedness of $M$ is preserved.
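
The diagonalization (35) and the behaviour under a reshuffling of the eigenvalues can be checked numerically. The following is a minimal sketch in exact rational arithmetic; it assumes the companion form (34) and the Vandermonde-type matrix of (36) without the overall normalization $\Delta^{-1/N}$ (which drops out of the eigenvector check). All function names and the sample eigenvalues are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import combinations

def coefficients_from_eigenvalues(lams):
    """Return [a_0, ..., a_{N-1}] with prod_i (y - lam_i) = y^N + a_{N-1} y^{N-1} + ... + a_0."""
    coeffs = [Fraction(1)]                                   # increasing powers of y
    for lam in lams:
        shifted = [Fraction(0)] + coeffs                     # y * p(y)
        scaled = [lam * c for c in coeffs] + [Fraction(0)]   # lam * p(y)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs[:-1]                                       # drop the leading coefficient 1

def companion(a):
    """Canonical form: first row (-a_{N-1}, ..., -a_0), ones just below the diagonal."""
    n = len(a)
    m = [[Fraction(0)] * n for _ in range(n)]
    m[0] = [-a[n - 1 - j] for j in range(n)]
    for i in range(1, n):
        m[i][i - 1] = Fraction(1)
    return m

def vandermonde(lams):
    """Unnormalized S: column j is (lam_j^{N-1}, ..., lam_j, 1)."""
    n = len(lams)
    return [[Fraction(lam) ** (n - 1 - i) for lam in lams] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small N)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def delta(lams):
    """Product over i < j of (lam_i - lam_j), as in (37)."""
    out = Fraction(1)
    for i, j in combinations(range(len(lams)), 2):
        out *= lams[i] - lams[j]
    return out

def diagonal(lams):
    n = len(lams)
    return [[Fraction(lams[i]) if i == j else Fraction(0) for j in range(n)] for i in range(n)]

lams = [1, 2, 4]                       # sample distinct eigenvalues (hypothetical)
M = companion(coefficients_from_eigenvalues(lams))
V = vandermonde(lams)
assert matmul(M, V) == matmul(V, diagonal(lams))   # M S = S M_hat
assert det(V) == delta(lams)                       # hence det(Delta^{-1/N} V) = 1

# Exchanging two eigenvalues permutes the columns of S but leaves M untouched,
# because M depends only on the symmetric functions a_i of the eigenvalues.
swapped = [2, 1, 4]
assert companion(coefficients_from_eigenvalues(swapped)) == M
assert det(vandermonde(swapped)) == -det(V)
```

The last two checks illustrate why $M$ stays single-valued while $S$ does not: the reshuffling acts on the columns of $S$ only, and $\Delta$ changes sign under a single exchange.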

The construction of $K$ and $L$ in the general case is subtle and technically rather complicated.
The explicit construction of $K$ and $L$ is given in [21] and will not be reported here. However, on
the basis of the example of section 2, it is not difficult to understand the general features of these
matrices.





First one introduces a monodromy-invariant matrix $K$ such that $K^{-1}S = U$ is unitary. As it turns
out, $K$ may have singularities at the points of $C$ where any two eigenvalues of $M$ coincide, i.e.
at the branch points of the spectral covering (the elements of $K$ contain as factors fractional
powers of $|\Delta|$, much in the same way as $K$ in (5) contains fractional powers of $|a|$). Therefore
$K^{-1}MK$ is in general singular at these points. That is why we must introduce into the game a
new monodromy-invariant matrix $L$, with the purpose of canceling the singularities of $K^{-1}MK$ in
such a way that $L^{-1}K^{-1}MKL$ is smooth and satisfies (25), (26). The entries of $L$ can be taken to
be generalizations of the $u$ field of section 2. Let us denote again by $u$ any such field. For (25), (26)
to be satisfied these fields $u$ must satisfy [21] an equation of the WZNW type with the following
general structure

\partial_w \partial_{\bar w} u + \ldots \sim \partial_w \partial_{\bar w} \ln|\Delta| (38)

where dots represent all the other terms, which are irrelevant in the cancellation of singularities.
Let us refer to these equations as the 'dressing equations'. On the right-hand side we see the typical
delta-function-type source which characterizes them. The sources are point-like and located at the
zeroes of $\Delta$, that is at the branch points of the covering.

By construction $K$ is independent of $g$ while $L$ does depend on $g$. One can show that in fact
$L \to 1$ as $g \to \infty$. Let us present a simple argument in this sense.

The solution $X$ exists with the required properties only if the 'dressing equations' admit
solutions that vanish at $w = \pm\infty$. To the best of our knowledge, not much is known in the literature
concerning the existence of such solutions. Based on the analysis of [20], we assume that the
'dressing equations' do admit solutions that vanish at $w = \pm\infty$. Once one assumes this, it is
rather easy to argue, on completely general grounds, that in the strong coupling limit, $g \to \infty$,
such solutions vanish outside the zeroes of the discriminant. The argument goes as follows. Consider
a candidate solution of (25) in which $u = 0$ outside the zeroes of the discriminant, for all the
$u$'s. Then, there, $L = 1$ and $X = K^{-1}MK$. As noted previously, in such a situation $[X,\bar X] = 0$,
since both $X$ and $\bar X$ are simultaneously diagonalized by the matrix $U = K^{-1}S$. Now we have to
show that $F_{w\bar w}$ also vanishes outside the zeroes of the discriminant if $L = 1$. In fact, when $L = 1$
the connection reduces to a pure gauge built from $U = K^{-1}S$ plus a term proportional to $\partial_{\bar w} S$.
But $\partial_{\bar w} S = 0$ due to holomorphicity of the eigenvalues of $M$. In conclusion
(25) is identically satisfied by the ansatz $L = 1$ outside the zeroes of the discriminant. Since the
solutions are uniquely determined by their boundary conditions, we can conclude that, as $g \to \infty$,
the only solution of the dressing equations outside the zeroes of the discriminant is the identically
vanishing solution. We infer from this argument that the solutions of the dressing equations for
large $g$ are concentrated around the branch points and become more and more spiky as $g$ grows
larger and larger. Therefore the matrix $L$ has the properties we expect.

The previous argument hinges on the fact that at $g = \infty$ we have both $[X, \bar X] = 0$ and
$F_{w\bar w} = 0$. Other types of solutions can be envisaged, see [21]. However here we limit ourselves
precisely to those solutions of the Hitchin equations (25), (26) for which at $g = \infty$ we have both
$[X, \bar X] = 0$ and $F_{w\bar w} = 0$, and call them stringy instantons, since they represent interpolating
solutions between genuine initial and final string configurations.

To end this section let us emphasize the double 'miracle' of the above construction: we construct
an everywhere smooth solution by means of two non-smooth matrices $K$ and $L$, which are such
that on the one hand $L \to 1$ as $g \to \infty$ and on the other hand $K$ forms together with $S$ a unitary
matrix $U = K^{-1}S$, so that as $g \to \infty$, $X \to U\hat M U^{-1}$, where $\hat M$ is the diagonal matrix of
eigenvalues of $M$. This will be adequately appreciated in the next section.

5.3 Summary

In this section we have seen that, utilizing a very general recipe, an instanton solution can be
constructed in correspondence with the most general branched covering of the base cylinder, i.e.
with any Riemann surface with punctures that can be represented as a branched covering of the cylinder
we can associate an instanton. We have of course still to clarify how general these Riemann surfaces
are. This requires a more explicit analysis of branched coverings. Before we do that, however, we
believe it more useful to first see how stringy instantons can be used to compute physical
processes. This will clarify the physical setup of our discussion and which branched coverings we
really have to describe; moreover it will introduce a new input in the discussion: the moduli
space of stringy instantons.


6 MST on a Riemannian instanton background 

The aim of this section is to show how the above construction of stringy instantons can be
transformed into a calculational tool. In particular we would like to interpret an amplitude on a given
instanton background as an amplitude for a string process characterized by the Riemann surface
which underlies the instanton.

Below we show that MST in the strong coupling limit in the background of a given stringy
instanton solution reduces to the Green-Schwarz superstring theory plus a decoupled Maxwell
theory, and compute string interaction amplitudes in such a background. Since the latter interpolates
between an initial and a final string configuration via a punctured Riemann surface $\Sigma$ (which
represents a branched covering of the base cylinder), the amplitudes can be interpreted in a string
theoretic way as the transition amplitudes between two such configurations. The purpose of this
section is to show that their leading term is proportional to $g_s^{-\chi}$, where $\chi = 2 - 2h - n$ is the
Euler characteristic of the Riemann surface of genus $h$ with $n$ punctures, which characterizes the
given classical solution. This is the result one expects from perturbative string interaction theory
and proves beyond any doubt the relation between MST at strong coupling and string interaction
theory.


6.1 Expansion about a Riemannian instanton 

What we want to do is to split every field $\Phi$ in the action (21) into a classical instanton background
part $\Phi^0$ and a quantum fluctuating part $\phi$,

\Phi = \Phi^0 + \phi (39)

and expand the action. The instanton action vanishes, as we have seen above, and the piece of the
action linear in $\phi$ vanishes due to the equations of motion. We want to evaluate what remains in
the strong coupling limit, by considering an expansion of the action in $1/g$.

As a first step let us analyze the background part. To this end let us recall a few facts from the
previous section. The dependence on the coupling is entirely contained in the factor $L$. We have
seen that in the strong coupling limit $L \to 1$ outside the branch points of the covering. Since here
we are interested in expanding the action (21) in powers of $1/g$, and actually in singling
out the dominant term in this expansion (see below), we will consider the action (21) around a
given classical solution stripped of the above dressing factor, and exclude from the integration
region the branch points on the cylinder. In other words we will consider from now on the action
(21) in which the relevant $Y$ is replaced by $K$ and the integral extends over $C_0$, which is the initial
cylinder $C$ from which small disks have been cut out around the branch points. Said otherwise, we
introduce in our integrated action a regulator (which will eventually be removed).

After getting rid of the dressing factor, the classical background configuration is specified by
$X = K^{-1}MK$ and by the corresponding connection $A$. As expected, this configuration is singular
exactly at the branch points. We have seen that $\hat M = S^{-1}MS$ is the matrix of eigenvalues of $M$ and of $X$,
so we denote it equivalently by $\hat X$. In the strong coupling limit $X \to U\hat X U^{-1}$, where $U = K^{-1}S$
is a unitary matrix and therefore simultaneously diagonalizes $X$ and $\bar X$. $U$ is singular at the
branch points both in the sense that it may diverge there and that it undergoes a monodromy
transformation upon going around a branch point. Corresponding to $\hat X$ we have a connection which, as
was shown above, vanishes everywhere, even at the branch points. What has happened is that the
unitary transformation has swallowed entirely the connection, including the singularities.

$U$ is singular at the branch points, but finite (and multi-valued) in $C_0$. Therefore, with a gauge
transformation, we can remove the unitary factor $U$ from the action defined on $C_0$. This leads us
to the

• diagonal representation: $X$ diagonal and $A = 0$

for the classical background in the strong coupling limit.

A comment about our use of singular gauge transformations is in order. As we have stressed
several times, the classical configuration specified by a given couple $(X, A)$ is smooth on the initial
cylinder $C$, but if we strip the solution of the dressing factor we get a configuration which is
singular exactly at the branch points. The dressing factor is there exactly to compensate for
these singularities: $L - 1$ has support only at the branch points in the strong coupling limit.
Therefore we replace $C$ with $C_0$ by excluding the branch points from the integration region to
preserve smoothness. This is justified by the following consideration. Besides the initial smooth
configuration $(X, A)$ on $C$, we will meet another smooth situation when we lift our theory to the
branched covering $\Sigma$ of $C$ (see below). At that stage the regulator can be removed and the branch
points can be restored in the integration region. What has happened is that, in order to pass to
the covering, we need a singular gauge transformation, the $U$ used above, which exactly kills the
singularity exposed by the strong coupling limit. The transformation $U$ is not single-valued, since
it picks up a phase when crossing a cut. In other words, it is natural to perform this singular gauge
transformation if we want to reach a smooth situation which is fit for field theory.

Let us return now to the action (21) with the above understanding of the background part. To
extract the strong coupling effective theory, we first rewrite the action in the following useful form

5 = ^JcPwTt (d^X^D^X^ - ^[X\X-’f - g‘^[X^,X\[X\Y\ - '^XD^Di.X 

-^XD^D^X - ^ + i{6,D^e, + + ige^Tiix^ e]j , 

where $I = 3, 4, \ldots, 8$. We now expand the action around a generic instanton configuration as in
(39), but we further split the quantum field $\phi$:

\Phi = \Phi^0 + \phi^t + \phi^n (40)

where $\Phi^0$ is the background value of the field at infinite coupling, $\phi^t$ are the fluctuations along the
Cartan directions and $\phi^n$ are the fluctuations along the complementary directions in the Lie algebra
$u(N)$. Of course only the upper case fields $X$, $\bar X$ and $A$ will have in general a non-zero background
value. In the following we suppose we have carried out the operation described above, and by
background value we refer to the diagonal representation.

As we have already noticed, the expansion of the action starts with quadratic terms in the 
fluctuations. 

6.2 Fixing the gauge and integrating along the non-Cartan directions 

To proceed further let us fix the gauge. We use, in the strong coupling limit, the following gauge,
inspired by the one used in [35]

Gww = + igH[X°, x] + [X°, x]) + 2ig^[X°^, x^] = 0 , (41) 

where $D^0$ is the covariant derivative with respect to $A^0$. Next we introduce the Faddeev-Popov
ghost and antighost fields $c$ and $\bar c$ and expand them like all the other fields. Then we add to the
action the gauge fixing term




( 42 ) 





and the corresponding Faddeev-Popov ghost term 

where $\delta$ represents the gauge transformation with parameter $c$.

At this point, to single out the strong coupling limit of the action, we rescale the fields in
an appropriate manner. Precisely, we redefine our fields as follows

A^=gal+al, Z=X + i‘ + -i", + ix^", 

9 9 s/9 

and likewise for the conjugate variables. For the ghosts we set 

c = gc^ c = gc^+ . 

V9 

One may wonder why we choose such rescalings and not others. A partial answer is in the remark 
that these rescalings introduce a unit Jacobian in the path integral measure of the non-zero modes, 
although they may produce a non-trivial factor due to the presence of zero modes (see below). 
However this is not a sufficient criterion to completely fix the rescalings. Let us say that this 
operation leads us to a sensible strong coupling limit and we can take it as our definition of the 
strong coupling theory. 

After these rescalings the action becomes 


where 


S = Ssc + “{" o 




Tr -h d^x^d^x^ -b i{9ldiju6l + 9ldyj6^^) 

~\~dyja^d^CL^ “b dxpC d^jC J 


(44) 


$S_n$ is the purely quadratic term in the $\phi^n$ (non-Cartan) fluctuations. The latter can be easily
integrated over and, since they do not involve zero modes, give exactly 1. Let us see this in detail.

$S_n$ has the form


5 


n 


J d^wTr 


x^Qx" + x^^Qx^" + alQal + 



where 


(45) 


— ad o * adx*® d” ad^t_ * ad^^ ~b * ad^n 

and 

( ^ado^ jMxoi A 

\^7iadjfoi ^a,da;^ J 

In the path integral we can now integrate over the non-Cartan modes and obtain a ratio of determi- 
nants of A and Q. Since these operators do not have zero modes the calculation is elementary. The 
integration over and the conjugates exactly cancels the integration over and the conjugates. 

What remains is a ratio ((Det.4)^®/(DetQ)®)^ ^ The expression of the numerator is formal: 

one should understand Det«4 as '\/Det(-'^v4l). But AA^ = A^A = — Q. Therefore the net result 
of integrating over the non-Cartan modes is 1. This is the result expected from supersymmetry in 
the absence of zero modes. 

In conclusion, in the strong coupling limit we are left with the quadratic action (44) over the 
Cartan modes. 





6.3 Lifting to the branched covering

Let us now show that the effective theory we obtained in the previous subsection corresponds
to the Green-Schwarz superstring plus a free Maxwell action on the worldsheet identified by the
branched covering of the relevant background. To this end we recall the quadratic action (44).
Since all the matrices involved are diagonal we can rewrite this action in terms of the diagonal 
modes (j>^ = 

Ssc--f i{^s(n)dwOs{n) + ^c(n)5i£;^c(n)) 

TT /co L ^ ^ ^ ^ 

'^^w^wln)^w^w{n) "h * (46) 

This is a theory of free fields on $C_0$ and it is tempting to extend the action to $C$ by just forgetting
the punctures on the cylinder corresponding to the branch points. However this is not correct.
The fields $x^i$ are not single-valued on the cylinder. For example, upon going around a simple
branch point, at least one is mapped to another one, and this is precisely the way a joining or a
splitting of two strings is represented in this formalism [18], [19]. There are possibly many joinings
and splittings of strings in the process spanned by the instantons under consideration, and we can
repeat the above word by word for any string interaction point. What is suggested here is that the
fields $x^i$ (as well as the others) are not well defined on each sheet, but all together they form a well
defined field on the covering surface. Mathematically, this problem can be rephrased as follows:
the fields in (46) are not sections of bundles over $C$; however they can be combined to form sections
of line bundles on the covering $\Sigma$ of $C$.

At this point it is worth spending a few words about Hitchin systems. The Hitchin systems
we are interested in are defined starting from a $U(N)$ vector bundle $V$ over $C$, associated with the
fundamental representation of $U(N)$. They consist of couples $(A, X)$ where $A$ is a gauge connection
and $X$ a section of ${\rm End}\,V \otimes K$, where $K$ is the canonical bundle of $C$, which satisfy (25) and (26)
[28]. Such systems can be lifted to an $N$-sheeted branched covering of $C$ [29], [30], [32]. A remarkable feature
of the lifting is the appearance on the branched covering of a line bundle $L$ constructed out of $V$
and from which in turn $V$ can be reconstructed. In simple words the initial non-Abelian system
can be described by an equivalent Abelian system on the branched covering.

This is the same situation we are faced with here. In fact let us look at the realization of local
fields on a Riemann surface represented as a branched covering. If $\Sigma$ is a branched covering of the
cylinder $C$, then, as we have seen, there is a projection map $\pi : \Sigma \to C$ whose inverse image is
$N$-valued. In our language this is simply

\pi^{-1} : w \to (x_{(1)}(w), \ldots, x_{(N)}(w)) . (47)

Suppose a local complex field $\psi$ is given on $\Sigma$; applying the above construction $\psi$ can be represented
as an $N$-tuple $(\psi_{(1)}(w), \ldots, \psi_{(N)}(w))$ representing the field on each copy of the cylinder $C$ that
composes the covering $\Sigma$; the $\psi_{(i)}(w)$'s are related by the appropriate monodromy properties along
the cuts.

Going now back to the action (46), we see that we have to interpret any set of $N$ fields
$(\phi_{(1)}, \ldots, \phi_{(N)})$ in it as a unique field $\tilde\phi$ on the covering $\Sigma$. $\tilde\phi$ is locally a function of a local
coordinate on $\Sigma$. From the point of view of $\Sigma$, the $w$ coordinate is locally defined via an abelian differential
$\omega = dw$ with imaginary periods, which is canonical, i.e. is fixed only by the complex structure of
the surface (see [39], [40] and below).

Finally we can write the strong coupling action (46) as follows 

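S_{sc} = S_{GS} + S_{Maxwell} (48)

S_{GS} = \frac{1}{\pi} \int_\Sigma d^2z \left( \partial_z x^i \partial_{\bar z} x^i + i(\theta_s \partial_{\bar z} \theta_s + \theta_c \partial_z \theta_c) \right) (49)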

s Maxwell = ^ ^ {9""^ d~a,dsa^ + d^W^c) . (50) 





In (49) a factor has been absorbed in $\theta_s$ (resp. $\theta_c$), which is a $(\frac{1}{2},0)$ (resp. $(0,\frac{1}{2})$)
differential on $\Sigma$, and the Maxwell term contains the metric $g_{z\bar z}$. In the integrals in (49) we
have ignored the existence of small discs cut out around the branch points. This is allowed since
everything now is smooth on $\Sigma$ (we can remove the regulator introduced above).

Summarizing, what we obtained in this subsection is that the strong coupling effective theory
is given by the Green-Schwarz superstring action on the branched covering worldsheet plus a
decoupled Maxwell theory on the same surface. The fields in (49) are now sections of line bundles
over $\Sigma$, i.e. they are well defined fields on the Riemann surface: for example $a$ is a section of the
canonical bundle of $\Sigma$, and so on.

6.4 String amplitudes 

Let us compute first, for simplicity, the vacuum to vacuum amplitude of the SYM theory in the
strong coupling limit in the background of a given instanton. As we have already pointed out several
times, this amplitude (up to the vertex string insertions, see below) has a string interpretation as
the amplitude for the transition from the initial to the final string configuration described by
the instanton. If this interpretation is correct this amplitude, to the leading order, should be
proportional to $g_s^{-\chi}$, where $\chi$ is the Euler characteristic of the Riemann surface $\Sigma$, i.e. the covering
surface introduced above.

What remains for us to do in order to evaluate this amplitude is to integrate over the Cartan
modes in the functional integral with action (49) (the non-Cartan modes have been integrated out
above). Since the action is free, the integration produces a ratio of determinants, which turns out
to be a constant. However we have to take account of the zero modes for the fields that have been
rescaled (the unrescaled zero modes are irrelevant in this argument). The rescaled fields on $C$ are
the Maxwell and the ghost fields. The corresponding fields on $\Sigma$ will be rescaled too:

a_z \to g a_z , \quad a_{\bar z} \to g a_{\bar z} , \quad c \to g c , \quad \bar c \to g \bar c . (51)

Therefore let us single out the Maxwell (plus ghost) partition function. We will show that the
decoupled $U(1)$ theory is there to generate the stringy factor $g_s^{-\chi}$ as a consequence of the rescaling
(51).

In fact under this rescaling the Maxwell partition function ($a = a_z$, $\bar a = a_{\bar z}$)

Z_{Maxwell} = \int \mathcal{D}[a, \bar a, c, \bar c] \, e^{-S_{Maxwell}}

rescales with a factor depending on the zero modes. Roughly speaking, what happens is that the
above integral is interpreted as the ratio $({\rm Det}'\Box_c)/({\rm Det}'\Box_a)$, where $\Box = \partial\bar\partial$ denotes the quadratic
operator in the action, and $'$ means that the zero modes have been excluded from the computation
of the regularized determinants; since we have rescaled the measure there will arise a factor of $g$
to a power equal to the unbalance of the zero modes (a more precise account of this point can
be found in [36]). The problem is therefore to count the latter. As for the ghost fields, which are
scalars, the only zero mode of the $\partial$ operator on $\Sigma$ is the constant. The zero modes of the Maxwell
fields correspond to the holomorphic differentials on $\Sigma$. If $\Sigma$ were a closed Riemann surface of genus
$h$, their number would be $h$. Their counting in the present case is not completely standard as
$\Sigma$ is actually a Riemann surface with punctures (representing the in- and out-strings). For the
purpose of such counting we can replace punctures with boundaries, since the Euler characteristic
does not change. A way to do the counting is to construct the double $\hat\Sigma$ of $\Sigma$: $\hat\Sigma$ has genus
$\hat h = 2h + b - 1$, where $b$ is the number of boundaries, and admits an anticonformal involution with
the set of fixed points corresponding exactly to the boundary of $\Sigma$. We can now count the number
of analytic differentials on $\Sigma$ that extend to $\hat\Sigma$, that is the so-called analytic Schottky differentials
[37]: their number is $\hat h$. Therefore the overall unbalance of zero modes (including the ghosts) is
$\hat h - 1 = 2h + b - 2$ (or, equivalently, $2h + n - 2$, if $n$ is the number of punctures, which is more
appropriate in our case). This is exactly the opposite of the Euler number of $\Sigma$. An equivalent
way of deriving this result is to use the Gauss-Bonnet theorem on $\hat\Sigma$ and to note that, due to the
involution, the integral of the curvature over $\Sigma$ is one half of the total contribution.

Finally the factor in front of the vacuum to vacuum amplitude will be $g^{\chi} = g_s^{2h+n-2}$.

The exponent of $g$ is precisely the Euler characteristic of $\Sigma$, as we wanted to prove.
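
The zero-mode counting above is simple arithmetic and can be spelled out explicitly. The sketch below assumes only the relations quoted in the text: $\chi = 2 - 2h - n$, the genus $2h + b - 1$ of the double of a surface of genus $h$ with $b$ boundaries, one constant ghost zero mode, and punctures traded for boundaries ($b = n$). The function names are illustrative.

```python
def euler_characteristic(h, n):
    """Euler characteristic of a genus-h surface with n punctures (or boundaries)."""
    return 2 - 2 * h - n

def genus_of_double(h, b):
    """Genus of the double of a genus-h surface with b boundary components."""
    return 2 * h + b - 1

def zero_mode_unbalance(h, n):
    """Maxwell zero modes (analytic Schottky differentials) minus ghost zero modes."""
    maxwell_zero_modes = genus_of_double(h, n)  # punctures replaced by boundaries, b = n
    ghost_zero_modes = 1                        # the constant mode of the scalar ghosts
    return maxwell_zero_modes - ghost_zero_modes

# The unbalance equals minus the Euler characteristic for every (h, n) tried here.
for h, n in [(0, 2), (0, 3), (1, 2), (2, 4)]:
    assert zero_mode_unbalance(h, n) == -euler_characteristic(h, n)
    print(h, n, euler_characteristic(h, n), zero_mode_unbalance(h, n))
```

For instance the cylinder itself, $(h, n) = (0, 2)$, gives a vanishing unbalance, while the basic splitting or joining process, $(h, n) = (0, 3)$, gives an unbalance of one.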

In order to appreciate exactly what we have just computed we must now specify what it
corresponds to in string interaction theory. In this sense the amplitude we have just computed in
the strong coupling limit is a basic amplitude but, of course, an incomplete one.

First of all, real string amplitudes should contain vertex string insertions, i.e. should be
correlators of the vertex operators corresponding to the various in- and out-(super)strings. In this
regard we simply remark that such vertex operators are constructed in terms of the string fields
$x^i$ and $\theta$; therefore the treatment of the non-Cartan modes above is not affected and the discussion
of the zero modes of the Maxwell sector is unchanged. Therefore the scaling factor $g_s^{-\chi}$ is left
unchanged too.

Moreover, in order to obtain complete amplitudes, we must still integrate over the moduli of
the Hitchin systems, i.e. over the inequivalent Riemannian instantons that interpolate between
a given initial and a given final state. If we want to implement such a more advanced stage of the
calculation, we have to take into account in the measure some Jacobians that are produced by
the various field splittings we have considered above. Following [88], the background/fluctuations
splitting of the fields in the path integral generates a Jacobian $J_{b/f}$. In an analogous way the
Cartan/non-Cartan splitting gives rise to another Jacobian factor $J_{c/nc}$. These factors are easily
seen to depend only on the Cartan modes of $X$ and $\theta$. This, in particular, implies the validity of
the procedure we used for the integration over the non-Cartan modes.

So let us introduce the vertex operators $V_1, \ldots, V_n$, corresponding to $n$ incoming and outgoing
strings, expressed in terms of $x^i$, $\theta$, and of the string transverse momenta, and insert them into the
path integral. The genus $h$ amplitude (in the strong coupling limit) will schematically be:

\langle V_1, \ldots, V_n \rangle_h = g_s^{-\chi} \int dm \int \mathcal{D}[x, \theta] \, J_{b/f} J_{c/nc} V_1 \ldots V_n \, e^{-S_{GS}} , (52)

We have singled out the integration over the moduli space, namely over all distinct instantons which underlie
the given string process for fixed $N$, that is to say with assigned incoming and outgoing strings
and string interactions. In ordinary string interaction theory this is nothing but the moduli
space of Riemann surfaces of genus $h$ with $n$ punctures, a complex space of dimension $3h - 3 + n$.
What it actually is in MST is the main subject of the next sections.

However, before we pass to this subject, we owe one more comment on the Maxwell action. As
we have already said, the role of $S_{Maxwell}$ in (52) is to ensure the correct factor $g_s^{-\chi}$ in front of the
amplitude. For the rest its integration simply gives a number in front of the amplitude, since the
Maxwell and ghost modes do not interact with the other modes.

6.5 Summary 

In this section we have seen that by expanding the MST action about a Riemannian instanton one
gets, in the strong coupling limit, the Green-Schwarz action plus the free Maxwell action over the
Riemann surface supporting the instanton. If this Riemann surface has genus $h$ and $n$ punctures,
the path integral is proportional to a factor $g^{-2h-n+2} = g_s^{2h+n-2}$, the correct factor one
expects from string interaction theory for a string process mediated by such a Riemann surface.
What remains now to examine is how general this result is compared to what is required by string
interaction theory. To this end a closer analysis of branched coverings cannot be further postponed.


7 Riemann Surfaces as Plane Curves 

It is time to turn to a more careful description of the second ingredient of stringy instantons, i.e.
to branched coverings of the cylinder (or the Riemann sphere with two punctures). This section
as well as the following one is rather technical. By this I mean that the relevant mathematical
concepts and terminology are not frequently met, for the time being, in the theoretical physics
literature. However, even though I will try to keep a low lexical profile, a complete understanding
of the problems we intend to cope with here requires the use of such concepts and terminology.

The purpose of this section is to explicitly show how the Riemann surfaces that are necessary in
string interaction theory arise as branched coverings of the cylinder. To this end let us return to the
definition in section 3.5, in particular eq. (32) or (33). Branched coverings make their appearance
in MST as solutions of affine equations

P(y, z) = \sum_{p,q} a_{p,q} \, y^p z^q = 0 , \qquad (y,z) \in \mathbb{C}^2 , (53)

where $P$ is a polynomial of degree $N$. Actually, from eq. (32) it follows that $P_X$ has degree $N$ in
$y$, but the $a_i(z)$'s could be any analytic functions on the punctured Riemann sphere. This means
that they could be expressed by means of Laurent series in $z$. However in order to preserve the
string interpretation we will limit ourselves to $a_i(z)$'s which are Laurent polynomials. Even more,
in the following we will explicitly consider only $a_i(z)$'s which are polynomials in $z$ in such a way
that $P(y,z)$ has overall degree $N$. This renders our discussion less general but far simpler.

The locus in $\mathbb{C}^2$ of the solutions $(y, z)$ of (53) is a plane curve. The independent non-vanishing
coefficients $a_{p,q}$ can be varied without changing, in general, the topological type $(h, n)$ of the
curve, where $h$ is the genus and $n$ is the number of punctures of the curve. They are the moduli
of the plane curve. Counting them is necessary in order to see whether the moduli space of MST
coincides with the moduli space of IIA superstring theory, or more realistically to what extent
the former approximates the latter.

In this section we discuss plane curves, in the next section their moduli. 

The literature on plane curves, and, more generally, on algebraic curves is vast (see for instance 
[41, 42, 43]), and we will be using many well-known results. However one should bear in mind a 
peculiarity of our problem which is not usually considered in the textbooks on the subject: the 
question of punctures. So let us discuss first how to generate punctures. Later on we will see how 
to produce curves with non-zero genus. It will soon be clear that the generic plane curves from 
MST are in fact singular. 

7.1 Punctures on plane curves 

We recall that we interpret the Riemann surface defined by the relevant branched covering of the 
cylinder as the classical carrier of a string process. The basic information about branched coverings 
is of course determined by the branch points: this information is contained in the discriminant. 
The discriminant $\delta$ of (32) is proportional to $\Delta^2$, where $\Delta$ was defined in (37). The zeroes of the
discriminant define the branch points and their multiplicity gives the multiplicity of the branch 
points, where the multiplicity or ramification index of a branch point is defined as the number of 
sheets which come together at that point, minus one; therefore branch points that involve only two 
sheets are called simple. 
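
The statement that the discriminant is the square of the product of eigenvalue differences can be checked directly for low $N$. The following sketch assumes the standard closed-form discriminants of monic quadratic and cubic polynomials and builds the coefficients from chosen roots via Vieta's formulas; the function names and the sample roots are illustrative only.

```python
from itertools import combinations

def delta(roots):
    """Product over i < j of (root_i - root_j)."""
    out = 1
    for i, j in combinations(range(len(roots)), 2):
        out *= roots[i] - roots[j]
    return out

def quadratic_coefficients(r1, r2):
    """(a1, a0) of y^2 + a1 y + a0 = (y - r1)(y - r2)."""
    return -(r1 + r2), r1 * r2

def cubic_coefficients(r1, r2, r3):
    """(a2, a1, a0) of y^3 + a2 y^2 + a1 y + a0 = (y - r1)(y - r2)(y - r3)."""
    return -(r1 + r2 + r3), r1 * r2 + r1 * r3 + r2 * r3, -r1 * r2 * r3

def quadratic_discriminant(a1, a0):
    return a1 * a1 - 4 * a0

def cubic_discriminant(a2, a1, a0):
    return (a2 * a2 * a1 * a1 - 4 * a1 ** 3 - 4 * a2 ** 3 * a0
            - 27 * a0 * a0 + 18 * a2 * a1 * a0)

# Distinct roots: the discriminant equals the square of the product of differences.
assert quadratic_discriminant(*quadratic_coefficients(3, 7)) == delta([3, 7]) ** 2
assert cubic_discriminant(*cubic_coefficients(0, 1, 2)) == delta([0, 1, 2]) ** 2

# Two coinciding roots: the discriminant vanishes, signalling a branch point.
assert cubic_discriminant(*cubic_coefficients(1, 1, 5)) == 0
```

When the roots depend on $z$, the zeroes of the discriminant as a function of $z$ are then exactly the points where at least two sheets meet.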

The branch points at $z \neq 0, \infty$ represent joining and splitting processes of the string. Generically,
when the branch point is simple, we have the joining of two strings to form a unique string
or the splitting of one string into two. We may also have multiple branch points, in which more
than two incoming or outgoing strings are involved. However the latter are limiting cases of the
former and, from now on, as far as branch points at $z \neq 0, \infty$ are concerned, the emphasis will be
on simple branch points.

The inverse images under $\pi$ of $z = 0$, $z = \infty$ are punctures in $\Sigma$ with a definite string interpretation:
they represent the points where incoming strings enter (outgoing strings leave) the process
represented by the Riemann surface $\Sigma$.

It has to be kept in mind that, in MST, the counterimages of $z = 0$ and $z = \infty$ are distinguished
points with an associated physical meaning. This is to be contrasted with the usual mathematical
treatment of branched coverings of $\mathbb{CP}^1$, where these points do not play any particular role. This
remark will become extremely important below, in connection with the discussion about moduli
space.

Let us discuss further properties of punctures corresponding to $z = 0$ (an analogous discussion
holds for $z = \infty$). The counterimages of $z = 0$ under $\pi$ may be $N$ distinct points, i.e. the solutions of
the algebraic equation (32) at $z = 0$ may be all distinct. In such a case we say we have $N$ small
incoming strings (of length 1 each). This is the case of the two incoming strings of the example
considered in section 2. However, in general, the inverse image of $z = 0$ may contain several branch
points $y_1, \ldots, y_s$ with multiplicity $l_1 - 1, \ldots, l_s - 1$, respectively (if $z = 0$ is a singular point
of eq. (32) it has to be desingularized first, see below). In this case the process represented by
$\Sigma$ involves $s$ incoming strings of length $l_1, \ldots, l_s$, respectively. The physical interpretation of the
string length has been given in [40, 13]. In the framework of the light-cone quantization of type
IIA superstring, the string length is identified with the light-cone momentum component $p^+$ of the
string, built from $p^0$ and $p^9$, in suitably normalized units. Here 0, 9 are of course the time and
longitudinal directions of the ambient space, which do not explicitly appear in (21).

Let us see an example. Suppose $y = y_1$ is a branch point of multiplicity $l - 1$ in the counterimage
of $z = 0$. This means that $l$ roots of (32) come together there; for example, they may behave like
$y_1 + \eta^i z^{1/l}$, with $i = 0, \ldots, l - 1$ and $\eta = \exp(2\pi i/l)$. In other words $l$ sheets of the covering
join along a cut starting at $y_1$. The
counterimage of a circle around $z = 0$ in the $z$-plane contains a curve around $y = y_1$ on the covering
that closes after crossing the cut $l$ times, i.e. we have an incoming string of length $l$. Therefore
an easy rule to compute the length of an asymptotic string at a branch point in the inverse image
of $z = 0$ is to count the number of sheets that meet there. Alternatively, this length can be seen
as the period (in units of $2\pi i$) of the differential $d\ln z$ around the point $y = y_1$ of the covering. In fact
$y - y_1$ is a good coordinate near $y_1$ and $d\ln z = l\, d\ln(y - y_1)$. The same conclusion can be drawn if the roots
behave like $y_1 + \eta^i z^{j/l}$, where $j$ and $l$ are relatively prime integers. A similar discussion can be
carried out for the counterimages of $z = \infty$ as well: for instance, in the $N = 2$ example of section
2, the point at $z = \infty$ is a simple branch point corresponding to a string of length 2.
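
The sheet-counting rule can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a local branch of the form $y - y_1 = z^{j/l}$ and follows its phase as $z$ circles the origin; the function name `turns_until_closure` is purely illustrative.

```python
import cmath
from math import gcd


def turns_until_closure(l: int, j: int = 1, max_turns: int = 1000) -> int:
    """Count the loops around z = 0 after which the branch z**(j/l) returns to itself.

    One loop z -> exp(2*pi*i) * z multiplies z**(j/l) by the phase exp(2*pi*i*j/l).
    The lifted curve closes when the accumulated phase is back to 1.
    """
    phase_per_turn = cmath.exp(2 * cmath.pi * 1j * j / l)
    phase = 1 + 0j
    for turns in range(1, max_turns + 1):
        phase *= phase_per_turn
        if abs(phase - 1) < 1e-9:
            return turns
    raise ValueError("branch did not close within max_turns")


if __name__ == "__main__":
    for l, j in [(3, 1), (5, 2), (4, 2)]:
        print(l, j, turns_until_closure(l, j), l // gcd(l, j))
```

When $j$ and $l$ are relatively prime the curve closes after exactly $l$ turns, which is the string length read off in the text; for a common factor the count drops to $l/\gcd(j,l)$, which is why the coprimality assumption matters.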

Summarizing, punctures are the sites on the embedded Riemann surface, that is on the
corresponding plane curve, where the incoming strings enter and the outgoing strings exit.^ They are
the counterimages by $\pi$ of $z = 0$ and $z = \infty$, respectively. If any such point on the plane curve is a
branch point of multiplicity $l - 1$, then the corresponding incoming or outgoing string has length
$l$. Incidentally, since eventually we want to take the large $N$ limit, we are especially interested in
the case when $l$ is comparable with $N$. In the ordinary treatment of compact Riemann surfaces,
if these points are regular, they are in no way special and must be considered on the same footing
as all the other regular points (this can be seen for example by using projective coordinates). In
our approach, on the contrary, this is not the case. As we have pointed out, the length of an
incoming or outgoing string is interpreted as the $+$ component of the momentum in the light-cone
framework. Therefore the multiplicities of the branch points in the inverse image of $z = 0, \infty$ have
a precise physical meaning. Two processes that differ by these multiplicities must be kept distinct,
even if, say, the topological type is the same.

This is the picture of MST at strong coupling. At finite coupling g the string interpretation of 
instantons persists, but the dressing factor L has the effect of blurring it by smearing the string 
interactions. 

7.2 Some examples 

Before we continue the general discussion of plane curves, let us present some concrete examples
of cases which are not unfamiliar in the physical literature.

We would like first to describe in detail how the genus zero (tree level) string interactions can
be reproduced with a suitable form of the coefficients in the spectral equation (32) or (53). In
the genus zero sector any Riemann surface $\Sigma$ is a punctured sphere, realized as an $N$-fold branched
cover of the $z$-sphere.

^In all the figures below we show the incoming and outgoing strings not as punctures, but as macroscopic strings
in order to stress their different lengths.

Figure 3: Tree level process of n to n' strings.

Assume we have $n$ incoming and $n'$ outgoing strings of lengths $l_i$ and $l'_j$ ($i = 1, \ldots, n$, $j = 1, \ldots, n'$),
respectively (see the figure). From a physical point of view, we have seen that the length of a string
is interpreted as the $+$ component of its light-cone momentum. We recall that the relation

$$\sum_i l_i = \sum_j l'_j \qquad (54)$$

must hold due to conservation of the momentum. We have also seen that the length of an incoming
string $i$ being $l_i > 1$ means that the cover has a branch point of order $l_i - 1$ at $z = 0$, and likewise for
outgoing strings at $z = \infty$.

Our aim here is to construct a polynomial $P$ which underlies such a string process. Let us
tackle this problem by studying the $N$-fold covering as a holomorphic projection from $\Sigma$ to $CP^1$.
As we have already noticed, the coordinate $z$ represents such a projection as a meromorphic
function on $\Sigma$: punctures manifest themselves as zeroes or poles of appropriate orders $l_i$, $l'_j$. The
condition (54) means in this picture that the number of zeroes minus the number of poles, counted with
multiplicity, is zero (this is the degree of the corresponding divisor).

Proceeding in this direction, we construct the generic meromorphic function in terms of a global
coordinate on $\Sigma$, which we can take to be $y$ itself. This is a useful simplification, which is not
possible in higher genus cases.

The generic meromorphic function satisfying the above requirements on zeroes and poles is
given by the following rational map:

$$z = K\, \frac{(y - y_1)^{l_1} (y - y_2)^{l_2} \cdots (y - y_n)^{l_n}}{(y - y'_1)^{l'_1} (y - y'_2)^{l'_2} \cdots (y - y'_{n'})^{l'_{n'}}} \qquad (55)$$

This map depends on $n + n'$ parameters, in addition to the constant $K$: it fixes the $n + n'$ punctures
on $\Sigma$ to be located at the points $y_i$ and $y'_j$. The case of $y_i$ or $y'_j = \infty$ is a limiting case of the above
formula in which the relevant factor is absent. Let us verify that (55) gives the right behaviour at
$z = 0$ and $z = \infty$, see [19]. An example will suffice. Near $y_1$ we can write $z \sim (y - y_1)^{l_1}$, therefore
$y \approx y_1 + z^{1/l_1}$, which is exactly the behaviour considered above.
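
The local behaviour near a puncture can also be probed numerically. The sketch below is only an illustration under stated assumptions: it evaluates a rational map of the form just described for hypothetical puncture positions and lengths, and estimates the exponent $m$ in $z \sim (y - y_0)^m$ from two nearby samples. The helper names `rational_map` and `local_order` are not from the source.

```python
import math


def rational_map(y, zeros, poles, K=1.0):
    """Evaluate z(y) = K * prod (y - y_i)**l_i / prod (y - y'_j)**l'_j.

    zeros and poles are lists of (position, length) pairs.
    """
    z = complex(K)
    for y_i, l_i in zeros:
        z *= (y - y_i) ** l_i
    for y_j, l_j in poles:
        z /= (y - y_j) ** l_j
    return z


def local_order(zeros, poles, y0, eps=1e-4):
    """Estimate m in z ~ (y - y0)**m by comparing |z| at distances eps and eps/2."""
    z1 = abs(rational_map(y0 + eps, zeros, poles))
    z2 = abs(rational_map(y0 + eps / 2, zeros, poles))
    return math.log(z1 / z2) / math.log(2.0)


if __name__ == "__main__":
    # Hypothetical data: incoming lengths (2, 1), one outgoing string of length 3.
    zeros = [(0.0, 2), (1.0, 1)]
    poles = [(-1.0, 3)]
    print(local_order(zeros, poles, 0.0))   # close to +2
    print(local_order(zeros, poles, -1.0))  # close to -3
```

The estimated exponents reproduce the string lengths, positive at incoming punctures and negative at outgoing ones, and the example data respect the conservation law that the incoming and outgoing lengths add up to the same total.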

Now we can make a first exercise of moduli counting. Let us recall that the moduli space of
the Riemann sphere with $p$ punctures has dimension $p - 3$. To count the moduli in (55), we first notice that we
have $n + n' + 1$ free parameters. Of these, $K$ corresponds to a rescaling of the $z$ coordinate; then we
can use $PSL(2, C)$ to reabsorb three parameters among the $y_i$, $y'_j$. As a result the meromorphic
function describes spheres with $n + n' - 3$ moduli, as expected.

Now, in order to see whether these curves are reproduced within MST, we try to cast (55) in
the form (33). One sees immediately that (33) corresponds to curves where one of the outgoing
punctures is at infinity, say $y'_1 = \infty$. Given that, the above map is indeed of the form of (33) with
coefficients which are at most linear in $z$:

$$a_i = \alpha_i z + \beta_i . \qquad (56)$$

The generic polynomial of this form corresponds to a curve which has all $l_i, l'_j = 1$, i.e. it has no
branch points at the $2N$ punctures, and depends on $2N$ parameters. Of these, three can be ignored,
since they correspond to transformations that leave $y'_1 = \infty$: a rescaling of $z$, a shift of $y$ and a
rescaling of $y$. The latter two are the remnant of $PSL(2, C)$ which keeps $y'_1 = \infty$.

Therefore (33), or (53), contains the right $2N - 3$ moduli of spheres with $2N$ punctures.

The cases in which some punctures are branched are limiting cases of the previous curve, in which two
or more punctures coincide. This can be easily seen from the meromorphic map (55). Therefore,
for each $l_i > 1$ we have to enforce $l_i - 1$ conditions on the parameters $\alpha_i, \beta_i$ of the spectral equation,
and likewise for each $l'_j > 1$. Thus the free parameters are, as expected:

$$2N - \sum_{i=1}^{n} (l_i - 1) - \sum_{j=1}^{n'} (l'_j - 1) - 3 = n + n' - 3 .$$

We conclude that at genus zero MST reproduces, via (53), the full $n + n' - 3$ moduli.
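
The counting can be spelled out for any assignment of string lengths. The following sketch simply implements the arithmetic of the formula above; the function name `genus_zero_moduli` and the sample length lists are illustrative choices, not taken from the source.

```python
def genus_zero_moduli(incoming, outgoing):
    """Free parameters of the genus zero curve for given string lengths.

    incoming, outgoing: lists of positive integers l_i and l'_j.
    Starts from the 2N parameters of the unbranched curve, removes l - 1
    conditions per branched puncture and the 3 residual symmetries.
    """
    N = sum(incoming)
    if N != sum(outgoing):
        raise ValueError("lengths must satisfy sum(l_i) == sum(l'_j)")
    conditions = sum(l - 1 for l in incoming) + sum(l - 1 for l in outgoing)
    return 2 * N - conditions - 3


if __name__ == "__main__":
    for inc, out in [([1, 1], [2]), ([2, 1], [1, 1, 1]), ([3, 2], [4, 1])]:
        print(inc, out, genus_zero_moduli(inc, out), len(inc) + len(out) - 3)
```

In every case the result equals $n + n' - 3$, because $\sum_i (l_i - 1) = N - n$ and $\sum_j (l'_j - 1) = N - n'$.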

In the case of curves with non-vanishing genus, one would be tempted to proceed in the same
way, that is to construct the meromorphic projection $\Sigma \to CP^1$ and then invert it. It is rather easy
to construct the meromorphic function $z$ corresponding to genus 1. However, we immediately come
across a novel feature which was absent in genus 0 and has dramatic consequences for the moduli
counting.

The point is that the punctures on $\Sigma$, represented as zeroes and poles of the meromorphic
function, cannot be arbitrary. This is a feature of the torus and of higher genus curves. There is 
a condition that they have to satisfy, which is the price we have to pay to be able to represent the 
punctured surface as an algebraic curve. Mathematically speaking, the divisor of a meromorphic 
function is not a generic divisor of degree zero, but is a principal one, which amounts to some extra 
condition on the punctures. The same condition was absent on the Riemann sphere because there 
every divisor of degree zero is principal. 

To see which condition appears, let us represent explicitly the meromorphic function using a
coordinate $t$ taking values in the fundamental parallelogram. On a torus a meromorphic function
can be represented as the ratio of products of translated theta functions, eq. (57).

Now, for $z$ to be single valued, the $t_i$, $t'_j$ have to satisfy a condition, that is the vanishing of the
Abel-Jacobi map:

$$\sum_i l_i t_i - \sum_j l'_j t'_j = 0 \mod \Gamma , \qquad (58)$$

where $\Gamma$ is the group of periods, which for the torus is the usual lattice of complex translations:
$\Sigma = C/\Gamma$.
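
To make the origin of the single-valuedness condition concrete, here is a short worked sketch. It assumes one standard way of writing a ratio of translated theta functions, with the odd Jacobi theta function $\theta_1(t|\tau)$ normalized so that the periods are $1$ and $\tau$; this is offered as an illustration and is not necessarily the author's exact expression.

```latex
\begin{align*}
z(t) &= K\,\frac{\prod_{i}\theta_1(t - t_i|\tau)^{l_i}}{\prod_{j}\theta_1(t - t'_j|\tau)^{l'_j}},
\qquad \sum_i l_i = \sum_j l'_j ,\\
\theta_1(t+1|\tau) &= -\theta_1(t|\tau), \qquad
\theta_1(t+\tau|\tau) = -e^{-i\pi\tau - 2\pi i t}\,\theta_1(t|\tau),\\
z(t+1) &= (-1)^{\sum_i l_i - \sum_j l'_j}\, z(t) = z(t),\\
z(t+\tau) &= \exp\!\Big[2\pi i\Big(\sum_i l_i t_i - \sum_j l'_j t'_j\Big)\Big]\, z(t).
\end{align*}
```

With this choice of representatives, $z$ is doubly periodic precisely when $\sum_i l_i t_i - \sum_j l'_j t'_j$ is an integer; shifting the points by lattice vectors turns this into the statement that the combination vanishes modulo the full lattice.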

It is instructive to look at the case of the propagator of a long string at genus one. In this case
we have $l_1 = l'_1 = N$, while $l_i = l'_i = 0$ for $i \neq 1$. We have the insertion on the torus of an incoming
and an outgoing string of length $N$, at two points. By translation we can bring one of them to the
origin, so that the two points sit at, say, $t_1 = 0$ and $t'_1 = t$. The above condition is in this case:

$$N t = 0 \mod \Gamma , \qquad (59)$$

and we see that $t$ has to lie on the lattice $\Gamma/N$ indicated in the figure. In this case of the torus with
two punctures one expects in general two complex moduli. Here we see that we have one complex
modulus $\tau$, implicitly contained in the $\theta$-function, plus one discrete modulus $t$. We can see from
here that at finite $N$ there are some limitations on the possible diagrams we can realize; however,
as $N$ becomes large, the lattice $\Gamma/N$ fills the plane and we recover the continuous modulus.
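
The discrete modulus can be enumerated explicitly. The sketch below assumes the lattice is generated by $1$ and $\tau$ (with $\mathrm{Im}\,\tau \neq 0$) and lists the $N^2$ inequivalent solutions of $N t = 0$ mod the lattice inside one fundamental cell; the function names and the sample value of $\tau$ are illustrative.

```python
def discrete_positions(N, tau):
    """All t = (a + b*tau)/N with 0 <= a, b < N: the points of Gamma/N in one cell."""
    return [(a + b * tau) / N for a in range(N) for b in range(N)]


def in_lattice(w, tau, tol=1e-9):
    """True if w = m + n*tau for some integers m, n (requires Im tau != 0)."""
    n = w.imag / tau.imag
    m = (w - round(n) * tau).real
    return abs(n - round(n)) < tol and abs(m - round(m)) < tol


if __name__ == "__main__":
    tau = complex(0.3, 1.2)  # hypothetical modulus of the torus
    for N in (2, 4, 8):
        pts = discrete_positions(N, tau)
        print(N, len(pts), all(in_lattice(N * t, tau) for t in pts))
```

The number of allowed positions grows like $N^2$ while their spacing shrinks like $1/N$, which is the sense in which the discrete modulus becomes continuous at large $N$.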






Figure 4: Discrete modulus 

In the next section we will discuss in general the limitations of this kind. Therefore we leave 
this subject at this point and discuss other aspects concerning genus one curves. 

The next thing we would like to do is to mimic the genus 0 case by inverting eq. (57). This
is certainly possible locally, but, unlike the genus 0 case, we will not find in general a polynomial
equation of the type (32). Therefore constructing the meromorphic projection (57) gives us
only limited information about plane curves. In fact, what one expects is that the plane curve
corresponding to (57) is in general in a singular representation (see below).

It is then necessary to study singular plane curves. 

7.3 Plane curves and singular plane curves 

At the beginning of this section we have called plane curves the locus of points which are solutions
of an equation like (53) in $C^2$. This definition is too generic and lends itself to ambiguities. For
example, we know the coordinates $y$ and $z$ are not on the same footing in MST. A $z$ rescaling (at
strong coupling) is a symmetry of any process in MST, but no other $PSL(2, C)$ transformation on
$z$ is a symmetry transformation of a string process ($z \to 1/z$ is a symmetry transformation of the
theory, not of a single process). As for $y$, it is not clear which coordinate transformations are a
symmetry.

We resolve this and other ambiguities by embedding our curves in $CP^2$: we introduce the
homogeneous coordinates $x_0, x_1, x_2$ with $z = x_1/x_0$, $y = x_2/x_0$. By multiplying (32) by a suitable
power of $x_0$ we obtain the equation of the curve in $CP^2$ in the form

$$F(x_0, x_1, x_2) = 0 , \qquad (60)$$

where $F$ is the polynomial in $x_0, x_1, x_2$ determined by $P$.

Then the coordinate transformations that do not change the curve are in general those of
$PGL(3, C)$. However, as we said above, the points $z = 0, \infty$ should be fixed in MST. This means
that $x_0 = 0$ and $x_1 = 0$ should not be modified by any transformation. In conclusion the coordinate
transformations that give rise to physically indistinguishable processes in MST are those of the
subgroup $\mathcal{H} \subset PGL(3, C)$ defined by

$$\begin{pmatrix} x'_0 \\ x'_1 \\ x'_2 \end{pmatrix} = \begin{pmatrix} * & 0 & 0 \\ 0 & * & 0 \\ * & * & * \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix} . \qquad (61)$$

In terms of $y$ and $z$, these transformations include rescalings of $y$ and $z$ and linear transformations
$y \to y + az + b$, with complex constants $a$ and $b$. They are acceptable coordinate transformations,
which involve 4 complex parameters. This fits our counting of the independent parameters in the
previous subsection.
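
The action of such a matrix on the affine coordinates can be made explicit with a few lines of code. This is a minimal sketch assuming the identification $z = x_1/x_0$, $y = x_2/x_0$ and a lower-triangular matrix with the zero pattern shown above; the helper names and the sample matrix entries are hypothetical.

```python
def apply_matrix(M, x):
    """Multiply a 3x3 matrix M (list of rows) by the homogeneous vector x."""
    return tuple(sum(M[r][c] * x[c] for c in range(3)) for r in range(3))


def affine_action(M, y, z):
    """Induced map on (y, z), using z = x1/x0 and y = x2/x0."""
    x0, x1, x2 = apply_matrix(M, (1.0, z, y))
    return x2 / x0, x1 / x0


if __name__ == "__main__":
    # Rows: x0' = 2 x0, x1' = 3 x1, x2' = 1 x0 + 4 x1 + 5 x2 (illustrative entries).
    M = [[2.0, 0.0, 0.0],
         [0.0, 3.0, 0.0],
         [1.0, 4.0, 5.0]]
    print(affine_action(M, y=1.0, z=2.0))
    print(apply_matrix(M, (0.0, 1.0, 1.0)))  # a point on the line x0 = 0 stays on it
    print(apply_matrix(M, (1.0, 0.0, 7.0)))  # a point on the line x1 = 0 stays on it
```

The matrix has five free entries, and an overall rescaling acts trivially on projective space, which leaves the four complex parameters quoted in the text: one rescaling of $z$, one rescaling of $y$, and the two constants of the shift of $y$.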





From now on, although we keep speaking mostly in terms of $y$ and $z$, we always understand
the corresponding formulation in terms of $x_0, x_1, x_2$. For example, a transformation like $z \to 1/z$
must be accompanied by $y \to y/z$ in order for us to remain within $CP^2$. The latter is a compact
space, therefore embedding the curves in it means compactifying them by filling the punctures with
suitable points in $CP^2$. These points will however always remain distinct due to the particular
form of the symmetry subgroup $\mathcal{H}$ we have chosen.

Given a curve defined by (60), the points in it where all partial first order derivatives vanish are
singular points. When no singular point is present the curve is smooth. However this can happen
only when its genus is $\frac{1}{2}(d - 1)(d - 2)$, where $d$ is the degree of the curve, i.e. the degree of the
polynomial $P(y, z)$: in our case $d$ coincides with $N$. Therefore smooth plane curves cover only
a very limited subset of the curves we need. One can lower the genus of the plane curve, while
keeping the degree constant, by allowing for singularities. This means two important things: first,
for finite $N$ there exists an upper bound $\frac{1}{2}(N - 1)(N - 2)$ on the genus of the Riemann surfaces
which define the core of the stringy instantons; second, far from discarding singular curves, as one
would be tempted to do as a first approach, we have to take them into account, since they are bound
to fill up most of the moduli space of plane curves. As we will see, singular curves are a happy
occurrence, not a nuisance.
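
The genus bound, and the way singularities lower the genus, can be tabulated quickly. The sketch below uses the degree-genus formula quoted above together with the standard fact, assumed here rather than derived in the text, that each node of an irreducible plane curve lowers the genus by one; the function names are illustrative.

```python
def smooth_plane_curve_genus(d: int) -> int:
    """Genus of a smooth plane curve of degree d: (d - 1)(d - 2)/2."""
    return (d - 1) * (d - 2) // 2


def nodal_plane_curve_genus(d: int, nodes: int) -> int:
    """Genus of an irreducible degree-d plane curve whose only singularities are nodes."""
    g = smooth_plane_curve_genus(d) - nodes
    if g < 0:
        raise ValueError("too many nodes for an irreducible curve of this degree")
    return g


if __name__ == "__main__":
    for d in range(1, 8):
        print(d, smooth_plane_curve_genus(d))
    print(nodal_plane_curve_genus(4, 1), nodal_plane_curve_genus(4, 2))
```

For $N = 3$ the bound is genus one and for $N = 4$ it is genus three, so a quartic of genus two or one must carry singularities; the bound grows like $N^2/2$, which is why it stops being a restriction at large $N$.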

Singular curves play a major role in MST. For example, eq. (57) above, when written in
homogeneous coordinates, reveals a singularity corresponding to the point $z = \infty$. Singularities
can be resolved so as to recover smooth Riemann surfaces (which are not plane curves any more,
in general). Given a singular curve $\Sigma$ and the set $S$ of its singular points, a resolution of $\Sigma$ is
a smooth surface $\hat\Sigma$ (usually embedded in a larger space than the original curve), together with
a holomorphic projection $\varpi : \hat\Sigma \to \Sigma$, such that its restriction $\varpi : \hat\Sigma - \varpi^{-1}(S) \to \Sigma - S$ is a
biholomorphism.

In words, a resolution can be locally achieved by replacing a singular point by some space. A
handy way is to replace the singular point by a sphere - called exceptional divisor. This is the well- 
known procedure of blowing up the singularity. A well-defined algorithm allows us to compute the 
genus of the desingularized curve, see [21]. 

A few examples of plane curves, including singular ones are given in Appendix B. 

7.4 The role of singularities

It is impossible in general to represent Riemann surfaces by means of smooth plane curves
embedded in the two complex dimensional space spanned by the coordinates $y$ and $z$. One can say
that singular plane curves within stringy instantons are the ordinary tools MST uses in order to
reproduce the string interaction configurations required by string theory (actually, as will be seen
in the next section, only in the $N \to \infty$ limit is this completely true).

Far from representing a problem, singular plane curves are most welcome. They come with a
gratifying bonus: the solution of a serious problem for the identification of MST at strong coupling
with string theory. This identification is possible if string theory is formulated in the light-cone
gauge. In MST, (21), ten dimensions enter into the game: two world-sheet dimensions plus eight
transverse dimensions represented by the (diagonal) $X^i$. At first sight they seem to have a different
nature; however, it is clear that in a light-cone framework the two world-sheet dimensions are to
be interpreted as representatives of the time and longitudinal dimensions, denoted 0 and 9, which
bring the total of physical dimensions to ten. Now, stringy instantons characterized by a smooth
plane curve extend over four out of these ten dimensions. In other words it would seem that MST
at strong coupling can only describe four-dimensional string processes. If this were true it would
be hard to justify the correspondence between MST and string theory.

However, singular curves offer a solution to this problem. Singular curves become smooth if
one enlarges the space where they are embedded. The standard way to resolve a singularity is to
blow it up (see above), which means that a singular point is replaced by a two-dimensional sphere.
For example, we have already pointed out that curves in $CP^2$ with nodes only (a node is the simplest
possible type of singularity) can be smoothed out by embedding them in $CP^3$, i.e. by adding
two dimensions. It is natural to interpret this by saying that the corresponding string process
extends over six (instead of four) dimensions. It is not difficult to imagine processes that extend
over more (up to ten) dimensions. To better convince ourselves of this fact we can take the reverse
point of view. Suppose we want to embed these higher (than four) dimensional processes within
the instantons of the 2D field theory (21). The only possibility is to squeeze (project) them to the
appropriate four dimensions: such an operation of projecting gives rise to singularities. We suggest
that the true significance of singular plane curves lies in their representing higher (than four)
dimensional processes.

7.5 Summary 

In MST Riemann surfaces supporting string interactions make their appearance in the form of plane 
curves. Most of them are singular and need to be desingularized. The coefficients of the algebraic 
equations that define such plane curves are (up to symmetry identification) to be identified with 
moduli. The point is now to see to what extent the moduli of plane curves cover the moduli space 
of Riemann surfaces which are needed in string interaction theory. 

8 The moduli space of MST 

Above we have seen that the genus of plane curves in MST is limited, at finite $N$, by an upper
bound. There is another limitation to our capability to represent plane curves within MST which
comes from the presence of punctures. In fact the presence of punctures on the Riemann surface
entails the consequence that the moduli space of plane curves of genus $h$ with $n$ punctures is a
discretized version of the moduli space of genus $h$ Riemann surfaces with $n$ punctures, whose
complex dimension is $3h - 3 + n$. A good parametrization of the moduli space fit for string
interaction theory is provided by Mandelstam's variables, [40, 39]. By making a comparison with
Mandelstam's parametrization, we will find out that $h$ of the Mandelstam complex parameters are
actually discrete for the plane curves that appear in MST.

The origin of such discretization can be briefly described as follows. The coordinate $z$ we have
introduced above can be naturally regarded as a meromorphic function on a given plane curve
(it is a realization of the projection $\pi : \Sigma \to C$). The counterimages of $z = 0$ and $z = \infty$ form a
principal divisor in $\Sigma$. This entails, by Abel's theorem, $h$ discretizing conditions on the parameters
describing the plane curve. A detailed analysis shows that this forces $h$ of the Mandelstam
parameters to be discrete.

However, when $N \to \infty$ these discrete parameters become continuous and, in addition, the upper
limit on the genus we mentioned above becomes ineffective. Therefore for large $N$ MST recovers
the full moduli space of string theory. We recall that, for finite $N$, also the $p^+$ components of the
momenta of the incoming and outgoing strings are discrete, so that again continuity is recovered only
for $N \to \infty$. Therefore, a complete description of string interaction theory can be truly achieved by
MST only in the large $N$ limit. It is nevertheless remarkable that genus 0 processes (with discrete
$p^+$ components of the external momenta) are exactly described by MST also for finite $N$.

A very convenient way to proceed is to make a comparison with the Mandelstam parametrization
of the moduli space of Riemann surfaces with punctures [40]. To this end, let us first review
some basic facts about the realization of Mandelstam diagrams. We refer to [39] for a complete
account of the following very quick review, after which we will examine the consequence of the
main new input from MST, that is the holomorphicity of the covering map which defines the
Mandelstam diagram. The result will be a set of constraints on the kinematical data of the diagram which
turn out to be a quantization condition for some of the Mandelstam parameters. In the large $N$
limit these constraints lose their effectiveness and allow us to recover the full moduli space of
the string diagrams.

Let $\Sigma$ be a compact Riemann surface of genus $h$ and let $\omega_I$, $I = 1, \ldots, h$, be a set of holomorphic
differentials on $\Sigma$ normalized by $\oint_{\alpha_I} \omega_J = \delta_{IJ}$, while $\oint_{\beta_I} \omega_J = \Omega_{IJ}$ is the period matrix. We fix $n$
punctures $\{Q_1, \ldots, Q_n\}$ on $\Sigma$ and define the divisor $D = Q_1 \cdot \ldots \cdot Q_n$. We also introduce a set of
$n$ real numbers $R = \{r_1, \ldots, r_n\}$ such that $\sum_i r_i = 0$.

Now, let $\omega$ be the differential which is holomorphic on $\Sigma \setminus D$, with simple poles at $D$, with
$\mathrm{res}_{Q_i}\, \omega = r_i$ and $\mathrm{Re} \oint_{\alpha_I} \omega = 0 = \mathrm{Re} \oint_{\beta_I} \omega$.

In [39] it was shown how the above differential defines a nice procedure which allows us to look
at $\Sigma$ as a topological covering of a cylinder: one can easily decompose $\Sigma$ into pants along the level
lines of the function $\tau(P) = \mathrm{Re} \int^P \omega$. In this sense, $\omega$ induces on $\Sigma$ the structure of a Mandelstam
diagram. The Mandelstam parameters are the twist-angles $\theta_b$, $b = 1, \ldots, 3h + n - 3$, along the
junctures of the pants decomposition and the relative time coordinates $\tau_a - \tau_0$, $a = 1, \ldots, 2h + n - 3$,
of the $2h + n - 2$ interaction points; $h$ additional real parameters are the internal light-cone momenta,
given by periods of $\omega$. Altogether they form a set of $6h - 6 + 2n$ real parameters. In [39] it was shown that
these parameters represent good coordinates on the moduli space of genus $h$ Riemann surfaces
with $n$ punctures.
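
The bookkeeping of the Mandelstam parameters is easy to verify. The following sketch just adds up the three families of real parameters listed above and compares the total with twice the complex dimension $3h - 3 + n$; the function name is illustrative.

```python
def mandelstam_parameters(h: int, n: int):
    """Return (twist angles, relative interaction times, internal momenta) for genus h, n punctures."""
    twists = 3 * h + n - 3
    times = 2 * h + n - 3
    momenta = h
    if twists < 0 or times < 0:
        raise ValueError("(h, n) outside the range where the counting applies")
    return twists, times, momenta


if __name__ == "__main__":
    for h, n in [(0, 3), (0, 4), (1, 2), (2, 3)]:
        counts = mandelstam_parameters(h, n)
        print((h, n), counts, sum(counts), 2 * (3 * h - 3 + n))
```

For the torus with two punctures, for instance, this gives 2 + 1 + 1 = 4 real parameters, i.e. the two complex moduli mentioned in the discussion of the long string propagator.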

To complete the picture we identify the set $R$ with the $+$ components of the external light-cone
momenta of the diagram, i.e. the periods of $\omega$ around the punctures. We also have relations
involving integer-valued matrices which depend on the pants
decomposition of the Riemann surface and its intersections with the $\alpha$ and $\beta$ cycles.

Our strategy now is the following. We first construct an explicit form for $\omega$, in terms of the
prime form, the $\omega_I$'s and the period matrix of $\Sigma$. Then we compare this $\omega$ with the one that
comes from MST. The relevant new input consists in the fact that MST induces on $\Sigma$ the structure
of a holomorphic covering of the Riemann sphere (as usual we consider the latter instead of the
cylinder). By this we mean that, if $z : \Sigma \to CP^1$ is the covering map in the MST scheme, the
coordinate $z$ is a meromorphic function on $\Sigma$. The role of $\omega$ in MST is played by $d\ln z$, therefore
we have to identify them. This condition becomes a constraint on the data of the Mandelstam
diagram. In fact, it means that $Q_1^{r_1} \cdot \ldots \cdot Q_n^{r_n}$, being the divisor of the meromorphic function
$z$, is a principal divisor on $\Sigma$; in particular $r_i \in Z$. As a consequence, some constraints appear in
the data of the Mandelstam diagram and these conditions induce a complex codimension $h$ slicing
of the moduli space. This can be seen as follows.

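Let $\omega_{P_+P_-}$ be the differential which is holomorphic on $\Sigma \setminus \{P_+, P_-\}$, with simple poles at $P_\pm$ with residues
$\pm 1$ and imaginary periods. It can be written as

$$\omega_{P_+P_-}(P) = d_{(P)} \ln \left[ \frac{E(P, P_+)}{E(P, P_-)}\, e^{2\pi i\, \mathrm{Im} \int_{P_+}^{P_-} \omega \cdot (\mathrm{Im}\, \Omega)^{-1} \cdot \int^{P} \omega} \right] = d_{(P)} \ln F(P, P_+, P_-) , \qquad (62)$$

where $E(P, Q)$ is the prime form on $\Sigma$, $\mathrm{Im}\, \Omega$ is the imaginary part of the period matrix and
$d_{(P)} = dP\, \frac{\partial}{\partial P}$.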

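In terms of the above differentials we can write

$$\omega = \sum_{i=1}^{n-1} k_i\, \omega_{Q_i Q_{i+1}} , \qquad (63)$$

where $k_i - k_{i-1} = r_i$ and $k_0 = 0 = k_n$; substituting (62) into (63) we obtain

$$\omega(P) = d_{(P)} \ln \tilde z(P) ,$$

where

$$\tilde z(P) = \prod_{i=1}^{n-1} F(P, Q_i, Q_{i+1})^{k_i} . \qquad (64)$$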
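
The role of the coefficients $k_i$ can be checked by a short residue computation. The sketch below assumes, as stated in the text, that each elementary differential $\omega_{P_+P_-}$ has residue $+1$ at $P_+$ and $-1$ at $P_-$, and that $\omega$ is the combination $\sum_i k_i\, \omega_{Q_i Q_{i+1}}$.

```latex
\begin{align*}
\mathrm{res}_{Q_j}\,\omega
  &= \sum_{i=1}^{n-1} k_i\,\mathrm{res}_{Q_j}\,\omega_{Q_i Q_{i+1}}
   = k_j\cdot(+1) + k_{j-1}\cdot(-1)
   = k_j - k_{j-1} = r_j ,\\
k_j &= \sum_{i=1}^{j} r_i , \qquad
k_n = \sum_{i=1}^{n} r_i = 0 .
\end{align*}
```

Only the terms $i = j$ and $i = j - 1$ have a pole at $Q_j$, and the conventions $k_0 = 0 = k_n$ take care of the end points $j = 1$ and $j = n$; the last equality is just the condition that the residues sum to zero.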


Now, as anticipated above, we make the identification $\omega = d\ln z$. This requires that $z = \tilde z$ up
to a multiplicative constant, which implies that $\tilde z$ is a well defined meromorphic function on $\Sigma$.
On the one hand this imposes that the residues $r_i$ be quantized in integer values. On the other
hand it requires that the differential $d\tilde z$ have vanishing periods along the $\alpha$ and $\beta$ cycles. The latter
condition is fulfilled iff

$$\sum_{i=1}^{n} r_i \int^{Q_i} \omega_I = m_I + n_J\, \Omega_{JI} \qquad (65)$$

for some $m_I, n_I \in Z$. At this point the situation is clear: (65) is the vanishing condition for the
Abel map and says that the divisor is principal.

Conversely, let $z$ be a meromorphic function on $\Sigma$ and $Q_1^{r_1} \cdot \ldots \cdot Q_n^{r_n}$ its divisor. By definition (65) holds
and $\mathrm{res}_{Q_i}\, d_{(P)} \ln z = r_i$.

Notice that the periods of $\omega$ are quantized in integral values as

$$\oint_{\alpha_I} \omega = 2\pi i\, n_I \quad \text{and} \quad \oint_{\beta_I} \omega = -2\pi i\, m_I , \qquad (66)$$

and this condition is equivalent to (65).

Eq. (66) means that the internal light-cone momenta of the diagram are quantized and that,
in addition, there are $h$ discretizing constraints on the twist-angles of the Mandelstam diagram.
Since these variables, together with the relative interaction times which have been left untouched,
are the coordinates of the moduli space, we are left with a discrete slicing of the moduli space
$\mathcal{M}_{h,n}$, each slice being of complex dimension $2h - 3 + n$. This discretized moduli space is the one
introduced in section 5.4.

One can verify that in all the genus one examples we have considered in section 7, the counting
of independent parameters matches the formula $2h + n - 3$. We believe that, for any topological
type $(h, n)$, one can construct plane curves with $2h + n - 3$ independent parameters.

A confirmation of this result comes from an estimate of the moduli space of stringy instantons.
Since in the Y factor there are no free parameters, the moduli space of stringy instantons must coincide
with the free parameters contained in M, i.e. with the moduli space of plane curves. The estimate
carried out in [21] confirms the above evaluation of the continuous dimension of the moduli space
of the latter.

In the large $N$ limit, however, the quantization condition dissolves into a continuum of values.

^ [z'‘ 0 nz*] = o'* . (67) 

Simultaneously, for large $N$ also the bound $\frac{1}{2}(N - 1)(N - 2)$ on the genus of the plane curves in
MST becomes ineffective. It is therefore sensible to argue that for large $N$ one recovers the full
moduli space of string theory.

8.1 Summary 

The moduli space of Riemannian instantons which appear in MST is only an approximate version
of the moduli space of Riemann surfaces which appear in string interaction theory. In particular, $h$
of the moduli of the former are discretized versions of $h$ continuous moduli of the latter. It is however
reasonable to assume that in the large $N$ limit the two spaces tend to coincide.

9 Comments 

We have seen that in MST Riemann surfaces are generated as classical solutions of the equations
of motion. More precisely they come dressed by a factor that tends to 1 in the strong coupling
limit. Therefore, in this limit, we are left with pure Riemann surfaces with punctures, which can be
thought of as carriers of string interactions. We have seen that this leads to a consistent picture:
the strong coupling limit action of MST is the Green-Schwarz action of IIA superstring theory plus
a free Maxwell action; the latter guarantees that the path integral for a string interaction process
in strong coupling MST has the correct form; when $N \to \infty$, the amplitudes computed in string
interaction theory and in strong coupling MST tend to coincide. It is therefore legitimate to claim
that the strong coupling MST represents type IIA superstring theory. Recently new results have
been found in this field, [45, 22]. In particular in [22] it has been shown that the strong coupling
limit of Heterotic Matrix String Theory (a variant of MST with gauge group $SO(N)$) describes
the heterotic superstring theory.

MST is therefore a remarkable case of a Yang-Mills theory with a definite string interpretation.
It is however clear from section 2 that this is not the only interesting case. Riemannian instantons
exist for example also in a 4d Yang-Mills theory. They lend themselves to a string interpretation
of (some limit of) Yang-Mills theory. In any case they represent a stimulating possibility which
has not been exploited so far.


10 Appendices 


10.1 Appendix A. Analysis of the sinh-Gordon equation 

We discuss here an (approximate) analytic approach to the sinh-Gordon equation (14) with
boundary conditions (13) and vanishing at $z = 0, \infty$.

Recalling that 


dz 


= yflg 




the approximate expression of $\zeta$ in terms of $z$ is


    $\zeta \sim$                 for $z \sim z_0$,

    $\zeta \sim$                 for $|z| \gg |z_0|$,

    $\zeta \sim \sqrt{2g}\, \sqrt{-z_0}\, \ln z$,   for $|z| \ll |z_0|$.

If these were the exact expressions for $\zeta$ we could consider spherically symmetric solutions of (14),
i.e. solutions depending only on $r = |\zeta|$. For them eq. (14) takes the form

$$\partial_r^2 u + \frac{1}{r}\, \partial_r u = 4 \sinh u . \qquad (68)$$

This is a form of the Painlevé III equation. The general form of the solutions of this equation
is known, see [34] and references therein. Let us select the class of solutions with the following
asymptotic behaviour:


$$u(r) \sim \alpha \ln r + \beta , \qquad r \to 0 , \quad |\alpha| < 2 \qquad (69)$$

$$u(r) \sim \gamma\, K_0(2r) , \qquad r \to \infty . \qquad (70)$$

The constants $\beta$ and $\gamma$ must be fine-tuned to $\alpha$ in order to give rise to smooth solutions. However,
in our case, we are not interested in the actual values of $\beta$ and $\gamma$; therefore we can always adjust
the parameters in such a way as to have a smooth solution. As for the bound $|\alpha| < 2$, in our case
u - In r 
o 

therefore the bound is satisfied. 

Let us study now the properties of the solution. In the various regions we have the following
asymptotic expressions:


    $r \sim$                 for $z \approx z_0$,

    $r \sim$                 for $|z| \gg |z_0|$,        (71)

    $r \sim \sqrt{2g}\, \sqrt{|z_0|}\, |\ln z|$,   for $|z| \ll |z_0|$.

Figure 5: Self-energy of a string.

From this we see that when, at fixed finite $g$, $z$ is near the origin or far out towards infinity, the solution
tends to zero. The convergence to zero is more rapid the larger $g$ is, the dependence on $g$ being of negative
exponential type. Looking now at the first expression in (71), we see that, even if we sit near $z_0$, we
may still fall in the regime ($r$ large) in which the solution is extremely small, provided $g$ is large
enough. In other words, for large $g$ the solution shrinks around $z_0$ and, in the $g \to \infty$ limit, it
becomes spike-like with support at $z = z_0$. We can say that, if we exclude a neighborhood of $z_0$
whose size shrinks as $g$ grows, the solution decreases to zero more rapidly than any power of $1/g$.
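
The exponential decay at large $r$ can be understood from the linearized equation. The following is a short sketch, assuming the radial equation $u'' + u'/r = 4\sinh u$ and a small solution, so that $\sinh u$ may be replaced by $u$; the trial function $f$ is the standard leading large-$r$ form of the decaying solution and is introduced here only for illustration.

```latex
\begin{align*}
u'' + \frac{1}{r}\,u' &= 4u \qquad (|u| \ll 1),\\
f(r) = r^{-1/2} e^{-2r}: \qquad
f' &= \Big(-\frac{1}{2r} - 2\Big) f , \qquad
f'' = \Big(\frac{3}{4r^2} + \frac{2}{r} + 4\Big) f ,\\
f'' + \frac{1}{r}\,f' &= \Big(4 + \frac{1}{4r^2}\Big) f .
\end{align*}
```

So $f$ solves the linearized equation up to a relative correction of order $1/r^2$. Since, at fixed $z$ away from the branch point, $r$ grows like $\sqrt{g}$, a decay of the form $e^{-2r}$ is indeed faster than any power of $1/g$.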

We recall that the spherically symmetric solution is not the exact solution, but only an
approximate one. However we expect the general behaviour of the true solution to be essentially similar,
i.e. that it shrinks very rapidly around $z_0$ as $g \to \infty$.

One can easily extend the previous analysis to the case in which $a$ contains several distinct
zeroes. Simply find suitable approximate expressions for $\zeta$ near the zeroes of $a$ and apply the
previous approximate analysis. The conclusion will be that the solution shrinks very quickly
around the branch points as $g \to \infty$.

10.2 Appendix B. Smooth and singular plane curves 

This Appendix is devoted to some explicit examples of smooth and singular plane curves. 

A useful tool in studying plane curves is the Newton polygon. Let us consider the polynomial
P(y, z) in (53). We associate to each monomial $z^{\alpha} y^{\beta}$ in it the point $p = \alpha$, $q = \beta$ in a (p, q) plane.
We obtain a set of points called the carrier; its convex hull is by definition the Newton polygon
associated to the curve. From the Newton polygon one can deduce a lot of information concerning
the curve. For the curves we consider, the Newton polygon always contains the point (p = 0, q = N)
and is contained in the triangle formed by the p- and q-axes and by the line p + q = N.
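The construction of the carrier and its Newton polygon can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each monomial is encoded as a pair (p, q) = (power of z, power of y), the sample carrier is the one of the cubic y^3 + c z y + z(z - a) discussed later in this Appendix, and the function names are hypothetical. The interior lattice points are counted with Pick's theorem.

```python
from math import gcd

def newton_polygon(carrier):
    """Vertices of the convex hull of the carrier (Andrew's monotone chain)."""
    pts = sorted(set(carrier))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def interior_lattice_points(polygon):
    """Number of interior integer points, from Pick's theorem A = I + B/2 - 1."""
    n = len(polygon)
    twice_area = abs(sum(polygon[i][0] * polygon[(i + 1) % n][1]
                         - polygon[(i + 1) % n][0] * polygon[i][1]
                         for i in range(n)))
    boundary = sum(gcd(abs(polygon[(i + 1) % n][0] - polygon[i][0]),
                       abs(polygon[(i + 1) % n][1] - polygon[i][1]))
                   for i in range(n))
    return (twice_area - boundary + 2) // 2

# Assumed sample: monomials y^3, z*y, z, z^2 encoded as (power of z, power of y).
N = 3
carrier = [(0, 3), (1, 1), (1, 0), (2, 0)]
polygon = newton_polygon(carrier)
inside_triangle = all(p >= 0 and q >= 0 and p + q <= N for p, q in carrier)
interior = interior_lattice_points(polygon)
```

For this sample the polygon is the triangle with vertices (0, 3), (1, 0), (2, 0); it contains the point (p = 0, q = N), lies inside the triangle bounded by p + q = N, and has a single interior lattice point.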

We start with the case N = 3, p = 1, for which there is already a good variety of examples.
These have the advantage that one can check the results by explicitly solving the cubic algebraic
equation by means of Cardano's formula. We do not write down the algebraic equations, but simply
the corresponding polygons. In the following figures the Newton polygon is the one delimited by
the dashed lines. The coefficients of the monomials within or on the border of the Newton polygon
are understood to be generic, unless otherwise specified.

The simplest process one can imagine is the string self-energy. This means that we have to 
look for a totally branched curve over z = 0 and z = ∞. Remember that the polynomials giving







Figure 6: Shrinking cycles: totally branched quartics at genus two and one. 


the solutions over these points are given by the points of the carrier on the q = 0 and p + q = N 
lines respectively. So one simple solution is given by the carrier shown in figure 5; the generic case 
will be non-singular also at finite z and so the genus will be one. The presence of the points (1, 0) 
and (2,0) ensures the nonsingularity of 0 and ∞; the local behaviour around them is given by the
upper side of the Newton polygon. 

For other smooth examples, see [21]. Let us consider now examples of singular curves. Given 
an algebraic equation, a singularity may appear for some particular choice of the parameters. In 
this case one has simply to replace the finite hole in these figures by a hole shrunk to a point (for 
example see fig. 6); the curve becomes genus zero, i.e. a sphere with two identified points. This 
singularity is the simplest one; it is characterized by a non-vanishing Hessian and is called a node.
All nodes can be viewed as two points identified: blowing up a node amounts to separating the
points. For instance, consider the previous case (figure 5): the polynomial which corresponds to
the diagram can be written as

$$y^3 + c\,z\,y + z(z - a) . \qquad (72)$$

Imposing that a point be singular, one finds that a necessary (but not sufficient) condition is
that its discriminant, $\delta = z^2[27(z - a)^2 + 4c^3 z]$ (up to an overall sign), have a multiple root. The double root at z = 0
just signals that this point is another triple branch point, as we already know; imposing that the
remaining factor be a square, one finds several values of c, of which for instance c = 0 gives a triple
branch at z = a and no singular point, while a nonzero value with $c^3 = 27a$ gives instead a node.
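The node condition can be checked by hand or with a short script. The sketch below assumes that (72) reads y^3 + c z y + z(z - a) and that the nonzero solution of the square condition is c^3 = 27a; both are reconstructions rather than statements taken verbatim from the text. It tests the sample values a = 1, c = 3, for which the candidate singular point is (y, z) = (1, -1). The function names are hypothetical.

```python
def P(y, z, a, c):
    """Assumed reading of the polynomial (72)."""
    return y**3 + c * z * y + z * (z - a)

def P_y(y, z, a, c):
    """Partial derivative of P with respect to y."""
    return 3 * y**2 + c * z

def P_z(y, z, a, c):
    """Partial derivative of P with respect to z."""
    return c * y + 2 * z - a

def hessian_det(y, z, a, c):
    """P_yy * P_zz - P_yz^2, with P_yy = 6y, P_zz = 2, P_yz = c."""
    return 6 * y * 2 - c**2

def is_singular(y, z, a, c):
    """A point of the curve is singular when P, P_y and P_z all vanish there."""
    return P(y, z, a, c) == 0 and P_y(y, z, a, c) == 0 and P_z(y, z, a, c) == 0

# Sample with c^3 = 27 a: a = 1, c = 3, candidate point (y, z) = (1, -1).
node_case = is_singular(1, -1, a=1, c=3)
node_hessian = hessian_det(1, -1, a=1, c=3)

# Sample with c = 0: the branch point (y, z) = (0, a) lies on the curve
# but is not singular, because P_z = a there.
branch_case = is_singular(0, 1, a=1, c=0)
```

With these values the point (1, -1) is singular and its Hessian determinant is nonzero, which is the defining property of a node, while for c = 0 the point over z = a is a regular point of the curve.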

This introduces us to our next task: to show how it is possible to describe low-genus highly 
branched curves. We will describe in detail the self-energy case. We take N = 4; since we want 
total branching, we can choose a diagram like that in figure 6. The corresponding polynomial 
has the coefficients corresponding to the vertices of the polygon, and can also have coefficients 
corresponding to the points on the sides or in the interior (by the way the latter are always 
$(N - 1)(N - 2)/2$ in number, if there is no singular point at 0 and at ∞, and count the genus of
the corresponding smooth curve). Now we can look for singular cases in this family along the lines
of the previous example; since already in this case computations become complicated, we restrict
ourselves to the biquadratic case. In other words the polynomial we start with is

$$P = y^4 + b\,z\,y^2 + z(d + ez + fz^2) ; \qquad (73)$$

its discriminant is

$$\delta = 16 z^3 (d + ez + fz^2)(4d + 4ez + 4fz^2 - b^2 z)^2 . \qquad (74)$$

As before, the term $z^3$ shows that the branching at z = 0 is of order three, i.e. four sheets meet
there. The other two terms mean the following. Solutions of a biquadratic equation are in general
$\pm y_1$, $\pm y_2$. Its discriminant can vanish in two cases: if $y_1 = y_2$ or $y_1 = -y_2$ (this is determined by
the third term in (74)), in which case, at the corresponding value of z, there is a couple of double
branch points; if $y_1 = 0$ or $y_2 = 0$, which is determined by the second term, there is a single node.






If we choose the coefficients so that the third term is a fourth power, we have two nodes, and so 
genus one; if, instead, the coefficients are chosen so that the second term is a square, we have a 
single node, and so genus two. The situation is shown in figure 6. 
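The factorized discriminant (74) can be cross-checked numerically. The following sketch, with hypothetical function names and arbitrarily chosen sample coefficients, computes the four roots of the biquadratic y^4 + B y^2 + C with B = bz and C = z(d + ez + fz^2), forms the product of squared root differences, and compares it with the closed form.

```python
import cmath
from itertools import combinations

def biquadratic_roots(B, C):
    """Roots +-y1, +-y2 of y^4 + B y^2 + C, via w = y^2 solving w^2 + B w + C = 0."""
    s = cmath.sqrt(B * B - 4 * C)
    w1, w2 = (-B + s) / 2, (-B - s) / 2
    y1, y2 = cmath.sqrt(w1), cmath.sqrt(w2)
    return [y1, -y1, y2, -y2]

def discriminant_from_roots(roots):
    """Product of squared differences over all pairs of roots (monic polynomial)."""
    prod = 1
    for r, s in combinations(roots, 2):
        prod *= (r - s) ** 2
    return prod

def discriminant_formula(z, b, d, e, f):
    """Closed form (74): 16 z^3 (d + e z + f z^2)(4d + 4ez + 4fz^2 - b^2 z)^2."""
    q = d + e * z + f * z * z
    return 16 * z**3 * q * (4 * q - b * b * z) ** 2

# Arbitrary sample values, chosen only for illustration.
z, b, d, e, f = 0.5, 1.0, 2.0, 3.0, 1.0
roots = biquadratic_roots(b * z, z * (d + e * z + f * z * z))
numeric = discriminant_from_roots(roots)
closed = discriminant_formula(z, b, d, e, f)
```

The two values agree to rounding error, which reflects the identity 16 C (B^2 - 4C)^2 for the discriminant of a biquadratic: the factor C vanishes when y_1 or y_2 is zero, the squared factor when y_1 = ±y_2.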

Acknowledgments 

This review is based on joint work done with G.Bonelli, F.Nesti and A.Tomasiello, whom 
I would like to thank for their collaboration. This work was partially supported by EC TMR 
Programme, grant FMRX-CT96-0012, and by the Italian MURST for the program “Fisica Teorica 
delle Interazioni Fondamentali” . 


References 

[1] G. ’t Hooft, A Planar Diagram Theory for Strong Interactions, Nucl.Phys. B72 (1974) 461; 
A Two-Dimensional Model for Mesons, Nucl.Phys. B75 (1974) 461. 

[2] see for example: D.J.Gross and W.Taylor, Two-dimensional QCD and strings, [HEP- 
TH9311072]. 

[3] see for example: E.Brezin and S.Wadia The large N expansion in quantum field theory and 
statistical physics: from spin systems to 2-dimensional gravity. World Scientific, Singapore, 
1993. 

[4] J.Maldacena, The large N limit of superconformal field theories and supergravity, 
Adv.Theor.Math.Phys. 2 (1998) 231, [HEP-TH9711200]. 

[5] A.Polyakov, The Wall of the Cave, [HEP-TH9809057] 

[6] I.R.Klebanov and A.A.Tseytlin, D-branes and dual Gauge Theories in Type 0 Strings, [HEP- 
TH9811035]. 

[7] M.B. Green, J.H. Schwarz, E. Witten, Superstring Theory, Cambridge Univ. Press, Cambridge 
1987. 

[8] J. Polchinski, String Theory, Cambridge Univ. Press, Cambridge 1998. 

[9] T. Banks, W. Fischler, S.H. Shenker and L. Susskind, M Theory As A Matrix Model: A 
Conjecture, Phys.Rev.D 55 (1997) 5112 [HEP-TH9610043]. 

[10] L. Motl, Proposals on Nonperturbative Superstring Interactions, [HEP-TH9701025].

[11] T. Banks and N. Seiberg, Strings from Matrices, Nucl.Phys. B 497 (1997) 41 [HEP- 
TH9702187]. 

[12] W. Taylor, D-brane Field Theory on Compact Spaces, Phys.Lett. B394 (1997) 283 [HEP- 
TH9611042]. 

[13] R. Dijkgraaf, E. Verlinde, H. Verlinde, Matrix String Theory, Nucl.Phys. B500 (1997) 43 
[HEP-TH9703030]. 

[14] R. Dijkgraaf, G. Moore, E. Verlinde, H. Verlinde, Elliptic Genera of Symmetric Products and 
Second Quantized Strings, Comm. Math.Phys. 185 (1997) 197 [HEP-TH9608096]. 

[15] H. Verlinde, A Matrix String Interpretation of the Large N Loop Equation, [HEP-TH9705029]. 

[16] L. Bonora, C.S. Chu, On the String Interpretation of M(atrix) Theory, Phys.Lett. B410 (1997) 
142 [HEP-TH9705137]. 





[17] T. Wynter, Gauge fields and interactions in matrix string theory Phys.Lett. B415 (1997) 349 
[HEP-TH9709029]. 

[18] S.B. Giddings, F. Hacquebord, H. Verlinde, High Energy Scattering and D-pair Creation in
Matrix String Theory, Nucl.Phys. B537 (1999) 260 [HEP-TH9804121].

[19] G. Bonelli, L. Bonora and F. Nesti, Matrix string theory, 2D instantons and affine Toda field
theory, Phys.Lett. B435 (1998) 303 [HEP-TH9805071].

[20] G. Bonelli, L. Bonora and F. Nesti, String Interactions from Matrix String Theory, Nucl.Phys.
B538 (1999) 100 [HEP-TH9807232].

[21] G. Bonelli, L. Bonora, F. Nesti and A.Tomasiello, Matrix String Theory and its Moduli space 
[HEP-TH9901093], to be published in Nucl.Phys.B. 

[22] G. Bonelli, L. Bonora, F. Nesti and A.Tomasiello, Heterotic Matrix String Theory and Rie- 
mann Surfaces [HEP-TH9905092] 

[23] L. Susskind, Another Conjecture about M(atrix) Theory, [HEP-TH9704080]. 

[24] A. Bilal, M(atrix) theory: a pedagogical introduction, [HEP-TH9710136].

[25] T. Banks , Matrix Theory, [HEP-TH9710231]. 

[26] D. Bigatti and L. Susskind, Review of Matrix Theory, [HEP-TH9712072]. 

[27] Washington Taylor IV, Lectures on D-branes, Gauge Theory and M(atrices), [HEP- 
TH9801182]. 

[28] N.J. Hitchin, The self-duality equations on a Riemann surface, Proc. London Math. Soc. 55 
(1987) 59; Lie groups and Teichmuller space, Topology 31 (1992) 449. 

[29] N. Hitchin, Stable bundles and integrable systems, Duke Math. Jour. 54 (1987) 91.

[30] E. Markman, Spectral curves and integrable systems. Comp. Math. 94 (1994) 255. 

[31] C.T. Simpson, Harmonic bundles on noncompact curves, Jour.Am.Math.Soc. 3 (1990) 713. 

[32] R. Donagi and E. Witten, Supersymmetric Yang-Mills theory and integrable systems,
Nucl.Phys. B460 (1996) 299 [HEP-TH9510101]. 

[33] M. Bochicchio, The large-N limit of QCD and the collective field of the Hitchin fibration, JHEP
01 (1999) 006 [HEP-TH9810015]. 

[34] A.R. Its and V.Yu. Novokshenov, The isomonodromic deformation method in the theory of 
Painleve equations, Lect. Notes Math. 1191, Springer-Verlag 1986. 

[35] E. Gava, J.F. Morales, K.S. Narain, G. Thompson, Bound States of Type I D-Strings [HEP- 
TH9801128] 

[36] E. Witten, On S-duality in abelian gauge theory [HEP-TH9505186].

[37] L.L. Ahlfors, Open Riemann surfaces and extremal problems on compact subregions, Comm.
Math. Helv. 24 (1950) 100.

J.D. Fay, Theta functions on Riemann surfaces, Lect.Not.Math. Vol.352, Springer-Verlag,
Berlin 1973.

[38] J.-L. Gervais and B. Sakita, Extended particles in quantum field theories,
Phys.Rev. D11 (1975) 2943.





[39] S.B. Giddings and S.A. Wolpert, A triangulation of moduli space from light cone string theory 
Comm.Math.Phys.109 (1987) 177. 

[40] S. Mandelstam, Dual resonance models, Phys.Rep. 13 (1974) 259. 

[41] E. Brieskorn and H. Knorrer, Plane Algebraic Curves, Birkhauser Verlag, Basel 1986. 

[42] P. Griffiths and J. Harris, Principles of Algebraic Geometry New York 1978. 

[43] F. Kirwan, Complex algebraic curves Cambridge 1992. 

[44] I.M.Gelfand, M.M. Kapranov, A.V.Zelevinsky, Discriminants, resultants, and multidimen- 
sional determinants, Boston, Birkhauser, 1994. 

[45] T.Wynter, High energy scattering amplitudes in matrix string theory, [HEP-TH9905087]. 




Part E : QFT In 2 + 1 Dimensions 

26. Fractional Statistics And Chern-Simons Field Theory In 2 + 1 Dimensions 
by Avinash Khare 

27. Chern Simons Field And Composite Bosons In The Quantum Hall System 
by R.Rajaraman 




26. Fractional Statistics and Chern-Simons Field Theory in 2 + 1 Dimensions


Avinash Khare 

Institute of Physics, Sachivalaya Marg, Bhubaneswar 751005, India 


Abstract 

The question of anyons and fractional statistics in field theories in 2+1 dimensions with
a Chern-Simons (CS) term is discussed in some detail. Arguments are spelled out as to why
fractional statistics is only possible in two space dimensions. This phenomenon is most naturally
discussed within the framework of field theories with a CS term; hence, as a prelude to
this discussion, I first discuss the various properties of the CS term. In particular its role as a
gauge field mass term is emphasized. In the presence of the CS term, anyons can appear in
two different ways, i.e. either as solitons of the corresponding field theory or as fundamental
quanta carrying fractional statistics, and both approaches are elaborated in some detail.


1 Introduction 

Many of us have wondered at some time or other whether one can have nontrivial science and technology
in two space dimensions; the usual feeling is that two space dimensions do not offer enough
scope for it. This question, to the best of my knowledge, was first addressed in 1884 by E.A. Abbott
in his satirical novel Flatland [1]. The first serious book on this topic, entitled
An Episode of Flatland [2], appeared in 1907. In this book C.H. Hinton offered glimpses of the possible science and
technology in flatland. A nice summary of these two books appeared as a chapter entitled
Flatland in a book in 1969 edited by Martin Gardner [3]. Inspired by this summary, in 1979
A.K. Dewdney [4] published a book which contains several laws of physics, chemistry, astronomy
and biology in flatland. However, all these people missed one important case where physical
laws are much more complex, nontrivial and hence interesting in flatland than in our three
dimensional world. I am referring here to the case of quantum statistics. In the last two decades it
has been realized that whereas in three and higher space dimensions all particles must either be
bosons or fermions (i.e. they must have spin $n\hbar$ or $(2n+1)\hbar/2$ with n = 0, 1, 2, ... and must obey
Bose-Einstein or Fermi-Dirac statistics respectively), in two space dimensions the particles can
have any fractional spin and can satisfy any fractional statistics which interpolates between
the two. The particles obeying such statistics are generically called anyons [5]. In other words,
if one takes one anyon slowly around the other then in general the phase acquired is $\exp(\pm i\theta)$. If
$\theta = 0$ or $\pi$ (modulo $2\pi$) then the particles are bosons or fermions respectively, while if $0 < \theta < \pi$
then the particles are termed anyons.

From our experience with fermions and bosons it is well known that the question of spin and
statistics can be properly handled only within the formalism of relativistic quantum field theory. 
Thus it is of interest to enquire if one can also understand the ideas of anyons and fractional 
statistics within the formalism of relativistic quantum field theory. This is the issue that we would 
like to discuss in this article. 

Before I go into the details, one might wonder whether our discussion is merely of academic interest.
The answer to the question is no. In fact it is a surprising fact that two, one and even zero
dimensional experimental physics is possible in our three-dimensional world. Because of
the third law of thermodynamics, which states that all the degrees of freedom freeze out in the
limit of zero temperature, it is possible to strictly confine the electrons to surfaces, or even to


Email:khare@iopb.res .in 



Fractional Statistics and CS Field 




lines or points. Thus it may happen that in a strongly confining potential, or at sufficiently low 
temperatures, the excitation energy in one or more directions may be much higher than the average 
thermal energy of the particles, so that those dimensions are effectively frozen out. Of course, even
then, at the basic level, the fundamental particles are certainly fermions or bosons. However, 
the most direct and appropriate discussion of the low energy behavior of a material is usually in 
terms of the quasi-particles. The hope is that at least in some of these cases the quasi-particles 
could be anyons. This hope has in fact been realized in the case of the fractionally quantized 
Hall effect where the quasi-particles are believed to be charged vortices i.e. charged anyons [6]. 
Recent experiments [7] seem to confirm the existence of fractionally charged excitations and hence 
indirectly of anyons. 

The plan of the article is the following. In Sec.II, I first spell out as to why fractional statistics 
is only possible in two space dimensions. It turns out that the phenomenon of fractional statistics 
is most naturally discussed within the framework of field theories with CS term. As a prelude to 
this discussion, in Sec.III, I discuss the various properties of the CS term. In particular its role
as a gauge field mass term and its behavior under the discrete transformations of parity (P) and
time-reversal (T) are emphasized. In the presence of the CS term, anyons can appear in two different
ways (i.e. either as solitons of the corresponding field theory or as fundamental quanta carrying
fractional statistics) and both approaches are elaborated in some detail in the next three sections.
The charged vortex solutions in Abelian Higgs model with CS term are obtained in Sec.IV, and it is 
pointed out that these charged vortices represent the first relativistic model for (extended) charged 
anyons. I also construct the charged vortex solutions in pure CS theory in both the relativistic 
and the non-relativistic settings. In Sec.V, I discuss an example of neutral relativistic anyons by
considering the soliton solutions in the $CP^1$ model with the Hopf term, which is one of the avatars
of the CS term. Finally, in Sec.VI, I elaborate upon the other approach, in which the fundamental
fields of theories with a CS term themselves carry fractional spin and obey fractional statistics.

2 Why Anyons in Only Two Dimensions? 

Before we come to the question of fractional statistics, it might be worthwhile to understand
why, unlike in three and higher space dimensions, the eigenvalue of the spin angular momentum
operator can take any fractional value in units of $\hbar$. The point is that the spin in two dimensions
differs fundamentally from the spin in higher dimensions. This is because, whereas in three and
higher space dimensions the spin angular momentum algebra is non-commutative, i.e.

$$[S_i, S_j] = i\,\epsilon_{ijk} S_k , \qquad i, j, k = 1, 2, 3 \qquad (1)$$

in two space dimensions it is a trivial commutative algebra, since only one generator (say $S_3$) is
available, which obviously commutes with itself. As a result, there is no analogue of the quantization
of the angular momentum, which arises in three and higher space dimensions from the nonlinear
commutation relation (1). Here $\epsilon_{ijk}$ is the completely antisymmetric tensor.
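As a concrete illustration of relation (1), one can take the spin-1/2 realization S_i = σ_i/2 in units of ħ and verify the algebra numerically. This is a minimal sketch with hypothetical helper names; the choice of the Pauli-matrix realization is an assumption made only for the example.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def levi_civita(i, j, k):
    """Completely antisymmetric symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# Pauli matrices; S_i = sigma_i / 2 in units of hbar (assumed realization).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
S = [[[x / 2 for x in row] for row in s] for s in SIGMA]

def algebra_holds():
    """Check [S_i, S_j] = i eps_ijk S_k for all pairs (i, j)."""
    for i in range(3):
        for j in range(3):
            rhs = [[sum(1j * levi_civita(i, j, k) * S[k][a][b] for k in range(3))
                    for b in range(2)] for a in range(2)]
            if not close(commutator(S[i], S[j]), rhs):
                return False
    return True

three_dim_ok = algebra_holds()
# In two space dimensions only one generator survives, and it commutes with itself.
two_dim_trivial = close(commutator(S[2], S[2]), [[0, 0], [0, 0]])
```

The nonlinear relation among three generators is what forces quantized eigenvalues; with the single generator left in two dimensions the commutator vanishes identically and no such constraint arises.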

Now, in relativistic quantum field theory, there is a deep and profound connection between
the spin and the statistics, i.e. particles with half integer spin are fermions, satisfying Fermi-Dirac
statistics, while those with integer spin are bosons, satisfying Bose-Einstein statistics. This immediately
suggests that in two dimensions the particles may exhibit fractional (i.e. any) statistics. In
a remarkable paper Leinaas and Myrheim [8] showed that this is indeed so. Before we come to a
proper discussion about the statistics, it is worth clarifying what exactly one means by quantum
statistics. In most text books on statistical mechanics, the term "quantum statistics" refers to
the phase picked up by a wave function when two identical particles are interchanged, i.e. under the
permutation of the particles. But this is slightly misleading and has been correctly criticized in the
literature [9]. If the particles are strictly identical, the word permutation has no physical meaning,
since a given configuration and the one obtained by the permutation of the particle coordinates
are merely two different ways of describing the same particle configuration. The term quantum
statistics actually refers to the phase that arises when two particles are adiabatically transported
giving rise to the exchange. In this article, we shall be concentrating on this definition of quantum





statistics. It is a coincidence that in three and higher dimensions, the two definitions, based on 
the permutation and the adiabatic exchange of two particles, coincide, but in two dimensions the 
two definitions give very different answers. 

The key reason for the fractional statistics in two dimensions is the principle of indistinguisha- 
bility of identical particles. It is one of the most important characteristics of quantum mechanics 
(vis a vis classical mechanics) and it has profound physical consequences. The principle is in fact 
older than quantum mechanics. It was introduced by John Willard Gibbs even in classical statis- 
tical mechanics to resolve the famous Gibbs paradox. Even though this principle has been with 
us for a very long time, unfortunately, its full significance was not appreciated till 1977 and that 
is how one missed the possibility of fractional statistics in two dimensions for all these years. 

Following Leinaas and Myrheim [8], let us enquire about the configuration space of a system
of identical particles. Normally one considers the full phase space in statistical mechanics, but it
turns out that configuration space is enough for this discussion. Suppose the one particle space is X.
Then what is the configuration space of N identical particles? The naive answer is $X^N$, which,
even though true locally, is not correct globally. Why? The reason is that, since the particles are strictly
identical, there is no distinction between the points in $X^N$ that differ only in the ordering
of the particle coordinates. For example, consider the point

$$x = (x_1, x_2, \ldots, x_N) \qquad (2)$$

in $X^N$, where $x_i \in X$ for i = 1, 2, ..., N. Now consider another point x' in $X^N$ which is obtained
from x by the permutation p of the particle indices, i.e.

$$x' = p(x) = (x_{p^{-1}(1)}, \ldots, x_{p^{-1}(N)}) . \qquad (3)$$

Clearly, both describe the same physical configuration of the system. Thus the true configuration space of
the N-particle system is not $X^N$ but the space $X^N/S_N$, which is obtained by identifying points
in $X^N$ that represent the same physical configuration, i.e. it is obtained from $X^N$ by dividing out
by the action of the symmetric group $S_N$. Note that $S_N$ is a discrete, finite group consisting of the
permutations of N identical particles. As a result, the space $X^N/S_N$ is locally isomorphic to $X^N$
except at its singular points. However, the global properties of the two spaces are very different.
Whereas $X^N$ has only regular points when X is regular, those points in $X^N/S_N$ which correspond
to a coincidence of the positions of two or more particles are in fact singular points of $X^N/S_N$. Thus,
to calculate the configuration space of identical particles, such singular points must be excluded,
say by a hard-core constraint, so that we can determine whether two particles have been exchanged or not.
This of course does not make much difference classically. However, in the quantum case the global
properties of the configuration space are of deep significance and this results in the possibility of
fractional statistics. It is worth emphasizing that this is the crux of the whole matter and it is this
fact which was missed for about fifty years!
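The identification of permuted points can be made concrete with a small sketch. Here a configuration is a tuple of particle positions in the plane, a point of the quotient space is represented by the sorted tuple (one arbitrary but convenient choice of representative), and a configuration is flagged as singular when two positions coincide. All names are hypothetical and serve only to illustrate eqs. (2) and (3).

```python
def permute(config, p):
    """Apply eq. (3): the i-th entry of the result is config[p^{-1}(i)]."""
    n = len(config)
    inverse = [0] * n
    for i, image in enumerate(p):
        inverse[image] = i
    return tuple(config[inverse[i]] for i in range(n))

def canonical(config):
    """Representative of the class of config in X^N / S_N (sorted positions)."""
    return tuple(sorted(config))

def same_physical_configuration(x, y):
    return canonical(x) == canonical(y)

def is_singular(config):
    """True when two or more particle positions coincide."""
    return len(set(config)) < len(config)

# Three particles in the plane and a cyclic permutation 0 -> 1 -> 2 -> 0.
x = ((0.0, 0.0), (1.0, 0.0), (0.0, 2.0))
x_prime = permute(x, [1, 2, 0])

distinct_in_XN = x_prime != x
identified_in_quotient = same_physical_configuration(x, x_prime)
coincident = is_singular(((0.0, 0.0), (0.0, 0.0), (1.0, 1.0)))
```

The two tuples are different points of X^N but map to the same point of the quotient, and the configuration with two coincident positions is one of the singular points that must be excluded.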

It turns out that the removal of such singular points in two space dimensions makes the space
multiply connected, while for three and higher space dimensions it is still doubly connected. That
is why, in two dimensions, it is possible to define paths that wind around the origin an arbitrary
number of times, counted with orientation. As a consequence, when one quantizes a system of
identical particles, one can show that in two dimensions it is possible to consistently assign
any value to the phase arising due to the exchange of two identical particles. Since in two dimensions
one can distinguish the clockwise winding from the anti-clockwise winding, without any loss
of generality one can assign the phases $e^{i\theta}$ and $e^{-i\theta}$, respectively, in the case of the anti-clockwise
and the clockwise windings.

At this point, it may be worthwhile to mention a few key properties of anyons.

1. Anyons must necessarily violate the discrete symmetries of parity (P) and time reversal (T)
if $0 < \theta < \pi$, since the clockwise and the anti-clockwise windings have different phase factors.

2. Anyons are in a sense in between the bosons and the fermions, i.e. the repulsion between two
anyons in the ground state monotonically increases as $\theta$ goes from 0 to $\pi$, with there being
no repulsion between two bosons. Thus, in a sense, anyons are closer to the fermions than
to the bosons, since all of them will satisfy a generalized form of the Pauli exclusion principle.





3. It turns out that whereas the permutation group is at the heart of the Bose-Einstein and
the Fermi-Dirac statistics, it is the braid group which is at the heart of the fractional statistics.
In particular, whereas there are two one dimensional representations of the permutation
group (the identical one and the alternating one, corresponding to the Bose-Einstein and
Fermi-Dirac statistics respectively), the braid group admits a continuous one parameter family
of one dimensional representations, whose parameter one usually identifies with the parameter $\theta$ which
characterizes fractional statistics.

4. Is there a relation between the anyonic statistics and the parastatistics? The answer is no.
They are built on two different structures, i.e. parastatistics corresponds to the
higher dimensional representations of the permutation group, while anyons correspond to the
one dimensional representations of the braid group.
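The difference between the two groups in property 3 can be illustrated with the one dimensional representations themselves. In the sketch below every elementary anti-clockwise exchange is sent to the phase exp(iθ) and every clockwise one to exp(-iθ); since phases commute, the braid relations hold automatically, but the representation descends to the permutation group only if a double exchange is trivial. The function names and the tolerance are illustrative assumptions.

```python
import cmath
import math

def exchange_phase(theta, windings=1):
    """Phase for `windings` anti-clockwise exchanges (negative = clockwise)."""
    return cmath.exp(1j * theta * windings)

def factors_through_permutation_group(theta, tol=1e-12):
    """True when two successive exchanges give the identity, as S_N requires."""
    return abs(exchange_phase(theta, 2) - 1) < tol

def classify(theta, tol=1e-12):
    phase = exchange_phase(theta)
    if abs(phase - 1) < tol:
        return "boson"
    if abs(phase + 1) < tol:
        return "fermion"
    return "anyon"

labels = [classify(t) for t in (0.0, math.pi, math.pi / 3, 2 * math.pi)]
descends = [factors_through_permutation_group(t) for t in (0.0, math.pi, math.pi / 2)]
# Clockwise and anti-clockwise phases differ for an anyon, which is the origin
# of the P and T violation mentioned in property 1.
handed = exchange_phase(math.pi / 3, 1) != exchange_phase(math.pi / 3, -1)
```

Only θ = 0 and θ = π (modulo 2π) survive the extra relation of the permutation group, reproducing its two one dimensional representations; every other θ is a genuinely new representation of the braid group.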

2.1 Quantum Statistics in One Dimension 

Since we have been talking about the possible quantum statistics in various dimensions, it
may be worthwhile to also talk about the various possibilities in one dimension. Recall that the
notion of the spin does not exist in one dimension since there is no axis to rotate about in that 
case. Similarly the concept of the quantum statistics is not uniquely defined in one dimension since 
the position of two particles cannot be interchanged without their passing through one another. 
As a result, the intrinsic statistics is inextricably mixed up with the local interactions. In fact 
this ambiguity is at the heart of the bosonization technique which allows the same particle to be 
represented alternatively by a boson or a fermion field. If, however, statistics is defined in terms of 
the exclusion principle rather than the exchange of identical particles, then it is possible to define 
quantum statistics in even one dimension [10]. 

3 Introduction to Chern-Simons Term 

We now want to understand how anyons occur in field theory. It turns out that this is possible
provided the CS term or its incarnation, the Hopf term, is present. It may therefore be worthwhile
to first introduce the CS term (in 2+1 dimensions) and discuss its various properties [11].

3.1 What is Chern-Simons Term? 

Consider the Lagrangian density for classical electrodynamics in 3+1 dimensions as given by 

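$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\gamma^\mu D_\mu - m)\,\psi \qquad (4)$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and $D_\mu = \partial_\mu - ieA_\mu$ is the covariant derivative. This Lagrangian is
invariant under the local gauge transformation

$$\psi(x) \to e^{ie\alpha(x)}\,\psi(x) , \qquad A_\mu(x) \to A_\mu(x) + \partial_\mu\alpha(x) . \qquad (5)$$

Similarly, for massless fermions (m = 0), this Lagrangian is also invariant under the (global) chiral
transformation

$$\psi(x) \to e^{i\beta\gamma_5}\,\psi(x) , \qquad A_\mu(x) \to A_\mu(x) . \qquad (6)$$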

The naive expectation was that these two symmetries, i.e. the gauge and the chiral symmetries,
which are valid at the classical level, would continue to hold good even in the quantum theory.
As a consequence, one expected that the vector and the axial vector currents $j^\mu = \bar\psi\gamma^\mu\psi$ and
$j^\mu_5 = \bar\psi\gamma^\mu\gamma_5\psi$, which are conserved at the classical level, would continue to remain conserved even in
the quantum theory. It has, however, been shown that this is not so. There is no regularization
which can simultaneously preserve both these symmetries at the quantum level. Because the
result was unexpected, it was called an anomaly at that time (and unfortunately even today it is called
so), even though the correct name should have been quantum mechanical symmetry breaking.





Remarkably, the entire effect comes only from one loop diagram and two and higher loops do not 
contribute to the anomaly. In view of our strong faith in the gauge symmetry, one therefore says 
that it is the chiral symmetry which is broken by the one loop quantum corrections. In particular, 
there is a gauge singlet (axial) anomaly in any even dimension 2n, so that the divergence of
the gauge singlet axial current, even for massless fermions, is non-zero and proportional to the
corresponding Chern-Pontryagin (CP) density $P_{2n}$ in that (even) dimension 2n, i.e.

$$\partial_\mu j^\mu_5(x) \propto P_{2n} . \qquad (7)$$

It is also well known that the CP density can always be written as a total divergence

$$P_{2n} = \partial_\mu \Omega^\mu_{2n} , \qquad \mu = 0, 1, 2, \ldots, 2n-1 . \qquad (8)$$

The object $\Omega^\mu_{2n}$, for a particular value of $\mu$ (say $\mu = 2n-1$), naturally lives in odd (2n - 1) dimensions
and is known as the CS density in that dimension. Thus, whereas the CP density lives in even
space-time dimensions, the CS density lives in odd space-time dimensions. For example, the gauge
singlet anomaly in 3+1 dimensional quantum electrodynamics is given, up to a numerical coefficient, by

$$\partial_\mu j^\mu_5 \propto \epsilon^{\mu\nu\lambda\sigma} F_{\mu\nu} F_{\lambda\sigma} \qquad (9)$$

so that the Abelian CS term in 2+1 dimensions is given by

$$I_{CS} = \int \mathcal{L}_{CS}\, d^3x \propto \int d^3x\, \epsilon_{\nu\lambda\sigma} A^\nu F^{\lambda\sigma} . \qquad (10)$$

Throughout this article we shall mainly be concerned with this CS term or its non-Abelian
generalization. Let us therefore discuss in some detail the various properties of this term.

3.2 Gauge Invariant Mass Term 

Let us consider pure electrodynamics in the presence of the Chern-Simons term in 2+1 dimensions 
[12, 13] 

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \frac{\mu}{4}\,\epsilon^{\mu\nu\alpha} F_{\mu\nu} A_\alpha . \qquad (11)$$

Since the mass dimension of $A_\mu$ is 1/2, it follows that the parameter $\mu$ has the dimension of
mass. The field equation following from this Lagrangian can be written as

$$\left(g^{\mu\nu} + \frac{1}{\mu}\,\epsilon^{\mu\nu\alpha}\partial_\alpha\right) {}^*F_\nu = 0 \qquad (12)$$

where ${}^*F_\mu$ is the dual field strength, which is a vector in 2+1 dimensions, i.e.

$${}^*F^\mu = \frac{1}{2}\,\epsilon^{\mu\alpha\beta} F_{\alpha\beta} . \qquad (13)$$

We thus find that, unlike the CP term, which has only a nontrivial topology but no dynamics (being
a total divergence), the CS term has nontrivial topology as well as dynamics in it. On operating
with $\left(g_{\beta\mu} - \frac{1}{\mu}\,\epsilon_{\beta\mu\delta}\partial^\delta\right)$ on Eq. (12), we get

$$(\partial_0^2 - \nabla^2 + \mu^2)\,{}^*F_\beta = 0 \qquad (14)$$

which clearly shows that the gauge field excitations are massive, with the gauge field mass $\mu$ being
the coefficient of the CS term. We have thus shown that the CS term, when added to the Maxwell
term, acts as a gauge invariant gauge field mass term. It is worth adding that this remarkable
property of having a gauge invariant mass term for the gauge field in the action itself is very special
to 2+1 dimensions.
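The step from the first order field equation to the massive wave equation can be spelled out as follows. This is a sketch under stated assumptions: metric $g^{\mu\nu} = \mathrm{diag}(1,-1,-1)$, $\epsilon^{012} = 1$, and the field equation written as $\left(g^{\mu\nu} + \frac{1}{\mu}\epsilon^{\mu\nu\alpha}\partial_\alpha\right){}^*F_\nu = 0$ with ${}^*F^\mu = \frac{1}{2}\epsilon^{\mu\alpha\beta}F_{\alpha\beta}$; the index labels are a choice made here.

```latex
\begin{align*}
0 &= \Big(g_{\beta\mu} - \tfrac{1}{\mu}\,\epsilon_{\beta\mu\delta}\,\partial^{\delta}\Big)
     \Big(g^{\mu\nu} + \tfrac{1}{\mu}\,\epsilon^{\mu\nu\alpha}\,\partial_{\alpha}\Big)\,{}^{*}F_{\nu} \\
  &= {}^{*}F_{\beta}
     + \tfrac{1}{\mu}\,\epsilon_{\beta}{}^{\nu\alpha}\,\partial_{\alpha}\,{}^{*}F_{\nu}
     - \tfrac{1}{\mu}\,\epsilon_{\beta}{}^{\nu\alpha}\,\partial_{\alpha}\,{}^{*}F_{\nu}
     - \tfrac{1}{\mu^{2}}\,\epsilon_{\beta\mu\delta}\,\epsilon^{\mu\nu\alpha}\,
       \partial^{\delta}\partial_{\alpha}\,{}^{*}F_{\nu} \\
  &= {}^{*}F_{\beta}
     + \tfrac{1}{\mu^{2}}\big(\delta^{\nu}_{\beta}\,\delta^{\alpha}_{\delta}
       - \delta^{\alpha}_{\beta}\,\delta^{\nu}_{\delta}\big)\,
       \partial^{\delta}\partial_{\alpha}\,{}^{*}F_{\nu} \\
  &= {}^{*}F_{\beta} + \tfrac{1}{\mu^{2}}\,\Box\,{}^{*}F_{\beta}
     - \tfrac{1}{\mu^{2}}\,\partial_{\beta}\big(\partial^{\nu}\,{}^{*}F_{\nu}\big)
   = \tfrac{1}{\mu^{2}}\,\big(\Box + \mu^{2}\big)\,{}^{*}F_{\beta} ,
\end{align*}
```

where $\Box = \partial_0^2 - \nabla^2$, the identity $\epsilon_{\beta\mu\delta}\,\epsilon^{\mu\nu\alpha} = -(\delta^{\nu}_{\beta}\delta^{\alpha}_{\delta} - \delta^{\alpha}_{\beta}\delta^{\nu}_{\delta})$ has been used, and the last term drops out because $\partial^{\nu}\,{}^{*}F_{\nu} = \frac{1}{2}\epsilon^{\nu\alpha\beta}\partial_\nu F_{\alpha\beta} = 0$ by the Bianchi identity. The two cross terms cancel, so only the mass term and the d'Alembertian survive, and each component of the dual field strength propagates with mass $|\mu|$.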





3.3 Behavior Under C, P, and T 

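Let us consider the behaviour of the CS term as well as the Dirac Lagrangian

$$\mathcal{L}_D = \bar\psi\,(i\gamma^\mu \partial_\mu - m)\,\psi \qquad (15)$$

under the discrete transformations C (charge conjugation), P (parity) and T (time reversal). Here,
$\psi$ is a two component spinor with mass m (> 0) and the mass dimension of $\psi$ is 1. We use the
following two-dimensional realization of the Dirac algebra

$$\gamma^0 = \sigma_3 , \qquad \gamma^1 = i\sigma_1 , \qquad \gamma^2 = i\sigma_2 \qquad (16)$$

$$\gamma^\mu\gamma^\nu = g^{\mu\nu} - i\,\epsilon^{\mu\nu\lambda}\gamma_\lambda ; \qquad g^{\mu\nu} = \mathrm{diag}(1, -1, -1) \qquad (17)$$

where $\sigma_i$ are the usual Pauli matrices.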

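It is easily shown that under charge conjugation

$$C A_\mu C^{-1} = -A_\mu , \qquad C \psi C^{-1} = \sigma_1 \psi^\dagger \qquad (18)$$

so that the action is invariant under C. On the other hand, under parity transformation, the gauge
and the Fermi fields transform as follows

$$P A^{0,2}(t, \mathbf{r}) P^{-1} = A^{0,2}(t, \mathbf{r}') , \qquad P A^1(t, \mathbf{r}) P^{-1} = -A^1(t, \mathbf{r}') , \qquad (19)$$

$$P \psi(t, \mathbf{r}) P^{-1} = \sigma_1 \psi(t, \mathbf{r}') . \qquad (20)$$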


Note that in 2+1 dimensions, the parity transformation is somewhat unusual, i.e. $\mathbf{r} = (x, y)$, $\mathbf{r}' =
(-x, y)$ (or $(x, -y)$). On the other hand, $(-x, -y)$ corresponds to a rotation (and not a space reflection).
As a result, we find that the mass terms for both the Fermi and the gauge fields (i.e. $m\bar\psi\psi$
and the CS term) are not invariant under parity. Similarly, time-inversion changes the signs of
both the mass terms since

$$T A^0(t, \mathbf{r}) T^{-1} = A^0(-t, \mathbf{r}) , \qquad T \mathbf{A}(t, \mathbf{r}) T^{-1} = -\mathbf{A}(-t, \mathbf{r}) , \qquad (21)$$

$$T \psi(t, \mathbf{r}) T^{-1} = \sigma_2 \psi(-t, \mathbf{r}) . \qquad (22)$$

Thus, both the CS term and the fermion mass term are non-invariant under P as
well as T. However, they are invariant under the combined operation PT and hence the CPT
symmetry is still valid. Note that in 3+1 dimensions, though, $m\bar\psi\psi$ is invariant under P, C and T
separately.

Finally, let us talk about the photon spin. One can show that the CS photon spin is 1 (-1) if
the CS mass $\mu > 0$ (< 0), while the spin of the massless photon is zero. Further, in either case, the photon
has only one degree of freedom.

3.4 Coleman - Hill Theorem 

It turns out that because of the P and T violating but gauge invariant CS term, the most general
form for the vacuum polarization tensor consistent with Lorentz and gauge invariance is more
general than in other dimensions i.e.

$$\Pi_{\mu\nu}(k) = (k^2 g_{\mu\nu} - k_\mu k_\nu)\,\Pi_1(k^2) - i\epsilon_{\mu\nu\lambda} k^\lambda\, \Pi_2(k^2) . \tag{23}$$

Note that the second term on the right hand side is odd under P and T. It is clear that any P and T
violating interaction will contribute to $\Pi_2$. For example, the fermion mass term, which violates
both P and T, does contribute to $\Pi_2$ at one loop. Remarkably enough, it was discovered that
at two loops, however, there is no contribution to $\Pi_2(0)$ and hence to the Chern-Simons mass [11].
Inspired by this result, Coleman and Hill [14] have in fact proved under very general conditions
that $\Pi_2(0)$ receives no contribution from two and higher loops in any gauge and Lorentz invariant
theory including particles of spin 1 or less (An open question is whether this is also valid for higher





Avinash Khare


spin theories, especially spin-3/2). They only require that the matter fields be massive so that one
does not have to worry about the infrared problems. Further, they also assume that no part of the
free electromagnetic Lagrangian density is hiding in the matter part of the Lagrangian. It may
be noted that their result is valid even for non-renormalizable interactions in the presence of a
gauge and Lorentz invariant regularization.

Coleman and Hill also claimed that at one loop, the only contribution to $\Pi_2(0)$ can come from
the fermion loop. This is, however, incorrect. In particular, there is no reason why P and T
violating interactions involving spin-0 or spin-1 particles should not contribute to $\Pi_2(0)$ at one
loop. In fact, it has been shown that the parity violating spin-0 [15] as well as spin-1 interactions
[16] do contribute to $\Pi_2(0)$ at one loop.

3.5 Magneto-Electric Effect 

There are many crystals in nature, like chromium oxide, which show the magneto-electric effect,
i.e. they get magnetically polarized in an electric field and electrically polarized in a magnetic
field [17, 18]. It is well known that this effect depends upon having a CP-asymmetric medium.
Mathematically, the signal for the magneto-electric effect in 2+1 dimensions is that the relation
between the excitation fields D and H and the fields E and B is modified to

A = ; H = . (24) 

It has been shown [19] that the vacuum of the 2+1 dimensional quantum electrodynamics with the
CS term also shows the magneto-electric effect. In particular, it has been shown that both
magneto-electric coefficients are non-zero and arise from the P and T odd function $\Pi_2(k^2)$. Of
course this is not really surprising if one remembers that the CS term violates the discrete
symmetries P and T.

3.6 Chern-Simons Term by Spontaneous Symmetry Breaking 

We have seen above that the CS term provides mass to the gauge field. Now, usually the gauge 
field mass is generated by spontaneous symmetry breaking; hence it is worth enquiring whether the 
CS term can also be generated by spontaneous symmetry breaking. The answer to the question is 
yes [20]. This is because, unlike other dimensions, in the 2+1 case, one can have a more general 
definition of the covariant derivative. In particular, it is easily seen that 


$$D_\mu \psi = (\partial_\mu - ieA_\mu - ig\,\epsilon_{\mu\nu\lambda}F^{\nu\lambda})\,\psi \tag{25}$$

also transforms as a covariant derivative, since the field strength $F^{\nu\lambda}$ by itself is gauge invariant.
Obviously, the same thing is also true for a spin-0 charged scalar field. Now consider the following
generalized Abelian Higgs model in 2+1 dimensions

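$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}(D_\mu\phi)^*(D^\mu\phi) - a\,(|\phi|^2 - \phi_0^2)^2 \tag{26}$$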

where the generalized covariant derivative is as given by Eq. (25). On expanding the term
$\frac{1}{2}(D_\mu\phi)^*(D^\mu\phi)$, we have

$$\frac{1}{2}\left|(\partial_\mu - ieA_\mu)\phi\right|^2 + g^2 F_{\mu\nu}F^{\mu\nu}|\phi|^2 + \frac{ig}{2}\,\epsilon_{\mu\nu\lambda}F^{\nu\lambda}(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*) + eg\,\epsilon_{\mu\nu\lambda}F^{\nu\lambda}A^\mu|\phi|^2 \tag{27}$$

so that if $\phi$ acquires a nonzero vacuum expectation value then the Abelian CS term is generated
from the last term of this equation. Clearly a similar mechanism should also work for the
non-Abelian case, but technically it is a tougher problem since one also has to generate the non-linear
term.





3.7 Lorentz Invariance From Gauge Invariance 

One of the remarkable properties of the Abelian CS term is that in this case the Lorentz invariance
of the action automatically follows from the gauge invariance. In contrast, notice that the most
general form of the gauge invariant Maxwell Lagrangian in classical electrodynamics in 3+1
dimensions is

$$\mathcal{L} = \mathbf{E}^2 + a\,\mathbf{B}^2 . \tag{28}$$

It is only the demand of Lorentz invariance which tells us that $a = -1$ (in the 2+1 case, B
is a pseudoscalar but the same argument is still valid). On the other hand, if one writes the CS
action as

$$I_{CS} = \int d^3x\,[\epsilon_{ij}E^iA^j + aBA^0] , \tag{29}$$

then the demand of the invariance of $I_{CS}$ under the gauge transformation $A_\mu \to A_\mu + \partial_\mu\alpha$ fixes $a$
and uniquely gives us the CS action, which is automatically also Lorentz invariant.
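The gauge-invariance argument can be made explicit. The following is only a sketch under assumed conventions that the text does not spell out: metric diag(1, -1, -1), $E^i = -\partial_i A^0 - \partial_0 A^i$, $B = \epsilon_{ij}\partial_i A^j$, and a gauge parameter $\alpha$ that vanishes at infinity so that surface terms can be dropped. With other sign conventions the fixed value of $a$ may come out with the opposite sign.

```latex
% Assumed conventions: metric diag(1,-1,-1), E^i = -\partial_i A^0 - \partial_0 A^i,
% B = \epsilon_{ij}\partial_i A^j. E^i and B are gauge invariant, so only the
% explicit factors A^0 and A^j vary.
\begin{align*}
\delta A^0 &= \partial_0\alpha , \qquad \delta A^j = -\partial_j\alpha , \qquad
              \delta E^i = \delta B = 0 , \\
\delta I_{CS} &= \int d^3x\,\bigl[-\epsilon_{ij}E^i\,\partial_j\alpha + a\,B\,\partial_0\alpha\bigr]
             = \int d^3x\;\alpha\,\bigl[\epsilon_{ij}\,\partial_j E^i - a\,\partial_0 B\bigr] , \\
\partial_0 B &= -\epsilon_{ij}\,\partial_i E^j = \epsilon_{ij}\,\partial_j E^i
              \qquad \text{(Bianchi identity)} , \\
\delta I_{CS} &= (1-a)\int d^3x\;\alpha\,\partial_0 B = 0
              \;\;\text{for all } \alpha \quad\Longrightarrow\quad a = 1 .
\end{align*}
```

In these conventions the integrand with $a = 1$ equals $-\frac{1}{2}\epsilon^{\mu\nu\lambda}F_{\mu\nu}A_\lambda$, which is manifestly Lorentz invariant even though Lorentz invariance was never imposed.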


3.8 Quantization of Chern-Simons Mass 

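Let us now discuss the CS term in the non-Abelian gauge theories. We shall mention only those
properties which are special to the non-Abelian CS term. To begin with, notice that the
non-Abelian CS term has an extra term compared to the Abelian case i.e.

$$\mathcal{L}_{CS} = -\frac{\mu}{2g^2}\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\left(F_{\mu\nu}A_\lambda - \frac{2}{3}A_\mu A_\nu A_\lambda\right) \tag{30}$$

where $A_\mu$ and $F_{\mu\nu}$ are matrices

$$A_\mu = gT^aA^a_\mu , \quad F_{\mu\nu} = gT^aF^a_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu , A_\nu] . \tag{31}$$

Here, $T^a$ are the representation matrices of the gauge group G satisfying

$$[T^a , T^b] = f^{abc}T^c \tag{32}$$

where $f^{abc}$ are the structure constants of the group. In the case of SU(2), $T^a = \tau^a/2i$.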

Let us now consider a non-Abelian gauge theory with the Chern-Simons term as given by

$$\mathcal{L}_{na} = \frac{1}{2g^2}\,\mathrm{tr}(F^{\mu\nu}F_{\mu\nu}) - \frac{\mu}{2g^2}\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\left(F_{\mu\nu}A_\lambda - \frac{2}{3}A_\mu A_\nu A_\lambda\right) . \tag{33}$$

As in the Abelian case, it is easily shown that the CS term provides a gauge invariant gauge field
mass $\mu$.

As in the Abelian case, the non-Abelian CS Lagrangian density changes by a total derivative
under an infinitesimal local gauge transformation, so that the corresponding action is invariant
under such a gauge transformation. However, the CS action is not invariant under finite (also
called homotopically non-trivial, i.e. those which are not continuously deformable to the identity)
gauge transformations as given by

$$A_\mu \to U^{-1}A_\mu U + U^{-1}\partial_\mu U . \tag{34}$$


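As a result, one finds that the action corresponding to the Lagrangian (33) transforms as follows

$$I_{na} \to I_{na} + \frac{\mu}{g^2}\int d^3x\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\left\{\partial_\mu[A_\nu(\partial_\lambda U)U^{-1}] + \frac{1}{3}(\partial_\mu U)U^{-1}(\partial_\nu U)U^{-1}(\partial_\lambda U)U^{-1}\right\} . \tag{35}$$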


Let us consider those gauge transformations which tend to the identity at temporal and spatial
infinity so as to avoid a convergence problem i.e.

$$U(x) \to I \quad \text{as} \quad x \to \infty . \tag{36}$$





It is now easily seen that the gauge field dependent surface integral in Eq. (35) vanishes. However,
the last term in the integral is non-zero. It can be converted to a surface integral once the integrand
is rewritten as a total derivative. This can be made manifest by using an explicit parameterization
for U. For example, in the case of SU(2) (more generally, we choose an SU(2) sub-group of the gauge
group G, for reasons that will be clear soon), one can make use of the exponential parameterization
$U(x) = \exp(i\sigma^a\theta^a(x))$. In this way one can show that under large gauge transformations, $I_{na}$ is
not invariant but transforms as

$$I_{na} \to I_{na} + \frac{8\pi^2\mu}{g^2}\,w(U) \tag{37}$$

where

$$w(U) = \frac{1}{24\pi^2}\int d^3x\,\epsilon^{\mu\nu\lambda}\,\mathrm{tr}\left[(\partial_\mu U)U^{-1}(\partial_\nu U)U^{-1}(\partial_\lambda U)U^{-1}\right] \tag{38}$$

is the winding number of the gauge transformation U. In particular, if the gauge group G is such
that the third homotopy group of G is non-trivial i.e.

$$\pi_3(G) = Z \tag{39}$$

where Z is the additive group of integers, then under these so called large gauge transformations,
the action transforms as

$$I_{na} \to I_{na} + \frac{8\pi^2\mu}{g^2}\,m \tag{40}$$

where m is an integer. Note in particular, that Eq. (39) is true for any gauge group G of which
SU(2) is a sub-group. However, in the path integral formulation, the action itself may or may
not be gauge invariant; it is the exponential of the action, $\exp(iI_{na})$, which should be gauge
invariant. In this way we conclude that the non-Abelian gauge theory with the CS term does
not make sense in 2+1 dimensions unless the CS mass $\mu$ is quantized [13] in units of $g^2/4\pi$ i.e.
($n = 0, \pm 1, \pm 2, \ldots$)

$$\frac{8\pi^2\mu}{g^2} = 2\pi n \quad \text{or} \quad \mu = \frac{g^2}{4\pi}\,n . \tag{41}$$


This mass quantization is reminiscent of the famous Dirac quantization in the case of the magnetic
monopole. An important question to address is whether the quantization condition (41) is respected
by the quantum corrections. This issue was considered by Pisarski and Rao [21] for the case of
a pure gauge theory (i.e. without any matter field). They found that the quantization is indeed
preserved to one loop; however, the integer on the right hand side of Eq. (41) is shifted by N
in case the gauge group G = SU(N). Subsequently, it has been shown that there are no further
corrections from two and higher loops in the limit of the pure CS gauge theory [22].
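The tree-level quantization argument amounts to one line of arithmetic, which the following sketch illustrates. It assumes that a gauge transformation of winding number w shifts the action by $8\pi^2\mu w/g^2$, as in Eqs. (37) and (40); the function names `gauge_phase` and `quantized_mu` are hypothetical and introduced only for this illustration.

```python
import cmath
import math

def gauge_phase(mu, g, winding):
    """Phase exp(i * delta_I) picked up by exp(i I) under a gauge
    transformation of the given winding number, assuming the action
    shifts by delta_I = 8 * pi**2 * mu * winding / g**2."""
    return cmath.exp(1j * 8 * math.pi ** 2 * mu * winding / g ** 2)

def quantized_mu(n, g):
    """Allowed CS mass mu = n * g**2 / (4 * pi) for integer n."""
    return n * g ** 2 / (4 * math.pi)

# For a quantized mass the phase is 1 for every integer winding number.
g = 1.7
phase_ok = gauge_phase(quantized_mu(3, g), g, winding=2)

# Halfway between two allowed values the phase is -1 for winding number 1,
# so exp(i I) is not gauge invariant.
phase_bad = gauge_phase(0.5 * g ** 2 / (4 * math.pi), g, winding=1)
```

Here `phase_ok` equals 1 and `phase_bad` equals -1 up to rounding, which is the content of the condition $\mu = g^2 n/4\pi$.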

How is the quantization condition modified in the presence of matter fields? It has been
shown that so long as the scalar field does not break the non-Abelian gauge symmetry, the
quantization condition remains unaltered. The massive fermions, of course, modify the
quantization condition [21]; the right hand side of Eq. (41) is shifted by an amount fixed by $T_R$,
where $T_R$ is the Casimir generator for the gauge group G (i.e. $\mathrm{tr}(T^aT^b) = -\delta^{ab}T_R$), in case the
fermions are in the fundamental representation of the gauge group G. Thus the quantization is
preserved so long as $T_R$ is an integer.

Much more interesting is the case of partial (spontaneous symmetry) breaking of a non-Abelian
gauge symmetry. In this case it has been shown that if the non-Abelian gauge symmetry SU(N) is
spontaneously broken to, say, SU(M) × U(1) (or even several U(1)'s), then the one-loop radiative
correction to the right hand side of the quantization condition (41) [23] arises purely from the
unbroken non-Abelian sector in question; the orthogonal U(1) sector makes no contribution. This
implies that the coefficient of the CS term is a discontinuous function over the phase diagram of
the theory.





3.9 Parity Anomaly 

Is our entire discussion about the CS term merely of academic interest? Put differently, someone
might argue that since the CS term violates both the parity and the time reversal
symmetries, why should one, in the first place, add such a term to the action? The answer to this
question, at least in the non-Abelian gauge theories, is that even if one does not add the CS term
to the action at the tree level, it is automatically generated by the one loop radiative corrections
due to the so called parity anomaly [24]. In particular, consider the action


I[A^, ^] = I d^x 


(42) 


for an odd number of massless doublets of fermions in the fundamental representation coupled to
SU(2) gauge fields (more generally any gauge group G of which SU(2) is a sub-group so that Eq.
(39) is satisfied; and the fermions are required only to be in the fundamental representation).

This action is invariant under the gauge transformations (both large and small) as well as
the discrete transformations of parity (P) and time reversal (T). However, the effective
action $I_{eff}[A]$, obtained by integrating out the fermionic degrees of freedom, violates one of the
two symmetries. In other words, there is no regularization which can simultaneously maintain
the invariance of $I_{eff}[A]$ under the large gauge transformations as well as P and T. In view of
the tremendous success of the gauge principle, one usually maintains the gauge invariance at the
cost of the parity and the time reversal invariance by simply adding the CS term to the action
(alternatively one can also regulate it by using the P and T violating Pauli-Villars regularization).
In this way, one finds that the CS term is induced by the radiative corrections even if it is absent
at the tree level. This is very similar to the way the $\theta$ term is induced in even dimensions due
to the gauge singlet (chiral) anomaly.


3.10 Topological Field Theory 

One of the most remarkable properties of the CS action is that it depends only on the antisymmetric
tensor $\epsilon^{\mu\nu\lambda}$ and not on the metric tensor $g_{\mu\nu}$. As a result, the CS action in the flat and the curved
space is the same. Hence, the CS action, in both the Abelian and the non-Abelian cases, is an
example of a topological field theory [25]. It might be mentioned here that the topological field
theories give a natural framework for understanding the Jones polynomials of knot theory in
three dimensional terms. Further, these theories have shed new light on conformal field
theories in two space-time dimensions.

Finally, the gravitational Chern-Simons term has also been considered [13] and shown to have
some remarkable properties. In particular, whereas the massless Einstein theory in 2+1 dimensions
is trivial, it acquires a propagating, massive, spin-2 degree of freedom when the CS term is present.
Further, even though this topological term has third time derivative dependence, the theory
is ghost-free and unitary and one has a consistent quantum theory. The contribution of the
topological mass term to the field equations also has a natural geometric significance: it is the
three dimensional analogue of the Weyl tensor.


4 Charged Vortex as Anyon in Field Theories

In the last section, we have discussed in detail the various properties of the CS term. In this
section, we demonstrate the most dramatic effect of this term i.e. the existence of charged vortex
solutions thereby providing us with a relativistic model for the charged (extended) anyons.

Before we discuss the charged vortex solutions, it might be worthwhile to mention how such 
solutions were historically discovered. A long time ago, Abrikosov [26] wrote down the electri- 
cally neutral vortex solutions in the Ginzburg-Landau theory which is a mean-field theory of 
superconductivity. Subsequently, these vortices were experimentally observed in the type-II su- 
perconductors. Nielsen and Olesen [27] rediscovered these solutions in the context of the Abelian 





Higgs model which is essentially a relativistic generalization of the Ginzburg-Landau theory. They
were looking for string-like objects in relativistic field theory. It turns out that these vortices
have finite energy per unit length in 3+1 dimensions (i.e. finite energy in 2+1 dimensions as
the vortex dynamics is essentially confined to the x-y plane), quantized flux, but are electrically
neutral and have zero angular momentum. Subsequently, Julia and Zee [28] showed that the SO(3)
Georgi-Glashow model, which admits the 't Hooft-Polyakov monopole solution, also admits its charged
generalization i.e. the dyon solution with finite energy and finite, non-zero, electric charge. It was
then natural for them to enquire whether the Abelian Higgs model, which admits neutral vortex
solutions with finite energy (in 2+1 dimensions), also admits its charged generalization or not. In
the appendix of the same paper, Julia and Zee discussed this question and showed that the answer
is no, i.e. unlike the monopole case, the Abelian Higgs model does not admit charged vortices with
finite energy and finite and non-zero electric charge. More than ten years later, Samir Paul and
I [29] showed that the Julia-Zee negative result can be overcome if one adds the CS term to the
Abelian Higgs model. In particular, we showed that the Abelian Higgs model with CS term in
2+1 dimensions admits charged vortex solutions of finite energy and quantized, finite, Noether
charge as well as flux. As an extra bonus, it was found that these vortices also have non-zero,
finite angular momentum which is in general fractional. This strongly suggested that these charged
vortices could in fact be charged anyons, which was subsequently rigorously shown by Fröhlich and
Marchetti [30].

Strictly speaking, what one has obtained are the charged soliton solutions and not the vortex
solutions, but because of the close connection with the neutral vortex solutions, one has continued
to call them charged vortices rather than charged solitons.

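Consider an Abelian Higgs model with CS term as given by

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}(D_\mu\phi)^*(D^\mu\phi) - C_4\left(|\phi|^2 - \frac{C_2}{2C_4}\right)^2 + \frac{\mu}{4}\,\epsilon_{\mu\nu\lambda}F^{\mu\nu}A^\lambda \tag{43}$$

where $\mu$ is the Chern-Simons mass, $\phi$ denotes a complex scalar field and $D_\mu\phi$ is the covariant
derivative i.e.

$$D_\mu\phi = (\partial_\mu - ieA_\mu)\phi . \tag{44}$$

Here $\phi$, $A_\mu$, as well as the gauge coupling constant e have mass dimension of 1/2 while $C_4$ and $C_2$
have mass dimensions of 1 and 2 respectively. In order to obtain the charged vortex solutions, let
us consider the following ansatz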

$$\mathbf{A}(\mathbf{x},t) = -\hat{e}_\theta\,C_0\,\frac{g(r) - n}{r} , \quad \phi(\mathbf{x},t) = C_0 f(r)e^{in\theta} , \quad A_0(\mathbf{x},t) = C_0 h(r) \tag{45}$$

where g(r), h(r), f(r) are the dimensionless fields, r is the dimensionless length, while $C_0$ has
mass dimension of 1/2 i.e.

$$r = eC_0\rho , \quad C_0 = \sqrt{C_2/2C_4} . \tag{46}$$

Note that $\rho$ and $\theta$ are related to x and y by $\rho = \sqrt{x^2 + y^2}$, $\tan\theta = y/x$. It turns out that
even though the Lagrangian (43) has so many parameters, the dynamics essentially depends on
two dimensionless variables, $\delta$ and $\lambda$, defined by

$$\lambda = \sqrt{8C_4}/e , \quad \delta = \mu/eC_0 . \tag{47}$$

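The field equations which follow from here are

$$g''(r) - \frac{1}{r}g'(r) - gf^2 = \delta\,r\,h'(r) \tag{48}$$

$$h''(r) + \frac{1}{r}h'(r) - hf^2 = \frac{\delta}{r}\,g'(r) \tag{49}$$

$$f''(r) + \frac{1}{r}f'(r) - \frac{g^2}{r^2}f + h^2f = \frac{\lambda^2}{2}f(f^2 - 1) \tag{50}$$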





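where $g'(r) = dg(r)/dr$. The corresponding field energy can be shown to be

$$E_n = \pi C_0^2\int_0^\infty r\,dr\left[\frac{1}{r^2}\left(\frac{dg}{dr}\right)^2 + \left(\frac{dh}{dr}\right)^2 + \left(\frac{df}{dr}\right)^2 + h^2f^2 + \frac{g^2f^2}{r^2} + \frac{\lambda^2}{4}(f^2 - 1)^2\right] . \tag{51}$$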


Several remarks are in order at this stage. 

1. As expected, in the limit h = 0 (i.e. $A_0 = 0$) and $\delta = 0$ (i.e. $\mu = 0$) the field equations
reduce to those of the neutral vortex case. From the Gauss law Eq. (49) it also follows that
if $\delta$ (i.e. $\mu$) is non-zero, then $A_0$ must also be non-zero, thereby justifying the ansatz (45).


2. The boundary conditions for finite energy solutions are

$$\lim_{r\to\infty} f(r) = 1 , \quad h(r) = 0 = g(r) \tag{52}$$

$$\lim_{r\to 0} f(r) = 0 , \quad g(r) = n , \quad h(r) = \beta \tag{53}$$

where $\beta$ is an arbitrary number while $n = 0, \pm 1, \pm 2, \ldots$.


3. From these boundary conditions it immediately follows that the magnetic flux is quantized
in units of $2\pi/e$ i.e.

$$\Phi \equiv \int B\,d^2x = \frac{2\pi}{e}\,n . \tag{54}$$

It may be noted that even for the neutral vortices, the flux is quantized in units of $2\pi/e$.
The underlying reason for the flux quantization is the same in both cases, i.e. both are
topological objects with the underlying boundary conditions being such that there is a
non-trivial mapping from the space time to the group manifold i.e. $\pi_1(U(1)) = Z$, with Z being
the set of integers, forming a group under addition.


4. From the Gauss law Eq. (49), it then follows that these vortices also have a non-zero and
finite Noether charge which is quantized in units of $2\pi\mu/e$. This is easily seen by noting that
in terms of the electric and the magnetic fields, the Gauss law equation can be written as

$$\nabla\cdot\mathbf{E} + \mu B = \rho \tag{55}$$

where $\rho$ is the Noether charge density. On integrating both sides of this equation, it then
follows that

$$Q = \int\rho\,d^2x = \mu\int B\,d^2x = \mu\Phi . \tag{56}$$

Note that $\int\nabla\cdot\mathbf{E}\,d^2x = 0$, since, because of the Higgs mechanism, both E and B fall off
exponentially at long distances. This is probably the first time that the quantization of
the Noether charge has followed from purely topological considerations. In a sense, relation
(56) can be looked upon as the (2+1)-analogue of the Witten effect [31]. Let us recall the
work of Witten who had shown that in the presence of the CP and T violating $\theta$ term,
the 't Hooft-Polyakov monopole acquires electric charge whose fractional part is proportional
to the coefficient of the $\theta$ term. It must however be remembered that whereas the Witten
effect is purely a quantum mechanical effect, in our case, the vortices acquire a non-zero
charge at the classical level itself due to the presence of the CS term.

5. It is also clear from here that in the Abelian Higgs model (without the CS term), one cannot
have vortices having simultaneously the finite energy as well as the finite, non-zero Noether
charge. The point is, in the absence of the CS term, the Gauss law Eq. (55) gives on
integration

$$Q = \int\rho\,d^2x = \int\nabla\cdot\mathbf{E}\,d^2x . \tag{57}$$

The only way Q can be non-zero and finite is if there is a non-zero contribution to the integral
around $r \to 0$, i.e. if $E \sim 1/r$ as $r \to 0$. But in that case, the electrical field energy $\int E^2 d^2x$
diverges logarithmically [28].





6. The energy-momentum tensor $T_{\mu\nu}$ for this model can be obtained by varying the curved
space form of the action with respect to the metric

$$T_{\mu\nu} = \frac{1}{2}\left[(D_\mu\phi)^*(D_\nu\phi) + (D_\nu\phi)^*(D_\mu\phi)\right] - F_{\mu\lambda}F_\nu{}^\lambda - g_{\mu\nu}\left(\mathcal{L} - \frac{\mu}{4}\,\epsilon_{\alpha\beta\gamma}F^{\alpha\beta}A^\gamma\right) \tag{58}$$

where the Lagrangian $\mathcal{L}$ is as given by Eq. (43). Note that the CS term, being independent
of the metric tensor $g_{\mu\nu}$, does not contribute to the energy momentum tensor $T_{\mu\nu}$. Using
this and the field equations, the angular momentum carried by the charged vortices can
be shown to be

$$J = -\frac{Q\Phi}{4\pi} = -\frac{\pi\mu}{e^2}\,n^2 . \tag{59}$$

Thus, unlike the neutral vortices, the angular momentum of the charged vortices is non-zero
and is solely determined by their charge and flux. Besides, the angular momentum of n
superimposed charged vortices is $n^2$ times, and not n times, the angular momentum of a single vortex.
Further, since the CS mass $\mu$ is not quantized in the Abelian case, this angular
momentum J can in general take any fractional value. This strongly suggests that these charged
vortices are charged anyons. Fröhlich and Marchetti [30] have in fact rigorously proved that
these charged vortices are charged anyons. In particular, they constructed the quantum
one-vortex operator and then evaluated the phase acquired when one such vortex is slowly taken
round the other. They also showed that the charged vortices cannot be localized in bounded
regions but are localized in space-like cones in three-dimensional Minkowski space-time [32].
Unfortunately their treatment is rather involved and is beyond the scope of this pedagogical
article. Thus the solitons of the Abelian Higgs model with the CS term provide us with a
relativistic field theory model for the extended charged anyons.

7. The magnetic moment of these vortices can be computed by using the field equations and
one can show that, whereas for the neutral vortices it is equal to the flux $\Phi\,(= 2\pi n/e)$, the
charged ones acquire an extra contribution

$$K_z \equiv \frac{1}{2}\int(\mathbf{r}\times\mathbf{J})_z\,d^2x = \frac{2\pi n}{e} + \frac{2\pi\delta}{e}\int_0^\infty r\,h(r)\,dr . \tag{60}$$
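The quantum numbers collected in the remarks above follow from the winding number alone once e and $\mu$ are given. The sketch below tabulates them, assuming the relations $\Phi = 2\pi n/e$, $Q = \mu\Phi$ and $J = -Q\Phi/4\pi$ quoted in this chapter; the function name `vortex_quantum_numbers` is hypothetical.

```python
import math

def vortex_quantum_numbers(n, e, mu):
    """Return (flux, charge, angular momentum) of an Abelian charged
    n-vortex, assuming Phi = 2*pi*n/e, Q = mu*Phi and J = -Q*Phi/(4*pi)."""
    flux = 2 * math.pi * n / e
    charge = mu * flux
    spin = -charge * flux / (4 * math.pi)
    return flux, charge, spin

# The angular momentum grows like n**2, not like n.
_, _, spin_1 = vortex_quantum_numbers(1, e=1.0, mu=0.3)
_, _, spin_3 = vortex_quantum_numbers(3, e=1.0, mu=0.3)
ratio = spin_3 / spin_1  # equals 9 up to rounding
```

Since $\mu$ is a free parameter in the Abelian theory, `spin_1` can be tuned to any real value, which is the classical hint that these objects are anyons.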

4.1 Unusual Higgs Mechanism 

One must now solve the field Eqs. (48) to (50) and show the existence of the charged vortex
solutions. To date, no analytic solution of these field equations has been obtained. However, it is
easily seen that for large r, the asymptotic values of the gauge and the Higgs fields are reached
exponentially fast

$$g(r) = \alpha_\pm\sqrt{r}\,e^{-\eta_\pm r} + \ldots , \quad h(r) = \mp\frac{\alpha_\pm}{\sqrt{r}}\,e^{-\eta_\pm r} + \ldots , \tag{61}$$

$$f(r) = 1 + \beta e^{-\lambda r} + \ldots \tag{62}$$

where $\alpha_\pm$ and $\beta$ are dimensionless constants while the dimensionless vector meson mass $\eta_\pm$ is
given by

$$\eta_\pm = \sqrt{1 + \frac{\delta^2}{4}} \pm \frac{\delta}{2} . \tag{63}$$

However, it has subsequently been shown that the solution with $\eta_+$ does not exist for all r.
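The two vector meson masses have a simple structure that is easy to verify numerically. The sketch below assumes the form $\eta_\pm = \sqrt{1 + \delta^2/4} \pm \delta/2$ for the dimensionless masses; the function name `vector_masses` is hypothetical. It checks that the product of the two masses is 1, that their difference is $\delta$, and that each satisfies $(\eta^2 - 1)^2 = \delta^2\eta^2$, which is the condition obtained by linearizing Eqs. (48) and (49) about f = 1 at large r.

```python
import math

def vector_masses(delta):
    """Dimensionless vector meson masses (eta_plus, eta_minus),
    assuming eta_pm = sqrt(1 + delta**2 / 4) +/- delta / 2."""
    root = math.sqrt(1.0 + delta ** 2 / 4.0)
    return root + delta / 2.0, root - delta / 2.0

def dispersion_residual(eta, delta):
    """Residual of (eta**2 - 1)**2 = delta**2 * eta**2."""
    return (eta ** 2 - 1.0) ** 2 - (delta * eta) ** 2

eta_plus, eta_minus = vector_masses(0.8)
product = eta_plus * eta_minus        # 1 up to rounding
splitting = eta_plus - eta_minus      # equals delta = 0.8
```

For $\delta \to 0$ both masses tend to 1, the vector meson mass of the neutral Abelian Higgs model in these units.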

On noting that the field Eqs. (48) to (50) are invariant under $r \to -r$, it is easily shown that
the behavior of the gauge and the Higgs fields around r = 0 is given by

$$g(r) = n + \alpha_1r^2 + O(r^4) , \quad h(r) = \beta + \alpha_1\delta\,\frac{r^2}{2} + O(r^4) , \tag{64}$$

$$f(r) = \alpha_2r^{|n|} + O(r^{|n|+2}) . \tag{65}$$




Detailed numerical work has subsequently confirmed the existence of the radially symmetric
charged vortex solutions with these boundary conditions [34]. These correspond to n superimposed
vortices. The qualitative behaviour of the charged vortex solution which follows from here is as
follows: the magnetic field B decreases monotonically from its non-zero value at the core of the
vortex (r = 0) to reach zero as $r \to \infty$ with the penetration length $1/\eta_-$, while the Higgs field
increases from zero at the origin to its vacuum value at infinity with coherence length $1/\lambda$. Finally,
the electric field $E_\rho$, which is radial, vanishes both at r = 0 and $r = \infty$, reaching its maximum
in between at some finite r. It is worth pointing out that as in the quantum Hall effect, for the
charged vortex solutions too, $\mathbf{E}\,(= E_\rho)$ is at right angles to $\mathbf{J}\,(= j_\theta)$ and both in turn are at right
angles to B.

Why did one obtain two asymptotic solutions for g and h, i.e. for the gauge fields $A_\theta$ and $A_0$?
This is because of the unusual nature of the Higgs mechanism in 2+1 dimensions in the presence
of the CS term. Notice that in our case both the Maxwell and the CS terms are present and in
addition there is also the Higgs mechanism in operation. Clearly such a theory must still propagate
only two massive modes. As has been shown in [35], in this case $\mathcal{L}_{quad}$ corresponds to the Proca
equation with the CS term. It propagates a self-dual field with two distinct CS type masses, and
corresponding to each mass there is one (P and T violating) propagating mode. Further, the
two masses (in dimensionless form) are precisely $\eta_\pm$ as given by Eq. (63), thereby explaining the
reason for the occurrence of two asymptotic solutions.

4.2 Vortex- Vortex Interaction 

One of the most interesting questions is whether these charged vortices can be observed
experimentally in some planar system. In this context recall that the neutral (Abrikosov) vortices have
been experimentally seen in type-II superconductors. This can be understood from the fact that
whereas the vortex-vortex interaction is repulsive in the type-II region ($\lambda > 1$), it is attractive in the
type-I region of superconductivity. It is thus of great interest to study the charged vortex-vortex
interaction and to see when it is repulsive. This has been done both in perturbation theory
(in the CS mass) and by a variational calculation [34], and in both cases one finds that the
charged vortex-vortex interaction is more repulsive than the corresponding neutral case, with the
extra repulsion coming from the electric field of the charged vortex. For example, when the CS
mass is small, then on expanding the charged n-vortex fields in terms of the corresponding neutral
vortex fields it has been shown that

E„ (A, J) - nEi (A, <J) = En{X, 0) - nEi (A, 0) + + 0(6^) (66) 

so that the charged vortex-vortex interaction is always more repulsive than the corresponding
neutral case. For example, for $\delta = 0.5$, one finds that the charged vortex-vortex interaction is
repulsive even for $\lambda > 0.45$ (note that in the neutral case the interaction is repulsive only if $\lambda > 1$).

4.3 Non-Abelian Charged Vortex Solutions 

It is clearly of considerable interest to enquire whether the charged vortex solutions obtained above
can be embedded in non-Abelian gauge theories with the CS term. The first obvious question is
whether such vortices could be topologically stable or not. It is easily seen that if G is the gauge
group of the non-Abelian gauge theory and H is the sub-group under which the vacuum remains
invariant after spontaneous symmetry breaking, then topologically non-trivial vortices are possible
only if

$$\pi_1(G/H) \neq 0 . \tag{67}$$

In the case of SU(N) gauge theories, it turns out that no Z-vortices are possible. However,
$Z_N$-vortices are possible in case H is $Z_N$, since $\pi_1(SU(N)/Z_N) = Z_N$. It turns out that at least N
Higgs multiplets are required so that the vacuum is invariant under $Z_N$ [36]. As a result, only one
non-trivial charged vortex is possible in the case of SU(2) gauge theory with flux $\Phi = 2\pi/g$, charge
$Q = \mu\Phi = 2\pi\mu/g$, and angular momentum $J = -Q\Phi/4\pi = -\pi\mu/g^2$ where g is the gauge coupling





constant. But since the CS mass $\mu$ is quantized in non-Abelian gauge theories having SU(2) as a
sub-group i.e.

$$\mu = \frac{g^2}{4\pi}\,n , \quad n = 0, \pm 1, \pm 2, \ldots \tag{68}$$

the vortex charge is $gn/2$, i.e. it is quantized in units of $g/2$, while the angular momentum
is quantized in units of 1/4, i.e. $J = -n/4$. This is remarkable as it strongly suggests that if the
usual spin-statistics connection is valid then whereas the Abelian charged vortex is an anyon with
any phase factor, the non-Abelian (SU(2)) charged vortex can only be a semion, a fermion or a
boson.
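The restriction to semions, fermions and bosons can be read off from $J = -n/4$ with exact rational arithmetic. The sketch below assumes, as the text does, that the usual spin-statistics connection holds, so that the exchange phase is $\exp(2\pi iJ)$ and only J modulo 1 matters; the function names are hypothetical.

```python
from fractions import Fraction

def su2_vortex_spin(n):
    """Angular momentum J = -n/4 of the SU(2) charged vortex for CS level n."""
    return Fraction(-n, 4)

def classify_statistics(n):
    """Classify the vortex by J modulo 1, assuming the usual
    spin-statistics connection (exchange phase exp(2*pi*i*J))."""
    fractional_part = su2_vortex_spin(n) % 1
    if fractional_part == 0:
        return "boson"
    if fractional_part == Fraction(1, 2):
        return "fermion"
    return "semion"  # J = 1/4 or 3/4 modulo 1, exchange phase +/- i

labels = [classify_statistics(n) for n in range(1, 5)]
```

For n = 1, 2, 3, 4 this gives semion, fermion, semion, boson, after which the pattern repeats with period 4.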

4.4 Relativistic Pure Chern-Simons Vortices 

We have obtained above the charged vortex solutions in case the gauge part of the Abelian Higgs
model consists of both the Maxwell and the CS term. It may be of some interest to enquire whether
the Abelian Higgs model with pure CS term can also admit charged vortex solutions. This question
is especially relevant in the context of condensed matter systems since in the long wavelength limit,
the CS term, having one derivative, dominates over the Maxwell term, which has two derivatives. It
turns out that the answer to the question is yes [37].

In the absence of the Maxwell term and with the same rotationally symmetric ansatz as in Eq.
(45), it follows from Eqs. (48) and (49) that the gauge field equations are already of first order.
However, Eq. (50), for the Higgs field, is still a coupled second order equation. We now show that
in case one replaces the standard double well $\phi^4$-type potential by the following $\phi^6$-type potential
[38]

$$V(|\phi|) = \frac{e^4}{8\mu^2}\,|\phi|^2\left(|\phi|^2 - C_0^2\right)^2 \tag{69}$$

then even the Higgs field satisfies a first order equation. It is worth pointing out here that whereas
a Higgs potential of the type $|\phi|^i$ with $0 \le i \le 4$ is renormalizable in 3+1 dimensions, $|\phi|^i$
with $0 \le i \le 6$ is renormalizable in 2+1 dimensions.

When the Maxwell term is absent and the Higgs potential is as given by (69), the vortex energy
(51) can be rewritten as

$$E_n = \pi C_0^2\int_0^\infty r\,dr\left[\left(f' \mp \frac{1}{r}fg\right)^2 + f^2\left(h \mp \frac{1}{2\delta}(1 - f^2)\right)^2\right] \pm \pi C_0^2\int_0^\infty dr\,\frac{d}{dr}\left[g(f^2 - 1)\right] . \tag{70}$$

This gives a rigorous lower bound on the energy in terms of the flux

$$E_n \ge \pm\pi C_0^2\,[g(0) - g(\infty)] = \pm\frac{1}{2}eC_0^2\,\Phi \tag{71}$$

since the finite energy consideration requires that $f^2g$ vanish at both the ends. This bound is
saturated when the following self-dual first order equations are satisfied

$$f'(r) = \pm\frac{1}{r}fg \tag{72}$$

$$h(r) = -\frac{\delta}{rf^2}\,g'(r) = \pm\frac{1}{2\delta}(1 - f^2) . \tag{73}$$

It is easily checked that these first order equations are consistent with the second order field Eq.
(50). One can in fact decouple these coupled first order equations and show that the Higgs field f
must satisfy the following uncoupled second order equation

$$f''(r) + \frac{1}{r}f'(r) - \frac{f'^2}{f} + \frac{f^3}{2\delta^2}(1 - f^2) = 0 . \tag{74}$$

Several comments are in order at this stage. 





1. These self-dual equations are similar to those of the Nielsen-Olesen (neutral) self-dual vortices
(which are valid only if $\lambda = 1$).

2. Whereas the Lagrangian for the self-dual neutral vortex case (i.e. Lagrangian (43) with $\mu = 0$
and $\lambda = 1$) is the bosonic part of an N = 1 supersymmetric theory [40], the Lagrangian for
the self-dual charged vortex case (i.e. the Lagrangian (43) with the Maxwell term being
absent and the Higgs potential being as given by Eq. (69)) is the bosonic part of an N = 2
supersymmetric theory [41].

3. The $\phi^4$-potential as given in Eq. (43) and the $\phi^6$-potential as given by Eq. (69) represent
very different physical situations. For example, whereas the $\phi^4$-potential corresponds to the
case of the second order phase transition with $T < T_c$, the $\phi^6$-potential as given in Eq. (69)
corresponds to the case of first order phase transition with $T = T_c$ [42].

4. The nature of the Higgs mechanism when only the Chern-Simons term is present is somewhat unusual
[39]. One finds that in the limit $e^2 \to \infty$, $\mu \to \infty$, with their ratio fixed, the mass $\eta_+$
decouples from the theory. Thus in the case of the pure CS term, one finds that after the
Higgs mechanism, the gauge field is massive and propagates one mode.
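The decoupling of the self-dual equations into a single second order equation for f can be spelled out in a few lines. The sketch below takes the upper-sign first order equations in the form $f' = fg/r$ and $g'/r = -f^2(1 - f^2)/2\delta^2$; the second relation is what one obtains by combining the pure CS Gauss law $hf^2 = -(\delta/r)g'$ (Eq. (49) without the Maxwell terms) with $h = (1 - f^2)/2\delta$, and should be read as an assumption about normalization rather than as a quotation.

```latex
\begin{align*}
g &= \frac{r f'}{f}
  && \text{(from } f' = fg/r \text{)} \\
g' &= \frac{f'}{f} + \frac{r f''}{f} - \frac{r f'^2}{f^2} \\
\frac{g'}{r} &= -\frac{f^2 (1 - f^2)}{2\delta^2}
  && \text{(Gauss law with } h = (1 - f^2)/2\delta \text{)} \\
\Longrightarrow \quad
0 &= f'' + \frac{1}{r} f' - \frac{f'^2}{f} + \frac{f^3}{2\delta^2}\,(1 - f^2) .
\end{align*}
```

The lower-sign equations lead to the same second order equation, since they differ from the upper-sign ones only by $g \to -g$, $h \to -h$, under which the final equation is unchanged.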

Let us now discuss the most remarkable property of the self-dual Eqs. (72) and (73). In
particular, since the Higgs potential (69) has degenerate minima at $|\phi| = 0$ and $|\phi| = C_0$,
it turns out that at the self-dual point, one can simultaneously have both the topological and the
non-topological charged vortex solutions. It is worth pointing out that at the time of this discovery,
no other self-dual system was known which exhibited this remarkable property.

4.5 Topological Self-dual Solutions 

The topological, self dual charged vortex solutions satisfy the same boundary conditions as given 
by Eqs. (52) and (53) with ^ = /i(r = 0) = ±1/25^. Note that the upper (lower) sign corresponds 
to $n > 0$ ($< 0$). As a result, the flux $\Phi$, the Noether charge $Q$, and the angular momentum $J$ of 
these charged vortices are again as given by Eqs. (54), (56) and (59) respectively while the energy 
of these charged vortices is $\pi C_0^2 |n|$. From now onwards, we shall confine our discussion to the 
case of $n > 0$, i.e. those corresponding to the upper choice of sign. Solutions with $n < 0$ are related 
to these by the transformation g f 

A countably infinite number of sum rules have been derived [43] and, using the first two, it has 
been proved that the magnetic moment of the topological, self-dual charged n-vortex is given by 
[44] 

= 27rn(n 4“ 1) — . (75) 

e 

Note that for the neutral n-vortex, the magnetic moment equals the flux, $\Phi = 2\pi n/e$. 

No analytic topological self-dual charged vortex solution has been obtained as yet. However, 
one can show that all the fields approach their asymptotic values exponentially fast. It may be noted 
that at the self-dual point, the vector and the scalar meson masses are equal. Further, whereas 
for the Maxwell-CS case, the magnetic field is maximum at the core of the vortex ($r = 0$), for the 
pure CS vortices, the magnetic field is zero at the core of the vortex and is concentrated in a ring 
surrounding the vortex core. 

4.6 Non-topological Self-dual Solutions 

Since $|\phi| = 0$ as well as $|\phi| = C_0$ are degenerate minima of the Higgs potential (69), it 
turns out that one could also have non-topological self-dual charged vortex solutions [44, 45]. In 
this case, the finite energy considerations demand the following boundary conditions 





$f(r) = \eta$ , $g(r) = 0$ for $n = 0$ (78) 

where $\eta$ is an arbitrary number while $-\alpha$ ($+\alpha$) is for $n \geq 0$ ($< 0$). As a result, the flux, the charge, the 
energy and the angular momentum of these vortices (for $n \geq 0$) are 

$$\Phi = \frac{2\pi}{e}(n + \alpha) , \quad Q = \frac{2\pi\mu}{e}(n + \alpha) ,$$

$$J = \frac{\pi\mu}{e^2}(\alpha^2 - n^2) , \quad E = \pi C_0^2 (n + \alpha) . \qquad (79)$$

Note that unlike the topological case, the angular momentum is no longer equal to $-Q\Phi/4\pi$. Here $\alpha$ 
is a positive number, but how large is it? The finiteness of energy requires that $\alpha > 1$ but otherwise 
$\alpha$ seems to be completely arbitrary. However, this is not so, and we now show [46] that $\alpha$ satisfies a 
rigorous lower bound $\alpha > n + 2$. To this end, consider the self-dual Eq. (73). On integrating 
both sides of this equation and using boundary conditions (76) to (78), one obtains (for $n > 0$) 

~l = ^ + “ = rdrf{l-f)>0. (80) 

Similarly, on using Eqs. (72) and (73) we have on integration 

On integrating by parts and using the fact that and vanish as r 00 (note /(r) ~ 
with q: > 1 as r -4- 00 ), we then have 

1 1 

{a^ -n^) = — rdr{f - -/^) . (82) 

On combining the two sum rules, we then have 

(a -h n)(a - n - 2) = ^ J rdrf^ > 0 (83) 

which gives us a rigorous lower bound on $\alpha$, i.e. $\alpha > n + 2$. It turns out that this bound is never 
saturated in the relativistic case. However, as we shall see below, it is indeed saturated in the case 
of the non-relativistic self-dual non-topological charged vortices. It may be noted here that there 
is however no upper bound on $\alpha$. We thus conclude that the flux of the relativistic non-topological 
vortices must necessarily be greater than $4\pi(n + 1)/e$. More remarkable is the fact that whereas 
the angular momentum of the topological vortices is always negative and proportional to $n^2$, the 
angular momentum of the non-topological vortices, on the other hand, is necessarily positive and 
in general is not proportional to $n^2$. Further, the magnetic moment of the non-topological vortices 
has also been computed analytically by using the sum rules and shown to be negative [46] 


i^^ = «_(a + n)(a-n~l)<0. (84) 

e 

Note that the magnetic moment of the topological vortices is on the other hand always positive. 
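
The statements above about the flux bound and the sign of the angular momentum can be illustrated numerically. The short Python sketch below is only an illustration: it assumes that Eq. (79) reads $\Phi = 2\pi(n+\alpha)/e$, $Q = \mu\Phi$, $J = \pi\mu(\alpha^2 - n^2)/e^2$, $E = \pi C_0^2(n+\alpha)$, and that the topological vortex has $J = -Q\Phi/4\pi = -\pi\mu n^2/e^2$. The function names and the default values $e = \mu = C_0 = 1$ are hypothetical choices made here, not part of the original text.

```python
import math

def nontopological_vortex(n, alpha, e=1.0, mu=1.0, c0=1.0):
    """Flux, charge, angular momentum and energy of a non-topological vortex,
    using the assumed form of Eq. (79) quoted in the lead-in."""
    if alpha <= n + 2:
        raise ValueError("the sum rules require alpha > n + 2")
    flux = 2.0 * math.pi * (n + alpha) / e
    charge = mu * flux
    ang_mom = math.pi * mu * (alpha ** 2 - n ** 2) / e ** 2
    energy = math.pi * c0 ** 2 * (n + alpha)
    return flux, charge, ang_mom, energy

def topological_angular_momentum(n, e=1.0, mu=1.0):
    """J = -Q*Phi/(4*pi) for the topological n-vortex (assumed form)."""
    return -math.pi * mu * n ** 2 / e ** 2

def flux_lower_bound(n, e=1.0):
    """Lower bound 4*pi*(n+1)/e on the flux implied by alpha > n + 2."""
    return 4.0 * math.pi * (n + 1) / e

if __name__ == "__main__":
    for n, alpha in [(0, 2.5), (1, 3.5), (2, 6.0)]:
        flux, charge, ang_mom, energy = nontopological_vortex(n, alpha)
        print(n, alpha, flux > flux_lower_bound(n), ang_mom > 0.0)
```

For any admissible $\alpha$ the flux exceeds the bound and $J$ is positive, whereas the topological value is negative for every $n \neq 0$.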

Are these non-topological vortices stable or do they decay to the charged scalar meson? This 
question has been discussed [47] and it has been shown that as far as the decay to the scalar meson 
is concerned, these non-topological solitons are at the edge of their stability. In particular, using 
$E$ and $Q$ as given by Eq. (79) and noting that the mass $m$ of the scalar particle in the symmetric 
vacuum is $e^2 C_0^2/2\mu$, it follows that $E = mQ/e$. Thus the stability does not impose any upper 
bound on the charge of the non-topological soliton. No analytic solutions of Eqs. (72) and (73) 
have been obtained as yet in the non-topological self-dual case. However, the behavior of the fields 
near $r = 0$ and for large $r$ is easily obtained. In particular, using the boundary conditions (76) to 
(78), it is not difficult to show that for $r \to \infty$, the $n = 0$ vortex solution has the behavior 


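$$g(r) = -\alpha + \frac{G_0^2}{4(\alpha - 1)(r/\delta)^{2\alpha - 2}} + O\big((r/\delta)^{-4\alpha + 4}\big) \qquad (85)$$

$$f(r) = \frac{G_0}{(r/\delta)^{\alpha}} - \frac{G_0^3}{8(\alpha - 1)^2 (r/\delta)^{3\alpha - 2}} + O\big((r/\delta)^{-5\alpha + 4}\big) . \qquad (86)$$
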

On the other hand, as $r \to 0$, while $f(0)$ is not constrained, $g(0)$ must vanish so as to have a 
non-singular solution. Thus for the $n = 0$ non-topological vortex, the magnetic field ($-g'(r)/r$) 
is maximum at the core of the vortex ($r = 0$) and falls off as a power law as $r \to \infty$. 
Note, however, that the magnetic field for the topological CS vortices is zero at the core, and is 
maximum in a ring surrounding the core of the vortex. 

Finally, let us consider the behavior of the $n \neq 0$ (we as usual consider $n > 0$) non-topological 
self-dual charged vortex solutions. It is easily shown that these solutions are hybrids of the two 
previous cases, i.e. their large distance behavior is the same as that of the $n = 0$ non-topological 
charged vortex solutions as given by Eqs. (85) and (86). On the other hand, their short distance 
behavior is the same as that of the self-dual topological charged vortex solutions. Thus for $n \neq 0$ 
non-topological vortices, the magnetic field vanishes at the core of the vortex and falls off as a 
power law as $r \to \infty$. 

It is worth pointing out that since the $\phi^6$-potential as given by Eq. (69) has two disconnected 
but degenerate vacua at $|\phi| = 0$ and $|\phi| = C_0$, apart from the charged vortex solutions, 
these models also possess one dimensional domain wall solutions [45, 42]. 

So far, we have only discussed the self-dual rotationally symmetric CS vortices. However, the 
self-dual solutions can in fact be obtained even without choosing the rotationally symmetric n- 
vortex ansatz (45). Further, rigorous arguments have subsequently been given for the existence 
of the self-dual topological [48] and non-topological [49] charged vortex solutions even when the 
vortices are not superimposed on each other but lie at arbitrary positions in the plane. Let us 
note an interesting fact about the angular momentum of these charged vortices. For example, 
whereas the angular momentum of the n superimposed topological vortices is $n^2$ times that of 
a single vortex, the angular momentum of the n topological vortices (each of which has unit 
vorticity) which are well separated from each other, is only n times the angular momentum of the 
single vortex. However, the energy, flux and the charge of the n vortices in both the cases is the 
same. Thus we see that whereas the energy, flux and charge, are the global quantities, the angular 
momentum of a configuration depends on the local behavior. 

A zero-mode analysis of the spectrum of small fluctuations [45] around the self-dual vortices 
indicates that whereas the number of zero modes in the case of the topological self-dual vortices is 
2n, in the non-topological case, the same number is $2n + 2[\alpha]$ where $[\alpha]$ denotes the integer part 
of $\alpha$. In the topological case, this number is identified with the number of parameters required to 
describe the location of the n vortices while the counting is less clear in the non-topological case. 


4.7 Interaction Between Self-Dual CS Vortices 

The slow motion of the Abelian self-dual CS vortices has been analyzed [50] using Manton’s 
technique [51]. In this approach, one constructs an effective quantum mechanical Lagrangian 
(not density) which describes the fluctuations about the static self-dual classical configurations 
and not surprisingly, one obtains a statistical interaction term. Further one also obtains a term 
corresponding to the velocity dependent Magnus force. It turns out that this force is in fact 
necessary in order to have correct spin-statistics relation. 

Self-dual charged vortices have also been obtained in the original $\phi^4$-type model itself by adding 
a neutral scalar field to Eq. (43) and changing the $\phi^4$-potential suitably [53]. 

Finally, semi-local self-dual CS vortices have been obtained in an Abelian Higgs model with 
pure CS term [52] and with $SU(N)_{global} \times U(1)_{local}$ symmetry. The interesting point is that the 
semi-local vortices, even though topologically trivial, are stable under small perturbations due to 
the gradient energy term. 





4.8 Non-relativistic Chern-Simons Vortices 

Let us now discuss the non-relativistic limit of the Abelian Higgs model with the pure CS term. 
The Lagrangian density for the Abelian Higgs model with pure CS term is given by 

C = \iD,<l>riD>^4>) + _ ^ | 0 |2 (| ^ ,2 _^2)2 ^ (87^ 

where the Higgs potential is as given by Eq. (69). Here we write all the factors of the velocity of 
light c explicitly since we are considering the non-relativistic limit of a relativistic theory. Let us 
first note that the quadratic term in the Higgs potential defines the mass through its coefficient 
m^c^/2. Comparison with Eq. (87) shows that Cq must have the value Cq = (2 | /z | mc?)le^ so 
that the Lagrangian density (87) can be rewritten as 

The non-relativistic limit ($c \to \infty$) now proceeds in the standard manner. On writing the mode 
expansion of the scalar field $\phi$ as 





tj) + 


(89) 


and substituting it in Eq. (88), dropping all terms that either oscillate as $c \to \infty$ or are sub-leading 
in powers of c, the matter part of the Lagrangian density can be shown to be 

£ = irDoi, f . (») 

Here $\rho = \psi^*\psi$ is the matter density of particles and we have dropped the anti-particle part from 
the Lagrangian density (i.e. we are working in the zero anti-particle sector) by setting the anti-particle field to zero, since 
the particle and the anti-particle parts are separately conserved. The remarkable fact is that one 
now has an attractive quartic ($\rho^2$) self-interaction. This non-relativistic model can be looked upon 
either as a non-relativistic classical field theory or as a second quantized N-body problem with 
2-body attractive delta-function interaction. 

The Euler-Lagrange equations of motion which follow from the Lagrangian density (90) are 


1 

. tp ^ ; 

2m mc \ p\ 


\xp — iDo'tp = 0 


(91) 


F^u = (92) 

where $J^\mu = (\rho, \mathbf{j})$ is a Lorentz covariant notation for the conserved non-relativistic charge and 
current densities, i.e. 

p =1 ^2 (_ Jk ^ ■ (93) 

The field Eqs. (91) and (92) are together termed the planar gauged nonlinear Schrödinger 
equations. The gauge field Eq. (92) can also be re-expressed as 





From here, we immediately obtain the fundamental relation between the Noether charge $Q$ and 
the magnetic flux $\Phi$, i.e. $Q = \mu\Phi$. As in the relativistic case, it is easily checked that the second 
order field Eqs. (91) and (92) are solved by Eq. (94) and the self-dual ansatz 


Djip = ±iejkDk'>p 


in the case of the static solutions with Aq chosen as 


Ao = T 


e 

2miJLC 


■ 


Here we have made use of the following factorization identity 


DV = -D±-DTV>T^-Pi2^^- 


(96) 


(97) 


(98) 


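We now show that the self-dual Eqs. (94) and (96) can be solved completely and explicitly. On 
writing the complex field $\psi$ as $\psi = e^{ie\omega/c}\rho^{1/2}$, the self-duality Eq. (96) yields the vector potential 

$$A_i = \partial_i \omega \pm \frac{c}{2e}\,\epsilon_{ij}\,\partial_j \ln\rho \qquad (99)$$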

which is valid away from the zeros of $\rho$. Inserting this form of $A$ into the other self-dual Eq. 
(94) yields the famous Liouville equation 

$$\nabla^2 \ln\rho = -\frac{2e^2}{c|\mu|}\,\rho \qquad (100)$$


which is known to be integrable and completely solvable and which must be solved away from 
the zeros of $\rho$. It is worth noting that with our sign conventions, we have the Liouville equation 
with the correct sign in that only such an equation has real, positive, regular solutions. The most 
general such solution is known to be given by 

$$\rho = \frac{4c|\mu|}{e^2}\,\frac{|f'(z)|^2}{\left[1 + |f(z)|^2\right]^2} \qquad (101)$$

where $f(z)$ is any holomorphic function and $z = re^{i\theta}$. Explicit radially symmetric solutions may 
be obtained by taking $f(z) = (z/z_0)^n$. The corresponding self-dual charge density is 

$$\rho = \frac{4|\mu| n^2 c}{e^2 r_0^2}\,\frac{(r/r_0)^{2(n-1)}}{\left[1 + (r/r_0)^{2n}\right]^2} \qquad (102)$$


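which behaves like $r^{2(n-1)}$ as $r \to 0$ while as $r \to \infty$, it behaves like $r^{-2(n+1)}$. Thus $\rho$ is regular at 
the origin if $n \geq 1$. From Eq. (99) it then follows that as $r \to 0$, the vector potential behaves as 

$$A_i(r) \sim \partial_i \omega \pm \frac{c(n-1)}{e}\,\epsilon_{ij}\,\frac{x_j}{r^2} \qquad (103)$$

i.e. it is singular at $r = 0$. This singularity is removed if we choose $\omega = \pm c(n-1)\theta/e$. Thus the 
profile of the self-dual $\psi$ field is given by 

$$\psi = \frac{2n\sqrt{c|\mu|}}{e r_0}\,\frac{(r/r_0)^{n-1}}{\left[1 + (r/r_0)^{2n}\right]}\,e^{\pm i(n-1)\theta} . \qquad (104)$$
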


On requiring that $\psi$ be single valued, we then find that n must be an integer, and for $\rho$ to have 
decaying behavior as $r \to \infty$, we require that n must be positive. 
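
The claim that the radially symmetric density solves the Liouville equation can be checked numerically. The following Python sketch is an illustration only: it assumes the Liouville equation in the form $\nabla^2 \ln\rho = -K\rho$ with $K = 2e^2/(c|\mu|)$, so that the prefactor of the density is $8/K = 4c|\mu|/e^2$. The parameter `K`, the function names and the finite-difference step are choices made here.

```python
import math

def rho(r, n, r0=1.0, K=1.0):
    """Radial density (8/K) n^2/r0^2 (r/r0)^(2n-2) / [1 + (r/r0)^(2n)]^2."""
    s = r / r0
    return (8.0 / K) * n * n / r0 ** 2 * s ** (2 * (n - 1)) / (1.0 + s ** (2 * n)) ** 2

def radial_laplacian_of_log(n, r, r0=1.0, K=1.0, h=1e-4):
    """Central-difference estimate of (d^2/dr^2 + (1/r) d/dr) ln(rho)."""
    def g(s):
        return math.log(rho(s, n, r0, K))
    second = (g(r + h) - 2.0 * g(r) + g(r - h)) / h ** 2
    first = (g(r + h) - g(r - h)) / (2.0 * h)
    return second + first / r

def liouville_residual(n, r, r0=1.0, K=1.0):
    """Left-hand side minus right-hand side of Laplacian(ln rho) = -K rho."""
    return radial_laplacian_of_log(n, r, r0, K) + K * rho(r, n, r0, K)

if __name__ == "__main__":
    for n in (1, 2, 3):
        for r in (0.5, 1.0, 2.0):
            print(n, r, liouville_residual(n, r, r0=1.5, K=2.0))
```

The residual should vanish up to discretization error for every positive integer n and every $r > 0$, independently of the scale $r_0$.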

Several comments are in order at this stage. 





1. Integrating $\rho$ as given in (102) over all space yields $N$ (the total number of particles) and 
hence the flux (in view of Eq. (94)). We obtain $\Phi = 4\pi c n/e$ with $n = 1, 2, \ldots$ which 
means that this configuration carries an even number of flux units. This is in contrast to the 
relativistic case where the number of flux units need not be even. Further, note that unlike 
the relativistic non-topological case, here the lower bound on $\alpha$ ($> n + 2$) is saturated. As has 
been shown [55], this is because of the special inversion symmetry of the Liouville equation. 
In particular, notice that the Liouville equation is invariant under the transformations 

$$r \to 1/r , \quad \theta \to \theta , \quad \rho(r) \to \rho(1/r) = r^4 \rho(r) . \qquad (105)$$

As a result, the behavior of $\rho$ at infinity is uniquely determined by its behavior at the origin 
thereby fixing $\alpha = n + 2$. 

2. It is worth pointing out that $Q$, $\Phi$ and $J$ for the non-relativistic charged vortices are the same 
as those for the relativistic non-topological charged vortices as given by Eq. (79) provided 
one chooses $\alpha = n + 2$ (note that in the non-relativistic case, $n = 1, 2, \ldots$ while $n = 0, 1, 2, \ldots$ 
in the relativistic case). 

3. The radially symmetric solution (104) was obtained by choosing the holomorphic function 
$f(z) \propto z^n$ and corresponds to n solitons superimposed at the origin with common scale 
factor $r_0$. The most general solution corresponding to n separated solitons may be obtained 
by taking 

where 2n real parameters $z_i$ describe the location of the solitons and 2n real parameters $a_i$ 
correspond to the scales and the phases of the solitons. Thus the solution depends on 4n 
parameters. Using an index theory calculation [56] it has been shown that this is the most 
general solution. 
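
The particle number and the inversion property quoted in the comments above can also be checked numerically. The Python sketch below is an illustration under stated assumptions: the density is taken in units where $r_0 = 1$ with prefactor $8/K$, $K = 2e^2/(c|\mu|)$, so that the expected total number is $8\pi n/K = 4\pi c|\mu| n/e^2$. The substitution $r = t/(1-t)$ and the midpoint rule are choices made here.

```python
import math

def rho(r, n, K=1.0):
    """Radial self-dual density in units r0 = 1 (assumed normalization 8/K)."""
    return (8.0 / K) * n * n * r ** (2 * (n - 1)) / (1.0 + r ** (2 * n)) ** 2

def total_number(n, K=1.0, steps=20000):
    """Integral of rho over the plane, 2*pi * int_0^inf r rho(r) dr,
    evaluated with the midpoint rule after mapping r = t/(1-t), 0 < t < 1."""
    dt = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        r = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2
        total += 2.0 * math.pi * r * rho(r, n, K) * jacobian * dt
    return total

def inversion_mismatch(n, r, K=1.0):
    """Difference between rho(1/r) and r^4 rho(r); zero if Eq. (105) holds."""
    return rho(1.0 / r, n, K) - r ** 4 * rho(r, n, K)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, total_number(n), 8.0 * math.pi * n, inversion_mismatch(n, 1.7))
```

The integral grows linearly with n, which is the statement that the flux is quantized in equal steps, and the inversion mismatch vanishes to rounding accuracy.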

5 CP$^1$ Solitons With Hopf Term 

In this section we discuss the extended (neutral) anyon solutions in relativistic field theories. 
Historically, such solutions were first written down in the case of the O(3) $\sigma$-model with Hopf term 
in 2+1 dimensions [57]. Unfortunately, in this case, the Hopf term cannot be written down as a 
local function of the basic fields of the theory. Therefore, we shall discuss the essentially equivalent 
example of the CP$^1$ model with the Hopf term since in this case the Hopf term can be written 
down as a local function of the basic fields of the theory [58]. 

The action for the CP$^1$ model in 2+1 dimensions is given by 

$$I = \int d^3x\, (D_\mu z)^\dagger (D^\mu z) \qquad (107)$$

where $D_\mu z = (\partial_\mu - iA_\mu) z$ with $z = (z_1, z_2)$ being a complex vector fulfilling $|z|^2 = 1$. Note that 
$A_\mu$ here does not represent independent degrees of freedom, but is entirely determined in terms of 
$z(x)$ through the constraint equation 

$$A_\mu = -i z^\dagger \partial_\mu z . \qquad (108)$$

The action (107) is invariant under the local U(1) transformations 

$$z_a(x) \to z_a(x)\, e^{i\Lambda(x)} , \quad A_\mu(x) \to A_\mu(x) + \partial_\mu \Lambda(x) . \qquad (109)$$

As is well known, the CP$^1$ model admits self-dual soliton solutions. To obtain them, let us 
first note that the field equation is obtained by extremizing the action (107) with respect to $z(x)$ 
subject to the constraint $|z|^2 = 1$. This constraint is best introduced in the variational formalism 





by using a Lagrange multiplier, i.e. one extremizes $I + \int d^3x\, \lambda(x)(z^\dagger z - 1)$. The resulting field 
equation is 

$$(D_\mu D^\mu + \lambda) z = 0 . \qquad (110)$$

The Lagrange multiplier $\lambda(x)$ is eliminated by using $\lambda = \lambda z^\dagger z = -z^\dagger D_\mu D^\mu z$. Let us now consider 
the static solutions. In this case, the field equation (110) reduces to 

$$D_i D_i z - (z^\dagger D_j D_j z)\, z = 0 . \qquad (111)$$

The energy of a static solution as obtained from the action (107) is clearly 

$$E = \int (D_i z)^\dagger (D_i z)\, d^2x , \quad i = 1, 2 . \qquad (112)$$

Finiteness of energy requires that as $r = |x| \to \infty$, $D_i z = \partial_i z - iA_i z = 0$. 

Let us start from the topological inequality which follows from 

$$\left[(D_i z)^\dagger \pm i\epsilon_{ij}(D_j z)^\dagger\right] \cdot \left[D_i z \mp i\epsilon_{ik} D_k z\right] \geq 0 . \qquad (113)$$

Because of the constraint $|z|^2 = 1$, this inequality can be re-expressed in the form 

$$(D_i z)^\dagger \cdot (D_i z) \geq \pm i\epsilon_{ij}(D_i z)^\dagger \cdot (D_j z) \qquad (114)$$

so that the energy is bounded from below by the topological charge $Q$, i.e. $E \geq 2\pi |Q|$, where 

$$Q = -\frac{i}{2\pi} \int d^2x\, \epsilon_{ij}(D_i z)^\dagger \cdot (D_j z) . \qquad (115)$$

In any Q-sector, the energy reaches its minimum when the fields minimize the energy in that sector 
and satisfy the first order self-dual field equation 

$$D_i z = \pm i\epsilon_{ij} D_j z . \qquad (116)$$

Note that the solutions of Eq. (116) automatically solve the second order field Eq. (111) while the 
converse need not be true. 

The most general solution for $z$ can be written down in terms of an (anti) holomorphic function $\omega$ 



These solutions are characterized by the energy $E = 2\pi |Q|$ where $Q$ is as given by Eq. (115). 
One can in fact define a topological current 

$$J^\mu = -\frac{i}{2\pi}\, \epsilon^{\mu\nu\lambda} (D_\nu z)^\dagger (D_\lambda z) \qquad (118)$$

which is conserved by construction, and the topological charge $Q$ as given above is related to it 
by $Q = \int J^0 d^2x$. One can easily show that for the soliton solutions, $Q$ is just the winding number, 
i.e. $Q$ clearly describes the homotopy of the mapping $S^2 \to S^2$. 
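
The saturation of the bound $E \geq 2\pi|Q|$ by self-dual configurations can be checked pointwise. The Python sketch below is illustrative only. It assumes the standard parametrization $z = (1, w)/\sqrt{1 + |w|^2}$ with a holomorphic $w = (x + iy)^n$, which is a choice made here and not taken from the text, and it uses the charge density $-(i/2\pi)\epsilon_{ij}(D_i z)^\dagger (D_j z)$. For this parametrization the charge density is expected to equal $|w'|^2/[\pi(1 + |w|^2)^2]$ and the energy density to equal $2\pi$ times the charge density.

```python
import math

def z_field(x, y, n):
    """Assumed CP^1 parametrization z = (1, w)/sqrt(1+|w|^2), w = (x+iy)^n."""
    w = complex(x, y) ** n
    norm = math.sqrt(1.0 + abs(w) ** 2)
    return (complex(1.0 / norm), w / norm)

def inner(a, b):
    """Hermitian inner product a^dagger b of two-component complex vectors."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]

def covariant_derivatives(x, y, n, h=1e-5):
    """D_i z = d_i z - i A_i z with A_i = -i z^dagger d_i z (central differences)."""
    z = z_field(x, y, n)
    shifted = [(z_field(x + h, y, n), z_field(x - h, y, n)),
               (z_field(x, y + h, n), z_field(x, y - h, n))]
    result = []
    for plus, minus in shifted:
        dz = tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))
        a_i = -1j * inner(z, dz)
        result.append(tuple(d - 1j * a_i * zc for d, zc in zip(dz, z)))
    return result

def densities(x, y, n):
    """Energy density sum_i |D_i z|^2 and topological charge density."""
    d1, d2 = covariant_derivatives(x, y, n)
    energy = (inner(d1, d1) + inner(d2, d2)).real
    charge = (-1j / (2.0 * math.pi) * (inner(d1, d2) - inner(d2, d1))).real
    return energy, charge

def analytic_charge_density(x, y, n):
    """|w'|^2 / (pi (1+|w|^2)^2) for w = (x+iy)^n."""
    w = complex(x, y) ** n
    dw = n * complex(x, y) ** (n - 1)
    return abs(dw) ** 2 / (math.pi * (1.0 + abs(w) ** 2) ** 2)

if __name__ == "__main__":
    for n in (1, 2):
        e_density, q_density = densities(0.3, -0.7, n)
        print(n, e_density, 2.0 * math.pi * q_density, analytic_charge_density(0.3, -0.7, n))
```

Integrating the analytic charge density over the plane gives n, consistent with the interpretation of $Q$ as a winding number.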

Since $J^\mu$ is the topological conserved current, one can clearly add the following gauge 
invariant action 

Ih = J cPx-^A^Ji^ (119) 

to the original action (107). This action is nothing but the Hopf term which is related formally 
to the CS term since from Eqs. (108) and (118) it follows that 

$$A_\mu J^\mu = \frac{1}{2\pi}\, \epsilon^{\mu\nu\lambda} A_\mu \partial_\nu A_\lambda . \qquad (120)$$





Note however that here $A_\mu$ is not an independent gauge field but is entirely determined in terms 
of $z(x)$ through the constraint Eq. (108). As a result, unlike the CS term, the Hopf term is locally 
a total divergence and hence does not contribute to the equations of motion. 

Note that unlike the CS term, the Hopf term has no dynamics. Besides, for the CP$^1$ soliton 
solutions (which are time independent solutions of the equations of motion), the Hopf term is 
identically zero because of the time derivative and the relationship (108). Thus the way the Hopf 
term imparts fractional spin and statistics to the soliton is similar to that in quantum mechanics but 
it is very different from the way the CS term imparts fractional spin and statistics. In particular, 
since the Hopf density is a total divergence, the Hopf action can be expressed in terms of 
surface terms, namely two integrals at the initial and final times, so that in the path integral 
formalism, the contribution of this action is essentially in terms of the phases of the initial and the 
final wave functions. Since the configuration space in question is multi-connected, the Hopf action 
depends on the homotopy classes of the path and, therefore, the converted phases are multi-valued, 
which in turn gives rise to the fractional spin ($= \theta/2\pi$), and the solitons obey fractional statistics 
characterized by $\theta$ [57, 58]. 


6 Anyons as Elementary Field Quanta 

In this section we enquire whether one can construct local quantum field theories where the fundamental 
fields represent the creation and annihilation of anyons. Let us consider a complex bosonic 
non-relativistic matter field $\psi(x, t)$ of mass m (of course a similar discussion can also be done for 
the fermionic matter field). Let us minimally couple it to an Abelian gauge field $A_\mu$ with a CS 
kinetic term [5, 59] 


S = I <fx[iiP+DorP + + PDV- + (121) 

where ~ ieA^ is the covariant derivative. For simplicity, in this section we shall set 

c = 1. On varying the action with respect to we obtain 

= —J^ ( 122 ) 

where the current is explicitly given by 

- (^''V’)+^] • (123) 

Zmi 

Here $\rho$ and $\mathbf{J}$ are the number density and the current density operators respectively which satisfy 
the continuity equation $\partial_t \rho + \nabla \cdot \mathbf{J} = 0$. As seen in previous sections, Eq. (122) is a remarkable 
relation indicating that the CS field strength is completely determined by the particle current. 
Even more remarkable is the fact that the gauge potential $A_\mu$ itself is not an independent degree 
of freedom. 

Let us consider the $\mu = 0$ component of Eq. (122) 

J5 = -p (124) 

where $B = \nabla \times \mathbf{A}$ is the CS magnetic field. This equation is clearly the second quantized version 
of the Gauss law constraint obtained in the last two chapters (except that whereas in those cases 
$\rho$ was the charge density, here $\rho$ is the matter density, hence the extra factor of e in Eq. (124) 
compared to those cases). Now, in the Weyl gauge $A_0 = 0$. Hence, one can invert Eq. (124) 
without any ambiguity and solve for the vector potential $\mathbf{A}$. We obtain 

A\x) = j - y)p(y)) 


(125) 





where G is the two-dimensional Green function 

V^G(x-y) = <5(x-y) 


whose solution is well known to be 

G(x-y) = ^ln(p| x-y I) 


(126) 

(127) 


where p is an arbitrary scale. Thus j 4’ can be written as 


A^x) 


J d^ylnlx-y lp(y) 


dx^ [27r fjL , 


27r/i 


where (p is the winding (polar) angle i.e. 


^2 y2 

<^(x-y) = arctan(^j— ^). 


(128) 


(129) 


Note that while writing the second line of Eq. (128), we have used the Cauchy-Riemann equations 


d 




dx^ 


(130) 


It is worth pointing out that ^~jG(x— y) is ill-defined at x = y. Thus one has to supplement 
Eqs. (126) and (127) with a regularization prescription. One such prescription is 




C»J. 


:tG“(x) 


where the regulated Green function G^“) (x) is 

^ I x-y |)< 

This has the desired property that 

lim G^“>(x) = G(x) = ;^ In I X I 

a-~f0 27C 




(131) 


(132) 


(133) 


while for any a 

limeO— G(“)(x) = 0 
x->o dx^ 


(134). 


so that once Eq. (132) is systematically used, all ambiguities are eliminated. 

If one is now allowed to move the derivative operator outside the integral (128), then one could 
express $\mathbf{A}$ as a gradient. However, $\phi(x - y)$ is a multi-valued function. Hence one must first fix 
a branch-cut in the y-plane starting at x so as to make it single-valued. No matter what choice 
is made for this cut, the resulting range of integration of y will depend on x and hence extra 
contributions are produced in moving $\partial/\partial x^i$ outside the y integral. Thus, in general one cannot 
write 



j Sy<p{x-y)p{y) 


(135) 


so that in general $\mathbf{A}$ is not a pure gauge and hence it cannot be removed by a gauge transformation. 
However, in the special case when $\rho(y)$ is a sum of $\delta$-functions, $\mathbf{A}(x)$ is indeed a pure gauge. Such 
a situation arises in the case of non-relativistic localized point particles [59]. Let us assume that 
in the context of our non-relativistic model (121) too, $\rho(y)$ is a sum of $\delta$-functions in which case 
the CS gauge field is entirely determined by the matter configuration, i.e. $\rho$ and $\mathbf{J}$. 
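
The two ingredients of this argument, the Cauchy-Riemann relation between $\ln|x - y|$ and the polar angle, and the multi-valuedness of the angle, can be illustrated for a single point source at the origin. The Python sketch below is only an illustration: the sign convention $\epsilon_{12} = +1$ and the choice of the cut along the negative $x^1$-axis (the convention of `math.atan2`) are assumptions made here.

```python
import math

def log_distance(x1, x2):
    """ln|x| for a point source at the origin."""
    return math.log(math.hypot(x1, x2))

def angle(x1, x2):
    """Polar angle of x with the branch cut along the negative x1-axis."""
    return math.atan2(x2, x1)

def gradient(func, x1, x2, h=1e-6):
    """Central-difference gradient of a function of two variables."""
    d1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    d2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return d1, d2

def cauchy_riemann_mismatch(x1, x2):
    """Check d_1(angle) = -d_2(ln r) and d_2(angle) = +d_1(ln r) away from the cut."""
    dlog = gradient(log_distance, x1, x2)
    dphi = gradient(angle, x1, x2)
    return abs(dphi[0] + dlog[1]), abs(dphi[1] - dlog[0])

def jump_across_cut(x1=-1.0, eps=1e-9):
    """Discontinuity of the angle across the negative x1-axis."""
    return angle(x1, eps) - angle(x1, -eps)

if __name__ == "__main__":
    print(cauchy_riemann_mismatch(0.4, 1.3))
    print(jump_across_cut(), 2.0 * math.pi)
```

Away from the cut the curl-type derivative of the logarithm is a pure gradient of the angle, while the $2\pi$ jump across the cut is what prevents the angle from being a globally single-valued gauge function.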





Thus, in the case of localized densities, $A_\mu(x) = -\partial_\mu \Lambda(x)$, i.e. the CS field is a pure gauge and 
hence it can be removed by the gauge transformation $A_\mu \to A'_\mu = A_\mu + \partial_\mu \Lambda = 0$. Thus, under 
such a singular transformation, covariant derivatives turn into ordinary derivatives, and the action 
(121) becomes 

*?' = y (fx -i- dl)^} (136) 

where the new matter field ^ is defined as 

^(x) = , ^+(x) = . (137) 

The above action (136) is that of a free, complex, non-relativistic, scalar field However, we now 
show that such a field does not obey the conventional commutation relations ais satisfied by 

We can quantize the action (121) by imposing the equal-time commutation relations for the 
bosonic field $\psi$ 

$$[\psi(x,t), \psi^\dagger(y,t)] = \delta(x - y) \qquad (138)$$

$$[\psi(x,t), \psi(y,t)] = 0 = [\psi^\dagger(x,t), \psi^\dagger(y,t)] . \qquad (139)$$

Since the gauge field $\mathbf{A}$ is a function of the number density operator $\rho$ ($= \psi^\dagger\psi$), the 
commutator of $\mathbf{A}$ and $\psi$ is not trivial. In fact using Eqs. (128) and (138) we obtain 

[A‘(x, t), ii(y, 0] = - y)V'(y) . (140) 

On using the regularized Green function as given by Eq. (132), it then follows by using Eq. 
(134) that $[A^i(x,t), \psi(x,t)] = 0$. This is interesting because it means that there are no ordering 
ambiguities in the quantum theory as given by Eq. (121). 

One can now show that when ^ obeys ordinary commutation relations, ^ obeys 

^(x,t)t^(y,t) = e^’"'"^(y,t)0(x,t) (141) 

i.e. the matter field ^ obeys anyonic commutation relations of statistics a (= ^ jl'Kiji). If instead, 
we make a cut along the negative x'-axis, then we would obtain a phase factor (e'“*’^^), opposite 
to that in Eq. (141). Proceeding in the same way, it is easily shown that if x y then 

^(x,t)^+(y,t) =e”"^^^‘^(y,t)^(x,t). (142) 

It must however be noted that for x = y, the phase proportional to a vanishes and hence the 
canonical commutation relations remain unchanged. 

Some clarification is called for at this stage. What one has shown is that the fields ^(x, t), '^(y, t) 
satisfy anyonic commutation relations with the phase factor or depending on how we 
make the cut. However, this is not enough. What is really required is that the phase of the wave 
function changes both by $+\pi\alpha$ and $-\pi\alpha$ in response to which way we braid in interchanging x and 
y. No one has been able to show this so far. In fact, what we have shown above is the best that 
one can achieve with local operators $\psi$, $\psi^\dagger$. Local information, like initial and final positions of 
particles, is simply not sufficient to code the braiding, where we also have to specify which way 
the particles passed around each other in interchanging their positions. As I see it, the only way 
to take care of this problem in this formalism is to choose such a definition of the multi-valued 
function $\phi$ which will make $\psi$ a non-local operator. 

Summarizing, it appears that within the non-relativistic field theory formalism, anyons can 
only be described by non-local operators, which are hard to deal with. If one insists on a local 
formulation, then one has to hide the statistics in an interaction with a CS field. 

There is no doubt that ideally the various effects of fractional spin, such as the spin-statistics 
theorem should be understood only in a full fledged relativistic quantum field theory. However, 
relatively little is known in this respect. The point is, if the fundamental fields are to carry 
fractional spin, they must carry a multi-valued irreducible representation of SO(2,1). This is 





because a rotation of $2\pi$ does not leave the wave function invariant, but rather multiplies it 
by a phase. We then have the following two options. 

The first option is that we define infinite component fields and from them construct one particle 
dynamics by imposing equations of motion that satisfy the requirement that one-particle states 
provide multi-valued Poincare equations. The most difficult part is the derivation of an action that 
reproduces these equations of motion. This requires handling a nonlocal theory and no one really 
knows how to quantize such a theory. 

The second option is to work with multi-valued fields by adding the CS term to the action 
and essentially repeat what we have done above for the non-relativistic case. Thus, instead of the 
non-relativistic model (121), one could consider a relativistic field theory, say a complex scalar 
field theory, coupled to an Abelian gauge field with a CS kinetic energy term (and no Maxwell 
term). Coming back to complex fields, one again wants to know if one can construct local quantum 
field theory where the fundamental fields represent the creation and annihilation of anyons. On 
proceeding exactly as in the nonrelativistic case, one again obtains Eq. (124). However, now the 
particles are not point particles but are extended objects, hence p{y) cannot be a sum of delta 
functions. Thus it is not possible to write A as a pure gauge and hence it cannot be removed by 
a gauge transformation. Thus, it is not at all clear whether in the relativistic case the only effect 
of the gauge field is to endow the particle with arbitrary spin or if residual interactions are also 
present. A similar problem also arises in models which emerge from the relativistic theory in the 
non-relativistic limit. In particular, one obtains different results depending on which limit is taken 
first, i.e. the size of the extended object going to zero vis-a-vis the regulator parameter going to 
zero. Attempts have been made to tackle these problems by quantizing the theory with CS term 
on a lattice with or without the Maxwell term. So far, these attempts have met with only limited 
success. 

Thus it is fair to say that, so far we do not have a model in relativistic local quantum field 
theory where the fundamental (non-interacting) field quanta are themselves anyons. In fact it 
appears unlikely that one can obtain a simple, local (relativistic) Lagrangian for anyons. This is 
because, even in 2+1 dimensions, spin has to be an integer or half-integer for local fields. On 
the other hand, fractional spin is admissible for fields which carry charges associated with gauge 
symmetries (with accompanying flux integrals at infinity) which are typically localizable only in 
space-like cones [32, 30]. This is what happens, for example, when one generates fractional spin by 
coupling point particles to a CS gauge field which has non-trivial long-ranged properties. 


References 

[1] E.A. Abbott, Flatland (Princeton Univ. Press, New Edition, 1991). 

[2] C.H. Hinton, An Episode of Flatland (1907). 

[3] M. Gardner, in The Unexpected Hanging And Other Mathematical Diversions, ed. M. Gardner 
(Simon and Schuster 1969). 

[4] A.K. Dewdney, Two Dimensional Science and Technology, J. Recreat. Math. 12 (1979) 16; 
For a short summary of the book see M. Gardner, Sci. Am., July Issue (1980) 18. 

[5] For the details see for example, A. Khare, Fractional Statistics and Quantum Theory (World 
Scientific, Singapore, 1997). 

[6] R.B. Laughlin, Phys. Rev. Lett. 50 (1983) 1395. 

[7] For a popular readable account see A. Khurana, Phys. Today 43(1) (1990) 19. 

[8] J.M. Leinaas and J. Myrheim, Nuovo Cim. B37 (1977) 1. 

[9] R. Mirman, Nuovo Cim. B18 (1973) 110. 

[10] F. Haldane, Phys. Rev. Lett. 67 (1991) 937. 





[11] For a detailed review of various properties see for example, A. Khare, Fort. der. Physik 38 
(1990) 507 ; Proc. Indian Natn. Sc. Acad. A61 (1995) 161. 

[12] W. Siegel, Nucl. Phys. B156 (1979) 135; J. Schonfeld, Nucl. Phys. B185 (1981) 157; R. 
Jackiw and S. Templeton, Phys. Rev. D23 (1981) 2291; C.R. Hagen, Ann. Phys. (N.Y.) 157 
(1984) 342. 

[13] S. Deser, R. Jackiw and S. Templeton, Phys. Rev. Lett. 48 (1982) 975; Ann. of Phys. 140 
(1982) 372. 

[14] S. Coleman and B. Hill, Phys. Lett. B159 (1985) 184. 

[15] A. Khare, R.B. MacKenzie and M.B. Paranjape, Phys. Lett. B343 (1995) 239. 

[16] C. R. Hagen, P.K. Panigrahi and S. Ramaswami, Phys. Rev. Lett. 61 (1988) 389. 

[17] L.D. Landau and E.M. Lifshitz, Electrodynamics of Continuous Media, Second Edition (Perg- 
amon Press, Oxford 1963). 

[18] T.H. O’Dell, The Electrodynamics of Magneto-Electric Media (North-Holland, Amsterdam, 
1970). 

[19] A. Khare and T. Pradhan, Phys. Lett. B231 (1989) 178. 

[20] S.K. Paul and A. Khare, Phys. Lett. B193 (1987) 253, B196 (1987) E571. 

[21] R.D. Pisarski and S. Rao, Phys. Rev. D32 (1985) 2081. 

[22] G. Giavarini, C.P. Martin and F. Ruiz Ruiz, Nucl. Phys. B381 (1992) 222. 

[23] A. Khare, R.B. MacKenzie, P.K. Panigrahi and M.B. Paranjape, Phys. Lett. B355 (1995) 
236; L. Chen, G. Dunne, K. Haller and E. Lim-Lombridas, Phys. Lett. B348 (1995) 468. 

[24] A.N. Redlich, Phys. Rev. Lett. 52 (1984) 18 ; Phys. Rev. D29 (1984) 2366; A. J. Niemi and 
G.W. Semenoff, Phys. Rev. Lett. 51 (1983) 2077. 

[25] A.S. Schwarz, Lett. Math. Phys. 2 (1978) 247; E. Witten, Comm. Math. Phys. 121 (1989) 
351. 

[26] A.A. Abrikosov, Sovt. Phys. JETP 5 (1957) 1174. 

[27] H.B. Nielsen and P. Olesen, Nucl. Phys. B61 (1973) 45. 

[28] B. Julia and A. Zee, Phys. Rev. D11 (1975) 2227. 

[29] S.K. Paul and A. Khare, Phys. Lett. B174 (1986) 420; B177 (1986) E453. 

[30] J. Frohlich and P.A. Marchetti, Comm. Math. Phys. 121 (1989) 177. 

[31] E. Witten, Phys. Lett. B86 (1979) 283. 

[32] D. Buchholz and K. Fredenhagen, Comm. Math. Phys. 84 (1982) 1. 

[33] 

[34] L. Jacobs, A. Khare, C.N. Kumar and S.K. Paul, Int. J. Mod. Phys. A6 (1991) 3441. 

[35] S.K. Paul and A. Khare, Phys. Lett. B171 (1986) 244. 

[36] H.J. de Vega and F.A. Schaposnik, Phys. Rev. Lett. 56 (1986) 2564; Phys. Rev. D34 (1986) 
3206; C.N. Kumar and A. Khare, Phys. Lett. B178 (1986) 395, B182 (1986) E415; Phys. 
Rev. Lett. 59 (1987) 377; Phys. Rev. D36 (1987) 3253. 





[37] D.P. Jatkar and A. Khare, Phys. Lett. B236 (1990) 283. 

[38] J. Hong, Y. Kim and P.Y. Pac, Phys. Rev. Lett. 64 (1990) 2230; R. Jackiw and E.J. Weinberg, 
Phys. Rev. Lett. 64 (1990) 2234. 

[39] S. Deser and Z. Yang, Mod. Phys. Lett. A4 (1989) 2123. 

[40] P. di Vecchia and S. Ferrara, Nucl. Phys. B130 (1977) 93. 

[41] C. Lee, K. Lee and E.J. Weinberg, Phys. Lett. B243 (1990) 105. 

[42] S.N. Behera and A. Khare, Pramana (J. Phys., India) 15 (1980) 245. 

[43] A. Khare, Phys. Lett. B277 (1992) 123. 

[44] A. Khare, Phys. Lett. B255 (1991) 393. 

[45] R. Jackiw, K. Lee and E.J. Weinberg, Phys. Rev. D42 (1990) 3488. 

[46] A. Khare, Phys. Lett. B263 (1991) 227. 

[47] D.P. Jatkar and A. Khare, J. Phys. A24 (1991) L1201; D. Bazeia, Phys. Rev. D43 (1991) 
4074. 

[48] R. Wang, Comm. Math. Phys. 137 (1991) 587. 

[49] J. Spruck and Y. Yang, Comm. Math. Phys. 149 (1992) 361. 

[50] S.K. Kim and H. Min, Phys. Lett. B281 (1992) 81. 

[51] N. Manton, Phys. Lett. B110 (1982) 54; Phys. Lett. B154 (1985) 397. 

[52] A. Khare, Phys. Rev. D46 (1992) R 2287. 

[53] C. Lee, K. Lee and H. Min, Phys. Lett. B252 (1990) 79. 

[54] R. Jackiw and S-Y. Pi, Phys. Rev. Lett. 64 (1990) 2969 ; Phys. Rev. D42 (1990) 3500. 

[55] S.K. Kim, W. Namgung, K.S. Soh and J.H. Yee, Phys. Rev. D46 (1992) 1882. 

[56] S.K. Kim, K.S. Soh and J.H. Yee, Phys. Rev. D42 (1990) 4139. 

[57] F. Wilczek and A. Zee, Phys. Rev. Lett. 51 (1983) 2250. 

[58] Y.-S. Wu and A. Zee, Phys. Lett. B147 (1984) 325; A.M. Din and W.J. Zakrzewski, Phys. Lett. B146 
(1984) 341. 

[59] See for example, A. Lerda, Anyons: Quantum Mechanics of Particles with Fractional Statistics, 
Lecture Notes in Phys. m14 (Springer-Verlag, Berlin 1992) and references therein. 



27. Chern Simons Field and Composite Bosons in 
the Quantum Hall system 


R. Rajaraman * 

School of Physical Sciences Jawaharlal Nehru University 

New Delhi 110067, India 


Abstract 

This is a brief pedagogical review of the use of the Chern Simons vector field and the 
associated singular gauge transformations in the theory of the Quantum Hall Effect. We will 
primarily deal with the Fermion to Boson transformation to construct an order parameter 
field to characterise the Hall plateau states. The canonical exact Hamiltonian of this field is 
derived. The construction is generalised to include certain crucial non-unitary transformations 
which yield the well known and very successful Laughlin wavefunction as the Mean Field ground 
state of the condensate. The adaptation of the method to Jain’s composite fermion theory is 
also briefly mentioned. 


1 Introduction 

The Quantum Hall System (by this we refer to quasi-two-dimensional layers of electrons trapped 
in the interface of semiconductors, at very high magnetic fields and very low temperatures) has 
revealed many remarkable features. The best known of these is the occurrence of very special 
states at filling fractions $\nu$ (defined as $\nu = \rho hc/eB$, where $\rho$ is the mean electron density and $B$ the applied 
magnetic field) which are integers or certain odd denominator fractions. The special states at these 
filling fractions $\nu_i$ lead to remarkably flat plateaus in Hall conductivity for a range of fillings around 
these values, with the conductivity (in units of $e^2/h$) taking values exactly equal to $\nu_i$ to an 
extremely high accuracy! Associated with these plateaus in Hall conductivity is also a very steep drop in diagonal 
resistivity. These remarkable features are very universal in that they do not seem to depend on the 
details of the material, the extent (within some limits) and nature of the impurities etc. It was 
recognised that the electrons in these special Quantum Hall (QH) states form an incompressible 
fluid, and Laughlin proposed a very simple set of trial wavefunctions to describe the ground states 
and low-lying excitations of these systems [1]. For some reviews of these phenomena and related 
basic theory, see Prange and Girvin [2], MacDonald [3], Karlhede et al [4], and Stone [5]. 

That these special QH states correspond to some form of an ordering was first elucidated by 
Girvin and MacDonald [6] who showed that the Laughlin wavefunctions exhibit a non-trivial form 
of off-diagonal long range order. More precisely, they showed that there exists a composite operator 
of the fundamental Fermi fields that obeys Bose statistics and whose off-diagonal density matrix is 
algebraically long ranged in the Laughlin states [7]. This was an extremely important observation 
for it opened up the possibility of a Landau-Ginzburg description of the QHE in terms of an order 
parameter field, thereby bringing it within reach of more systematic computations. 

In the phenomenon of superconductivity, the bosonic order parameter field is constructed from 
a product of two electron operators, reflecting the physics that the phenomenon arises from the 
condensation of Cooper pairs. No such pairing seems to be indicated in the physics of the QH 
system. However, in two space dimensions there is another way of constructing Bosons out of 
Fermions, and that is by using the “anyon” or Chern Simons (CS) transformation, for the special 
case when the anyon angle is an odd multiple of $\pi$ [8] [9]. Physically this amounts to making a 
composite of the electron with an odd number of flux tubes, which then behaves like a Boson under 
exchange. It turns out that this particular way of constructing a Boson does correctly yield the 
field that seems to be getting ordered in the plateaus of the QH system, as anticipated by Girvin 
and MacDonald [6]. 

* Email: doug@jnuniv.ernet.in 

The bulk of this review will be devoted to the quantum field theoretic procedure for constructing 
such Quantum Hall order parameter fields and describing their dynamics. Before we do that 
however, it will be useful to recapitulate the Anyon-Chern Simons construction at the first quantised 
level, which we describe in the next section. 


2 CS Interaction at the First Quantised Level 

The mechanism of altering the effective Statistics of particles in two dimensions by multiplying 
them with appropriate phase factors, or equivalently making them interact with a statistical CS 
field, has been well known for about two decades (for reviews see Wilczek [10] and Sumathi Rao 
[11]). Therefore we will not dwell on the details of the basic ideas and move on to its application 
to the QH problem, following the work of Zhang, Hansson and Kivelson [12]. 

Consider a wavefunction of two electrons in two dimensions, antisymmetric under the exchange 
of the particles: 

$$\psi(\vec r_1, \vec r_2) = -\psi(\vec r_2, \vec r_1) \qquad (1)$$

Define a new function by 

$$\phi(\vec r_1, \vec r_2) = e^{-im\theta_{12}}\, \psi(\vec r_1, \vec r_2) \qquad (2)$$

where $\theta_{12}$ is the angle (on the two dimensional plane) of the vector $\vec r_{12} = \vec r_1 - \vec r_2$. Since under 
the exchange $1 \leftrightarrow 2$, $\theta_{12} \to (\theta_{12} + \pi)$, the function $\phi$ picks up a factor $-e^{-im\pi}$; this amounts, for an 
arbitrary value of $m$, to fractional or Anyon statistics. When $m$ is an odd integer, $\phi$ corresponds 
to a pair of Bosons, while for even-integer values of $m$ it continues to correspond to Fermions. For 
the present let us take $m$ to be an odd integer, so that $\phi(\vec r_1, \vec r_2)$ is a bosonic wavefunction. 
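
As a quick numerical illustration of this exchange property (a sketch added here, not part of the original argument), one can attach the phase factor of Eq. (2) to a toy antisymmetric function and look at the ratio $\phi(\vec r_2, \vec r_1)/\phi(\vec r_1, \vec r_2)$. The particular `psi` and all function names below are hypothetical choices made only for illustration.

```python
import cmath
import math


def theta(r_a, r_b):
    """Angle in the plane of the relative vector r_a - r_b."""
    return math.atan2(r_a[1] - r_b[1], r_a[0] - r_b[0])


def psi(r1, r2):
    """Toy antisymmetric two-particle function (illustrative assumption)."""
    z1, z2 = complex(*r1), complex(*r2)
    return (z1 - z2) * math.exp(-(abs(z1) ** 2 + abs(z2) ** 2) / 4.0)


def phi(r1, r2, m):
    """phi = exp(-i m theta_12) * psi, the transformation of Eq. (2)."""
    return cmath.exp(-1j * m * theta(r1, r2)) * psi(r1, r2)


def exchange_factor(r1, r2, m):
    """Ratio phi(r2, r1) / phi(r1, r2): +1 is bosonic, -1 is fermionic."""
    return phi(r2, r1, m) / phi(r1, r2, m)
```

For integer $m$ the ratio should come out as $-(-1)^m$, that is $+1$ for odd $m$ and $-1$ for even $m$, independent of the positions chosen.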

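One can clearly generalise this to N particles. If $\theta_{ij}$ is the angle of the vector $\vec r_{ij} = \vec r_i - \vec r_j$ and 
$\psi(\vec r_1, \vec r_2, \ldots, \vec r_N)$ represents the wavefunction of N fermions, then, 

$$\phi(\vec r_1, \vec r_2, \ldots, \vec r_N) = \exp\Big(-im \sum_{i<j} \theta_{ij}\Big)\, \psi(\vec r_1, \vec r_2, \ldots, \vec r_N) \qquad (3)$$

will, for odd $m$, be symmetrical under the exchange of any pair $(ij)$. 

This phase factor will clearly induce an extra interaction in the Hamiltonian governing the 
function $\phi$ as compared to the one governing $\psi$. Let the fermionic wavefunction $\psi$ be governed by 
a Hamiltonian 

$$H_f = \sum_{i=1}^{N} \Big[ -\frac{\hbar^2}{2\mu}\, \vec D_i^{\,2} + e A_0(\vec x_i) \Big] + \sum_{i<j} V(r_{ij}) \qquad (4)$$

where $\mu$ is the mass, $A_0$ is some external electrostatic field, say, due to impurities, $V(r_{ij})$ is the 
interparticle interaction and $\vec D = \nabla - i\frac{e}{\hbar c}\vec A$. Then the bosonic function $\phi$ will be governed by 

$$H_b = \sum_{i} \Big[ -\frac{\hbar^2}{2\mu} \Big( \vec D_i - i\frac{e}{\hbar c}\, \vec a(\vec r_i) \Big)^2 + e A_0(\vec x_i) \Big] + \sum_{i<j} V(r_{ij}) \qquad (5)$$

where the statistical gauge field $\vec a$ is given by the gradient of the phase: 

$$\vec a(\vec r_i) = -\frac{\hbar c}{e}\, m \sum_{j \neq i} \nabla_i\, \theta_{ij} \qquad (6)$$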





One can check that the vector field $\vec a$ is transverse 

$$\nabla \cdot \vec a = 0 \qquad (7)$$

and obeys 

$$\mathrm{Curl}\; \vec a(\vec r) = \epsilon^{ij} \partial_i a_j = -m\, (hc/e)\, \rho(\vec r) \qquad (8)$$

where $\rho(\vec r)$, the density, is just 

$$\rho(\vec r) = \sum_i \delta(\vec r - \vec r_i) \qquad (9)$$

Note that even though (3) looks like an innocuous gauge transformation, it is singular. The associated 
“gauge” field, the Chern Simons vector field defined in (6), will yield a non-zero magnetic field, 
as is evident from eq. (8). 
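
Eq. (8) says that each particle carries $-m$ flux quanta of the statistical field. A rough numerical check is sketched below, assuming a single particle at the origin and $\vec a = -(\hbar c/e)\, m\, \nabla\theta$; the function name, the loop geometry and the step count are illustrative choices, not taken from the text.

```python
import math


def enclosed_flux(m, center=(0.2, -0.1), radius=1.0, n_steps=4000):
    """Flux of the statistical field through a circular loop, in units of
    the flux quantum hc/e, for one particle sitting at the origin.

    Assumes a = -(hbar c / e) * m * grad(theta), so the circulation of a
    divided by hc/e equals -m * (winding of theta) / (2 pi).
    """
    total = 0.0
    for step in range(n_steps):
        t0 = 2.0 * math.pi * step / n_steps
        t1 = 2.0 * math.pi * (step + 1) / n_steps
        x0 = center[0] + radius * math.cos(t0)
        y0 = center[1] + radius * math.sin(t0)
        x1 = center[0] + radius * math.cos(t1)
        y1 = center[1] + radius * math.sin(t1)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)  # midpoint of the segment
        r2 = xm * xm + ym * ym
        # grad(theta) = (-y, x) / r^2 away from the origin
        total += (-ym * (x1 - x0) + xm * (y1 - y0)) / r2
    return -m * total / (2.0 * math.pi)
```

A loop that encloses the particle should return approximately $-m$, while a loop that misses it (for example `center=(3.0, 0.0)`) should return approximately zero, which is the delta-function character of the right hand side of (8).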

In short, one can construct a bosonic (symmetric) wavefunction from a starting fermionic one 
at the cost of adding an interaction with a statistical gauge field $\vec a$. Note that with the constraints 
(7) and (8), $\vec a$ is not an independent field. It is fully determined by the sources at $\vec r_i$. This vector 
field $\vec a$ has come to be called the Chern Simons field because the constraint (8) can be encoded into 
the Lagrangian formulation by using terms in the Action involving the Chern Simons 3-form: 

$$S_{cs} = \int d^3x\, \Big[ \kappa\, \epsilon^{\mu\sigma\lambda} a_\mu \partial_\sigma a_\lambda - e\, a_0\, \rho \Big] \qquad (10)$$

where the constant $\kappa$ is fixed, in terms of $m$, by requiring that variation with respect to $a_0$ 
reproduce the constraint (8), and $\mu$, $\sigma$ and $\lambda$ vary between 0, 1 and 2. 

Notice from the definition of the filling fraction $\nu$ that the external magnetic field obeys 

$$B = \mathrm{curl}\, \vec A = \frac{hc\, \bar\rho}{e\, \nu} \qquad (11)$$

which may be compared to eq. (8) for Curl $\vec a$. When the filling fraction happens to be $\nu = 1/m$, where 
$m$ is the odd integer used in the CS transformation (3), we can see that the CS field constraint (8) 
in the mean-field approximation $\rho(\vec r) = \bar\rho$ becomes 

$$\mathrm{curl}\, \vec a = -\frac{m\, hc}{e}\, \bar\rho = -\mathrm{curl}\, \vec A \qquad (12)$$

while 

$$\nabla \cdot \vec a = 0 = -\nabla \cdot \vec A \qquad (13)$$

Hence we can set $\vec a = -\vec A$ in the Mean Field (MF) approximation. This in turn means that in 
the bosonic Hamiltonian (5), the covariant derivative becomes the ordinary gradient in the MF 
limit. Then, in the absence of external electrostatic fields, and with inter-particle interactions taken to be 
cancelled in the MF limit by a neutralising background (i.e. take $A_0 = 0 = V(r_{ij})$), the bosonic 
Hamiltonian has a ground state solution 

$$\phi(\vec r_1, \ldots, \vec r_N) = \text{constant} \qquad (14)$$

which can be interpreted as a translationally invariant condensate of zero-momentum particles. 

Thus by using the phase transformation (3) and the associated interaction with the CS field $\vec a$, one 
has not only obtained a bosonic wavefunction, but also a ground state where this wavefunction is a 
constant in space (a condensate of the bosons)! The corresponding wavefunction for the electrons 
in this condensate ground state, obtained by inverting the CS phase transformation (3), will be 

$$\psi(\vec r_1, \ldots, \vec r_N) = \exp\Big( im \sum_{i<j} \theta_{ij} \Big) \qquad (15)$$

At this stage we recall that, long before the above developments, we had the famous Laughlin 
wavefunction [1] which describes to great accuracy the QH ground state at filling fractions $\nu = 1/m$ 
with $m$ odd. This wavefunction is: 

$$\psi_L(\vec r_1, \vec r_2, \ldots, \vec r_N) = \prod_{i<j} (z_i - z_j)^m \exp\Big( -\sum_i \frac{|z_i|^2}{4 l^2} \Big) \qquad (16)$$

where $z_j = x_j + i y_j$ is the complex coordinate associated with the point $\vec r_j$ and $l = \sqrt{\hbar c / eB}$ is the magnetic 
length. If we compare our mean field solution (15) with this Laughlin wavefunction we see that 
the former is precisely the phase of the latter. 
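
The statement that the mean field solution is the phase of the Laughlin wavefunction can be checked numerically. The sketch below assumes the standard polynomial part $\prod_{i<j}(z_i - z_j)^m$ of the Laughlin state (the real, positive Gaussian factor does not affect the phase and is omitted); the function names and the sample configuration are illustrative only.

```python
import cmath


def laughlin_polynomial(zs, m):
    """Product over i<j of (z_i - z_j)**m (Gaussian factor omitted)."""
    value = 1.0 + 0.0j
    for i in range(len(zs)):
        for j in range(i + 1, len(zs)):
            value *= (zs[i] - zs[j]) ** m
    return value


def mean_field_phase(zs, m):
    """exp(i m sum_{i<j} theta_ij), with theta_ij the angle of z_i - z_j."""
    total_angle = 0.0
    for i in range(len(zs)):
        for j in range(i + 1, len(zs)):
            total_angle += cmath.phase(zs[i] - zs[j])
    return cmath.exp(1j * m * total_angle)


def phase_mismatch(zs, m):
    """Distance between the unit-modulus part of the polynomial and the
    mean field phase factor; it should vanish up to rounding."""
    poly = laughlin_polynomial(zs, m)
    return abs(poly / abs(poly) - mean_field_phase(zs, m))
```

What the mean field phase misses is exactly `abs(laughlin_polynomial(zs, m))`, which vanishes whenever two coordinates coincide.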

That the CS method also produces the phase of the Laughlin wavefunction is another indication 
that the electron-CS-flux composite (the boson above) represents a good bit of the correct physics 
of the QH system at fractions $\nu = 1/m$. 

However, the modulus of the Laughlin wavefunction $\psi_L$ contains its all-important zeroes $|z_i - 
z_j|^m$ when any two electrons approach each other, crucial for reducing the Coulomb energy between 
them. And the gaussian factors, also part of the modulus of $\psi_L$, are vital for its normalisability. 
Of course Zhang [13] did show that inclusion of fluctuations around the mean field solutions in the 
Random Field Approximation ends up reproducing the full Laughlin wavefunction including its 
modulus. But that means that fluctuations are not producing small corrections to the MF result. 
They are doing considerable violence to the MF wavefunction by scooping out holes in it when 
two particles approach one another, apart from introducing strong damping (gaussian) factors 
when any coordinate $z_i$ becomes large. It would be nice if one could generalise the Fermion-Boson 
transformation in some way so that the entire Laughlin wavefunction (16) (and not just its 
phase) comes out already in the Mean Field limit. 

The kernel of such a possibility lies in noting that eq. (3) is not the only way to generate 
symmetric wavefunctions from antisymmetric ones. We can multiply the right hand side by any 
symmetric function of the coordinates and $\phi$ will still be symmetric. One could see if the choice 
of a suitable symmetric pre-factor can lead to the full Laughlin wavefunction corresponding to the 
MF condensate. Such a procedure was set up at the quantum field theoretical level by the present 
author and Sondhi [14]. We will discuss that in the next section. 

3 The Fermion-Boson Transformation in QFT 

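Let $\Psi(\vec x)$ be the full electron field operator obeying the equal-time anticommutation relations, 

$$\{\Psi(\vec x), \Psi^\dagger(\vec x')\} = \delta^2(\vec x - \vec x') \qquad (17)$$

The second quantized field theoretic Hamiltonian that describes our system is, 

$$H = \int d^2x\; \Psi^\dagger(\vec x) \Big[ -\frac{\hbar^2}{2\mu}\, \vec D^2 + e A_0 \Big] \Psi(\vec x) + \frac{1}{2} \int\!\!\int d^2x\, d^2x'\; \delta\rho(\vec x)\, V(\vec x - \vec x')\, \delta\rho(\vec x') . \qquad (18)$$

Recall that here $\vec D = \nabla - i\frac{e}{\hbar c}\vec A$, with $\vec A$ being the vector potential of the uniform field ($B = \nabla \times \vec A(\vec x)$), 
and $\rho(\vec x) = \Psi^\dagger(\vec x)\Psi(\vec x)$ is the density operator whose deviation from its mean value $\bar\rho$ is $\delta\rho(\vec x)$. 

In order to recast this as a bosonic problem we need a set of canonical bosonic operators which 
we construct in the next sub-section. 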

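3.1 Bosonic Operators 

Consider the pair of operators $\Phi(\vec x)$ and $\Pi(\vec x)$, defined by 

$$\Phi(\vec x) = e^{-J(\vec x)}\, \Psi(\vec x)$$

$$\Pi(\vec x) = \Psi^\dagger(\vec x)\, e^{J(\vec x)} \qquad (19)$$

where, 

$$J(\vec x) = m \int d^2x'\, \big[ \rho(\vec x') \log(z - z') \big] - \frac{|z|^2}{4 l^2}, \qquad (20)$$

and $m$ is an odd integer. 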





It is clear that in the field theoretic version of the purely phase transformation (3) used by 
ZHK [12], $J$ would be chosen to contain only the phase of $(z - z')$, i.e. $i\,\mathrm{Im} \log(z - z')$ in place of 
the $\log(z - z')$ in our definition (20), and with no $|z|^2$ term. Consequently their $\Pi = \Phi^\dagger$. The field 
theoretic description of the ZHK work can be read off from the equations below by making these 
replacements. We will proceed using the generalised definition (20). 

The Bosonic operator $\Phi$ in (19) was in fact introduced first by Read [15], long before ref. [14], as 
the appropriate order parameter field for the Quantum Hall ground states. He also proceeded to write 
the classical field equation obeyed by the expectation value of this Bose field under the Lowest 
Landau Level and other approximations. Here we adopt Read’s operator, but describe a full-fledged 
quantum field theory for it at the operator level without any approximations in the first instance. 
As a first step we have defined the field $\Pi$ in (19) as the canonical conjugate of $\Phi$. Eventually we 
will write down the quantum field theory (18) entirely in terms of Bose fields by changing variables from 
$\Psi$ and $\Psi^\dagger$ to $\Phi$ and $\Pi$. 

Evidently, $\Phi$ and $\Pi$ are not hermitian conjugates as $J$ has both hermitian and anti-hermitian 
pieces; in fact, 

$$\Pi(\vec x) = \Phi^\dagger(\vec x)\, e^{J(\vec x) + J^\dagger(\vec x)} . \qquad (21)$$

Nevertheless, as we now show, they are canonically conjugate Bose fields. 

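To this end note that the only operator appearing in $J(\vec x)$ is the electron density $\rho(\vec x) = 
\Psi^\dagger(\vec x)\Psi(\vec x)$, which obeys the commutation relation, 

$$[\rho(\vec x), \Psi(\vec x')] = -\delta^2(\vec x - \vec x')\, \Psi(\vec x') . \qquad (22)$$

Using this one can obtain the following identities: 

$$e^{J(\vec x)}\, \Psi(\vec x') = (z - z')^{-m}\, \Psi(\vec x')\, e^{J(\vec x)}$$

$$e^{J(\vec x)}\, \Psi^\dagger(\vec x') = (z - z')^{m}\, \Psi^\dagger(\vec x')\, e^{J(\vec x)} . \qquad (23)$$

It is then straightforward to verify using (23) that 

$$[\Phi(\vec x), \Phi(\vec x')] = [\Pi(\vec x), \Pi(\vec x')] = 0 \qquad (24)$$

while 

$$[\Phi(\vec x), \Pi(\vec x')] = \delta^2(\vec x - \vec x') . \qquad (25)$$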
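
For readers who want to see one of these commutators worked out, here is a sketch of the argument for $[\Phi, \Phi] = 0$ at $\vec x \neq \vec x'$. It assumes the definition $\Phi = e^{-J}\Psi$, the standard commutator $[\rho(\vec x), \Psi(\vec x')] = -\delta^2(\vec x - \vec x')\Psi(\vec x')$, and odd $m$; the intermediate steps are a reconstruction added here rather than the author's own presentation.

```latex
\begin{align*}
[J(\vec x'), \Psi(\vec x)] &= -m \log(z' - z)\, \Psi(\vec x)
  \;\;\Rightarrow\;\;
  \Psi(\vec x)\, e^{-J(\vec x')} = (z' - z)^{-m}\, e^{-J(\vec x')}\, \Psi(\vec x) \\
\Phi(\vec x)\, \Phi(\vec x') &= e^{-J(\vec x)}\, \Psi(\vec x)\, e^{-J(\vec x')}\, \Psi(\vec x')
  = (z' - z)^{-m}\, e^{-J(\vec x) - J(\vec x')}\, \Psi(\vec x)\, \Psi(\vec x') \\
\Phi(\vec x')\, \Phi(\vec x) &= (z - z')^{-m}\, e^{-J(\vec x) - J(\vec x')}\, \Psi(\vec x')\, \Psi(\vec x) \\
  &= \big[ -(z' - z)^{-m} \big]\, e^{-J(\vec x) - J(\vec x')}\, \big[ -\Psi(\vec x)\, \Psi(\vec x') \big]
  = \Phi(\vec x)\, \Phi(\vec x')
\end{align*}
```

The two minus signs, one from $(z - z')^{-m} = -(z' - z)^{-m}$ for odd $m$ and one from the Fermi anticommutation of $\Psi$, cancel. For even $m$ the first sign is absent and the same steps give an anticommutator instead.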

Thus, despite the presence of non-unitary factors in their definition in Eq. (19), the fields $\Phi$ and 
$\Pi$ form a pair of mutually canonical Bose fields. However, in contrast to standard charged scalar 
field theories, here $\Pi$ is not equal to $\Phi^\dagger$; instead they obey the more complicated relation in Eq. 
(21). This fact, a consequence of the non-unitary transformation in (19), has to be borne in mind 
in doing manipulations with our theory. 

Nevertheless, notice that the fermion density $\rho$, when written in terms of $\Pi$ and $\Phi$, still has the 
standard bosonic form 

$$\rho(\vec x) = \Psi^\dagger(\vec x)\, \Psi(\vec x) = \Pi(\vec x)\, \Phi(\vec x) . \qquad (26)$$

Thus, if $N = \int d^2x\, \rho$ is the number operator, then 

$$[N, \Pi(\vec x)] = \Pi(\vec x), \qquad (27)$$

i.e. the operator $\Pi(\vec x)$ creates one extra composite boson, and the number of composite bosons is 
the same as the number of the original fermions. 

3.2 The Hamiltonian 

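Consider the action of the covariant derivative on the electron field. We have, 

$$\begin{aligned}
\vec D\, \Psi(\vec x) &= \vec D \big( e^{J(\vec x)}\, \Phi(\vec x) \big) \\
&= \Big( \nabla - \frac{ie}{\hbar c} \vec A \Big) \big( e^{J(\vec x)}\, \Phi(\vec x) \big) \\
&= e^{J(\vec x)} \Big( \nabla - \frac{ie}{\hbar c} \vec A + \nabla J(\vec x) \Big) \Phi(\vec x) \\
&= e^{J(\vec x)} \Big( \vec D - \frac{ie}{\hbar c}\, \vec v(\vec x) \Big) \Phi(\vec x)
\end{aligned} \qquad (28)$$

where 

$$\vec v(\vec x) = i\, \frac{\hbar c}{e}\, \nabla J(\vec x) . \qquad (29)$$

Hence, 

$$\vec D^2\, \Psi = e^{J} \Big( \vec D - \frac{ie}{\hbar c}\, \vec v \Big)^2 \Phi \qquad (30)$$

Inserting this into the starting Hamiltonian (18), and using Eqs. (19) and (26) we get, 

$$H = \int d^2x\; \Pi(\vec x) \Big[ -\frac{\hbar^2}{2\mu} \Big( \nabla - \frac{ie}{\hbar c} (\vec A + \vec v) \Big)^2 + e A_0 \Big] \Phi(\vec x) 
+ \frac{1}{2} \int\!\!\int d^2x\, d^2x'\; \delta\rho(\vec x)\, V(\vec x - \vec x')\, \delta\rho(\vec x') \qquad (31)$$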

This Hamiltonian, the auxiliary definitions (26) and (29) and the commutators (24, 25) together 
define a purely bosonic problem that is fully equivalent to our original fermion problem. 

The vector field $\vec v$ appearing in (31) above is constrained in terms of the density by Eq. (29), 
where $J(\vec x)$ is defined in (20). Since this $J(\vec x)$ involves more than just the phase of $(z - z')$, this 
field $\vec v$ is not the field theoretic version of the statistical Chern-Simons gauge field $\vec a$ used in the last 
section. Because $J(\vec x)$ has real parts, $\vec v$ is a complex vector field. However, we will see now that $\vec v$ 
is simply related to $\vec a$. 

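In the field theoretic formulation the Chern-Simons field is defined as 

$$\vec a(\vec x) = -\frac{m \phi_0}{2\pi} \int d^2x'\; \rho(\vec x')\, \nabla_x\, \mathrm{Im} \log(z - z') , \qquad (32)$$

or equivalently 

$$b = \nabla \times \vec a = -m \phi_0\, \rho \qquad (33)$$

where $\phi_0 = hc/e$ is the flux quantum. Now, the function $\log z$ obeys the Cauchy-Riemann conditions 
away from $z = 0$, which can be written as 

$$\nabla (\mathrm{Re} \log z) = \nabla (\mathrm{Im} \log z) \times \hat k \qquad (34)$$

where $\hat k$ is a unit vector perpendicular to the plane. Using this we get, 

$$\begin{aligned}
\vec v(\vec x) &= \frac{i \hbar c}{e}\, \nabla J(\vec x) \\
&= \frac{i \hbar c}{e}\, \nabla_x \Big\{ m \int d^2x'\, \big[ \rho(\vec x') \big( \mathrm{Re} \log(z - z') + i\, \mathrm{Im} \log(z - z') \big) \big] - \frac{|z|^2}{4 l^2} \Big\} \\
&= \vec a(\vec x) + i\, \hat k \times \vec a(\vec x) - i\, \frac{c \hbar}{e}\, \frac{\vec x}{2 l^2} .
\end{aligned} \qquad (35)$$

Note that the last term in the above equation is just a c-number term involving the coordinate 
vector $\vec x$. The density dependent operator part of $\vec v$ is present entirely through $\vec a$. 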
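
The vector form (34) of the Cauchy-Riemann conditions, on which the step from $\nabla J$ to $\vec a + i \hat k \times \vec a$ rests, can be checked by finite differences. This is only an illustrative sketch: the step size `h`, the helper names and the sample points are choices made here, and the points must stay away from the branch cut of `cmath.log` on the negative real axis.

```python
import cmath


def gradient(f, x, y, h=1e-6):
    """Central-difference gradient of a scalar function f(x, y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))


def re_log(x, y):
    return cmath.log(complex(x, y)).real


def im_log(x, y):
    return cmath.log(complex(x, y)).imag


def cross_k(u):
    """u x k for an in-plane vector u and the unit normal k."""
    return (u[1], -u[0])


def cauchy_riemann_residual(x, y):
    """Largest component of grad(Re log z) - grad(Im log z) x k."""
    lhs = gradient(re_log, x, y)
    rhs = cross_k(gradient(im_log, x, y))
    return max(abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))
```

The residual should be zero up to discretisation and rounding error at any point away from the origin and the branch cut.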


4 The Chern Simons Action and its Field Equations 

The constraint (33) relating $\vec a$ to the density $\rho$ and the transversality condition can be implemented 
by the usual device of introducing Lagrange multiplier fields in the action formalism. This action, 
as mentioned already, involves the Chern Simons 3-form $a\,da$ (in form notation). In detail, the 
Action is 

dt H (36) 




where $H$ is the bosonized Hamiltonian in (31), and $a_\mu$ ($\mu = 0, 1, 2$) is the 3-vector $(a_0, \vec a)$. 

There is, however, a subtlety in this procedure which does not arise in ZHK’s construction and 
has to do with the gauge invariance of the resulting action and hence the freedom to pick gauges 
different from transverse gauge. First, note that the action is manifestly invariant with respect to 
gauge changes of the external field, i.e. the transformations, 

$$\vec A \to \vec A - \frac{\hbar c}{e}\, \nabla \Lambda(\vec x, t)$$

$$A_0 \to A_0 + \frac{\hbar}{e}\, \partial_t \Lambda(\vec x, t)$$

$$\Pi \to e^{+i\Lambda(\vec x, t)}\, \Pi , \qquad \Phi \to e^{-i\Lambda(\vec x, t)}\, \Phi \qquad (37)$$

However, the invariance with respect to gauge changes of the Chern-Simons field is more restricted. 
Gauge transformations of the form (37) with $(A_0, \vec A)$ replaced by $(a_0, \vec a)$ leave the action invariant 
only if $\Lambda(\vec x, t)$ is independent of $\vec x$, i.e. if they do not involve the spatial gauge field at all. In 
addition, there is a class of modified gauge transformations for the spatial components that do 
not involve the temporal component of the gauge field and have the following form. Let $f(z) = 
u(\vec x) + i w(\vec x)$ be an analytic function of $z$. Then the action is invariant under, 


$$\vec a \to \vec a - \frac{\hbar c}{e}\, \nabla w(\vec x)$$

$$\Pi \to e^{f}\, \Pi , \qquad \Phi \to e^{-f}\, \Phi \qquad (38)$$

(The variation of the gauge field implies that $J \to J + f$, which ensures that the constraint Eq. (21) 
is preserved.) The implications of this feature of our theory, in particular the significance of the 
modified gauge invariance (38) and its possible connection to work on $W_\infty$ algebras [16], remain a 
subject for future work. 

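The field equations arising from the Action (36) are a “non-linear Schrodinger equation” [17], 

$$\big( i\hbar \partial_t - e (a_0 + A_0) \big) \Phi(\vec x) = \Big[ -\frac{\hbar^2}{2\mu} \Big( \nabla - \frac{ie}{\hbar c} \Big( \vec A + \vec a + i \hat k \times \vec a - i\, \frac{c\hbar}{e}\, \frac{\vec x}{2 l^2} \Big) \Big)^2 
+ \int d^2x'\; V(\vec x - \vec x')\, \delta\rho(\vec x') \Big] \Phi(\vec x) , \qquad (39)$$

along with the modified Chern-Simons field-current identities, 

$$\nabla \times \vec a = -m \phi_0\, \Pi \Phi \qquad (40)$$

$$\nabla \cdot \vec a = 0 \qquad (41)$$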

^a—Vaoj = 

= 4 X J + VA) 

c ^ 

(42) 

where, 

-t _ n 

^ 2/ii 

n(o$)-(pn) f] 

(43) 


- ie / r ^ .T ^ \ 

(44) 


Although this current J does not look manifestly hermitian, it is in fact just the usual hermitian 
electron current operator, as can be verified by rewriting it in terms of the Fermi fields. 





4.1 Exact classical “Ground State” Solution 

These field equations have a simple exact solution for the situation where the external electric 
potential $A_0$ is absent and the uniform magnetic field $B = \nabla \times \vec A$ is chosen so that the filling 
fraction is 

$$\nu = \frac{1}{m} \qquad (45)$$

where $m$ is the odd integer in the fermion to boson transformation function $J$ defined in (20). The 
solution describes a homogeneous state and is given by the fields, 

$$\Phi(\vec x) = \Pi(\vec x) = \sqrt{\bar\rho}$$

$$\vec a_{\bar\rho}(\vec x) = \frac{m \phi_0 \bar\rho}{2}\, (\vec x \times \hat k)$$

$$a_0 = 0 = \lambda . \qquad (46)$$

In order to verify that this is indeed a solution, we begin by noting that the density $\rho = \Pi\Phi$ equals 
its mean value $\bar\rho$ everywhere. Hence $\delta\rho$ is zero, removing the last term in (39). It follows that a 
constant $\Phi$ solves (39) provided the gauge fields that enter the covariant derivatives vanish. For 
the temporal gauge field this is trivially true. For the spatial gauge field we note that, 

$$\vec a_{\bar\rho}(\vec x) = -\frac{m \phi_0 \bar\rho}{2}\, \hat k \times \vec x 
= -\frac{1}{2} B\, \hat k \times \vec x 
= -\vec A , \qquad (47)$$

and hence 

$$\vec a_{\bar\rho} + \vec A = 0 \qquad (48)$$

This condition for picking out uniform states, that the Chern-Simons field at mean density 
cancels the external field, is already known from [12]. But the statistical gauge field appearing 
in the covariant derivative in our Lagrangian and field equation (36 and 39) is not just $\vec A + \vec a$. It 
also contains imaginary pieces. However, we also have the additional result that 

$$\hat k \times \vec a_{\bar\rho} = \frac{c\hbar}{e}\, \frac{\vec x}{2 l^2} \qquad (49)$$

This last equality tells us that the extra imaginary pieces of the statistical gauge field, i.e. the 
third and fourth terms in Eq. (35), also cancel one another. Altogether, we have, for $\rho = \bar\rho$, 

$$\vec A + \vec a_{\bar\rho} + i\, \hat k \times \vec a_{\bar\rho} - i\, \frac{c\hbar}{e}\, \frac{\vec x}{2 l^2} = 0 \qquad (50)$$

and hence the forms (46) satisfy Eq. (39). 

It is also straightforward to verify that the forms (46) solve the field-current identities given above. 
Finally, readers concerned about the consistency of the solutions for $\Phi$ and $\Pi$ should note that for 
our solutions $J + J^\dagger = 0$. 
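
The cancellation of the real and imaginary parts of the gauge field at mean density can be checked with a few lines of arithmetic. The sketch below assumes the symmetric gauge $\vec A = \frac{1}{2} B\, \hat k \times \vec x$, the mean field value $\vec a = -\vec A$, and $l^2 = \hbar c / eB$; the function names and the unit choice (`hbar_c_over_e` as a free parameter) are illustrative assumptions.

```python
def k_cross(u):
    """k x u for an in-plane vector u and the unit normal k."""
    return (-u[1], u[0])


def mean_field_gauge_sum(x, y, B=1.0, hbar_c_over_e=1.0):
    """Return A + a + i k x a - i (hbar c / e) x / (2 l^2) at the point (x, y),
    with A in symmetric gauge and a = -A (the mean field value)."""
    l_squared = hbar_c_over_e / B                     # magnetic length squared
    A = tuple(0.5 * B * c for c in k_cross((x, y)))   # A = (B/2) k x x
    a = (-A[0], -A[1])
    k_x_a = k_cross(a)
    pos = (x, y)
    return tuple(
        A[n] + a[n] + 1j * k_x_a[n]
        - 1j * hbar_c_over_e * pos[n] / (2.0 * l_squared)
        for n in range(2)
    )
```

Both components should vanish at every point, for any positive `B` and `hbar_c_over_e`, so that the covariant derivative acting on the constant field reduces to an ordinary gradient.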


5 The Mean Field Wavefunction 

The classical solution (46) obtained in the previous section contains the boson field $\Phi$ which is space-time 
independent, has a uniform phase and a non-vanishing amplitude everywhere. This solution 
can be viewed as the ground state expectation value of an ideal condensate of the composite bosons. 

We now show that the N-particle projection of this condensate ground state is, in the first quantized 
fermionic representation, exactly the Laughlin state. To see this note from Eq. (27) that in our 
bosonized formulation, an N-particle state is obtained by the action of N powers of $\Pi$ on the 
vacuum. Hence the translationally invariant ground state, where all the bosons have condensed 
into the $\vec k = 0$ mode, has the (arbitrarily normalized) form, 

$$|N\rangle_{MF} = \Big[ \int d^2x\; \Pi(\vec x) \Big]^N |0\rangle \qquad (51)$$


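where $|0\rangle$ is the no-particle (vacuum) state, and $N, V \to \infty$ with $N/V = \bar\rho$. The first quantized 
electron wavefunction associated with this state is 

$$\begin{aligned}
\psi_{MF}(\vec y_1, \vec y_2, \ldots, \vec y_N) &= \langle 0 |\, \Psi(\vec y_1) \cdots \Psi(\vec y_N)\, | N \rangle_{MF} \\
&= \langle 0 |\, e^{J(\vec y_1)} \Phi(\vec y_1) \cdots e^{J(\vec y_N)} \Phi(\vec y_N) \\
&\quad \times \int d^2x_1\, \Pi(\vec x_1) \int d^2x_2\, \Pi(\vec x_2) \cdots \int d^2x_N\, \Pi(\vec x_N)\, | 0 \rangle .
\end{aligned} \qquad (52)$$

Now we can use the identity (valid up to an overall sign, which is immaterial for the arbitrarily 
normalized state), 

$$\Phi(\vec y_i)\, e^{J(\vec y_j)} = e^{J(\vec y_j)}\, \Phi(\vec y_i)\, (z_i - z_j)^m , \qquad (53)$$

to move all the factors of $e^{J}$ to the left, which yields 

$$\psi_{MF}(\vec y_1, \vec y_2, \ldots, \vec y_N) = \prod_{i<j} (z_i - z_j)^m\; \langle 0 |\, e^{\sum_i J(\vec y_i)}\, \Phi(\vec y_1) \cdots \Phi(\vec y_N) 
\int d^2x_1\, \Pi(\vec x_1) \int d^2x_2\, \Pi(\vec x_2) \cdots \int d^2x_N\, \Pi(\vec x_N)\, | 0 \rangle \qquad (54)$$

Next we use Wick’s theorem for the product of the $\Phi$’s and $\Pi$’s and note that the former annihilate 
the vacuum to obtain 

$$\psi_{MF}(\vec y_1, \vec y_2, \ldots, \vec y_N) \propto \prod_{i<j} (z_i - z_j)^m\; \langle 0 |\, e^{\sum_i J(\vec y_i)}\, | 0 \rangle . \qquad (55)$$

Finally, as the vacuum has no particles, only the gaussian factor in $J$ contributes and we have the 
result, 

$$\psi_{MF}(\vec y_1, \vec y_2, \ldots, \vec y_N) \propto \prod_{i<j} (z_i - z_j)^m \exp\Big( -\sum_i \frac{|z_i|^2}{4 l^2} \Big) \qquad (56)$$

Thus the mean field state directly yields the complete Laughlin wavefunction. This is in contrast 
to the bosonized theory of ZHK, where the mean field state contains only the correct phase of the 
Laughlin wavefunction and not its zeroes or the gaussian factor, until fluctuations are included. 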


6 Further Developments 

The main theme of this review , as indicated in the title, was the Chern Simons method for 
constructing bosons which are then shown to condense in the ground state at fillings $\nu = 1/m$ with 
odd m. We will conclude by briefly mentioning some closely related further developments, with 
references where fuller details may be found. 




6.1 Composite Fermions 

The bosons discussed above can be viewed as a composite of the original electron and an odd 
number ($m$) of fluxons. These are sometimes called composite bosons. It is evident from the way 
the composite operators were constructed in eq. (19) and their commutation property studied in 
eq. (24) that if the integer $m$ in (20) were even instead of odd, the composite operators would 
obey canonical anti-commutation relations instead of commutation relations. We would thus have 
composite fermions instead of composite bosons when $m$ is even. 

While such composite fermions would clearly be of no relevance as candidates for an order 
parameter, they turn out to be very useful in a different way. It may be recalled that about 
10 years ago Jain [18] offered an understanding of fractional Quantum Hall plateaus based on certain 
composite fermions made of an even number of fluxons and electrons. His theory addressed Hall 
plateaus at electron fillings $\nu = \frac{p}{2kp + 1}$, where $p$ and $k$ are integers. He related these states to Quantum 
Hall states of his composite fermions at integer fillings $p$. He also argued that the wavefunction 
$\psi_\nu(z_1, z_2, \ldots, z_N)$ of the fractional filling electron states can be related to the composite fermion 
wavefunction $\phi_p(z_1, z_2, \ldots, z_N)$ by the equation 

$$\psi_\nu(z_1, z_2, \ldots, z_N) = \prod_{i<j} (z_i - z_j)^{2k}\, \phi_p(z_1, z_2, \ldots, z_N) \qquad (57)$$

It was shown by Lopez and Fradkin [19] that these ideas, which Jain introduced through 
powerful intuition and physical arguments, can also be systematically derived using the Chern 
Simons transformation we have discussed here, with an even value of the integer $m = 2k$. Once 
again Lopez and Fradkin used only the pure phase transformation involving the relative angle, 
akin to the ZHK work, and obtained, at the mean field level, only the phase of the Jain prefactors. 
However, a transformation of the kind described here, using the full exponent $J(\vec x)$ as in eq. (20), 
was done by the present author [20], which yields all of Jain’s prefactors in the relation (57). (See 
also Wu and Yu [21]). 

A generalisation of the composite fermion picture for Quantum Hall systems in double layers 
was also done by Lopez and Fradkin [22] and also by the author [20]. 
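
As a small worked example of the filling fractions involved, the sketch below tabulates the principal Jain sequence $\nu = p/(2kp + 1)$, where $p$ is the number of filled composite fermion levels and $2k = m$ is the even number of fluxons attached to each electron. The formula is the standard one for this sequence; the function name is a hypothetical choice.

```python
from fractions import Fraction


def jain_filling(p, k):
    """Electron filling fraction for p filled composite-fermion levels
    with 2k fluxons attached to each electron (principal Jain sequence)."""
    return Fraction(p, 2 * k * p + 1)


def jain_sequence(k, p_max):
    """Fillings for p = 1, ..., p_max at fixed k."""
    return [jain_filling(p, k) for p in range(1, p_max + 1)]
```

For $k = 1$ this gives 1/3, 2/5, 3/7 and so on, all with odd denominators, and the sequence approaches 1/2 as $p$ grows.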

6.2 Filling Factor 1/2 

The case $\nu = 1/2$ does not correspond to one of the Hall plateaus. But at this filling many other 
interesting things happen. This case has been studied extensively by different competing 
groups using different approaches. See for instance Murthy and Shankar [23], Halperin et al 
[24], and references therein. We can examine this system using our Chern Simons transformation. 
Suppose the transformation (19) were employed with the integer $m = 2$ at filling $\nu = 1/2$. Then, 
on the one hand the new composite field $\Phi$ would be a Fermi field, and at the same time in the 
mean field approximation the external electromagnetic potential would be cancelled by the Chern 
Simons field exactly as in eq. (48). The composite fermions would then be free fermions, and could be 
described by a Fermi sea Slater determinant of plane wave states. The wavefunction of the system 
in electronic coordinates can then be written in this Mean Field approximation by multiplying the 
Fermi Sea wavefunction (FS) by the prefactor for $m = 2$ to give 

$$\psi_{1/2} = \prod_{i<j} (z_i - z_j)^2\, (FS) \qquad (58)$$

Such a wavefunction was first proposed by Rezayi and Read [25]. 

6.3 Fluctuations 

Our main theme in this article was the setting up of a quantum field theory for a suitably defined 
order parameter field for the quantum Hall effect. The field theory admits mean-field states at 
the fractions $\nu = 1/m$ that are ideal condensates in the Bose language and correspond exactly to 
the Laughlin states in terms of the electrons. In order to treat fluctuations about these states and 
to calculate the spectra of the various collective modes it is necessary to perturb about the mean 
field Hamiltonian. In our formulation, the mean-field Hamiltonian has the simple form, 

$$H_{MF} = \int d^2x\; \Pi(\vec x) \Big( -\frac{\hbar^2}{2\mu} \nabla^2 \Big) \Phi(\vec x) \qquad (59)$$

and hence its eigenstates are all known exactly. Nevertheless, $H_{MF}$ is non-hermitian and hence 
states with different energies are not necessarily orthogonal. (The full Hamiltonian is perfectly 
hermitian; however the mean-field theory dictates that we decompose it as the sum of two 
non-hermitian pieces.) This requires, then, that the perturbation theory explicitly take account of 
the non-orthogonality and that we possess tractable expressions for the overlaps between different 
states. For some work in this direction see Wu and Yu [21]. 

It is a pleasure to thank Dr S.L. Sondhi for collaboration and numerous illuminating discussions 
on this subject. 


References 

[1] R. B. Laughlin, Phys. Rev. Lett. 50, 1395 (1983). 

[2] “Quantum Hall Effect”, edited by R.E. Prange and S.M. Girvin, Springer (New York), (1990). 

[3] “The Quantum Hall Effect: A Perspective”, edited by A.H. MacDonald, Kluwer (Boston), 
(1989). 

[4] A. Karlhede, S.A. Kivelson and S.L. Sondhi, “The Quantum Hall Effect - The Article”, Lectures 
at the 9th Jerusalem Winter School on Theoretical Physics, (1992). 

[5] “Quantum Hall Effect” (Ed. M. Stone), World Scientific, Singapore, (1992). 

[6] S. M. Girvin and A. H. MacDonald, Phys. Rev. Lett. 58, 1252 (1987). 

[7] Or rather, there is a set of operators, each element of which is long ranged in a different state. 
For linguistic simplicity we generally refer to all the elements of the set in the singular, e.g. 
“the bosonic operator”. 

[8] J.M. Leinaas and J. Myrheim, Il Nuovo Cimento, 37, 1, (1977). 

[9] F. Wilczek, Phys. Rev. Lett. 49, 957, (1982). 

[10] Frank Wilczek, “Fractional Statistics and Anyon Superconductivity”, World Scientific 
(Singapore), (1990). 

[11] Sumathi Rao, “Anyon Primer”, in “Models and Techniques of Statistical Physics”, M. Bhattacharya 
(Ed.), Narosa Publishers (New Delhi), (1995); TIFR preprint TIFR/TH/92-18; 
hep-th/9209066. 

[12] S. C. Zhang, T. H. Hansson and S. Kivelson, Phys. Rev. Lett. 62, 82, (1989). 

[13] S. C. Zhang, Int. J. Mod. Phys. B 6, 25 (1992). 

[14] R. Rajaraman and S.L. Sondhi, Int. J. Mod. Phys. B 7, vol. 10, 793, (1996). 

[15] N. Read, Phys. Rev. Lett. 62, 86 (1989). 

[16] S. Iso, D. Karabali and B. Sakita, Phys. Lett. B 296, 143 (1992); A. Cappelli, C. A. Trugenberger 
and G. R. Zemba, Nucl. Phys. B 396, 465 (1993). 

[17] There is also a related equation for $\Pi$. 

[18] J.K. Jain, Phys. Rev. Lett. 63, 199 (1989); Phys. Rev. B 40, 8079 (1989). 

[19] A. Lopez and E. Fradkin, Phys. Rev. Lett. 69, 2126 (1992); A. Lopez and E. Fradkin, Phys. 
Rev. B 44, 5246 (1991). 

[20] R. Rajaraman, Phys. Rev. B 56, 6788, (1997). 

[21] Y-S Wu and Y. Yu, “Field Theory of Vortex like Composite Fermions”, preprint cond-mat/9608061 
(1996). 

[22] A. Lopez and E. Fradkin, Phys. Rev. B 51, 4347 (1995). 

[23] G. Murthy and R. Shankar, Phys. Rev. Lett. 79, 4437 (1997); see also preprint cond-mat/9802244 
(1998). 

[24] B. I. Halperin, P. A. Lee and N. Read, Phys. Rev. B 47, 7312 (1993). 

[25] E. Rezayi and N. Read, Phys. Rev. Lett. 72, 900, (1994). 



Part F : Methods Of Strong Interactions
In QFT

28. Hadrons From QCD - Achievements And Prospects by Olivier Pene

29. QCD Sum Rules In Hadronic And Nuclear Physics by L.S.Kisslinger

30. Light-Front Dynamics by V.A.Karmanov

31. 3D-4D Interlinkage Of B-S Amplitudes - Unified View Of QQ And
QQQ Dynamics by A.N.Mitra

32. The Harmonic Oscillator In Quantum Theory - A Powerful Bridge In Physics
by Marcos Moshinsky




28. Hadrons from QCD: achievements and prospects

O. Pene *

Laboratoire de Physique Theorique

Universite de Paris XI, Batiment 211, 91405 Orsay Cedex, France


Abstract 

We overview very briefly some main attempts to deduce the hadron properties from QCD. 
After a short reminder of quark models we recall the role played by the use of symmetries, 
effective Lagrangians and Wilson expansion. We then turn to lattice Monte-Carlo simulations, 
claim that for simple matrix elements their limitations are only technical and quote some of 
their important recent predictions. We indicate how lattices may establish a link between 
non-perturbative and perturbative QCD. Finally we throw a prospective look towards the 
expected achievements in the coming years. 


1 Introduction 

Quantum Chromodynamics (QCD) is today unanimously considered as the theory of the strong
interactions. From the tiny compact formula of the QCD Lagrangian,

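$$\mathcal{L}_{QCD} = -\frac{1}{4}\sum_a G^a_{\mu\nu} G_a^{\mu\nu} + \sum_q \bar q\,(i\gamma^\mu D_\mu - m_q)\,q \qquad (1)$$

where

$$G^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g_s f_{abc} A^b_\mu A^c_\nu ,$$

$$(D_\mu)_{ij} = \delta_{ij}\,\partial_\mu - i g_s \sum_a \frac{\lambda^a_{ij}}{2} A^a_\mu ,$$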

one can derive successful predictions for an enormous amount of physical observations. 

But “there's many a slip 'twixt the cup and the lip”. The full derivation of physics from QCD
is often a hard task. Of course QCD was born when asymptotic freedom was discovered and
it was understood that one can use perturbation theory, i.e. Feynman graphs, to compute processes
which involve large energies and large momentum transfers, in other words small spatial distances.
In this respect QCD is somewhat similar to Quantum Electrodynamics.

Still, complementary to asymptotic freedom is the small energy behaviour of QCD, more
mysterious and original: confinement. QCD has this extraordinary specificity that the states of
the theory, the hadrons (baryons and mesons), are not similar to the fundamental fields of eq. (1),
namely the quarks and gluons. Any concrete process in QCD starts and ends with hadrons, even
though any theoretical work starts from quarks and gluons.

It is therefore mandatory to learn how to describe hadrons from quarks and gluons, to learn how
to compute hadron properties (masses, matrix elements, etc.) from eq. (1). And this is extremely
difficult. The realm of QCD is currently divided into perturbative and non-perturbative domains.
In the former, the main tool is Feynman graphs. The latter can be subdivided into, first, the many-body
non-perturbative processes (soft scattering, small x physics, the domain of Regge theory) and,
second, hadronic physics. The major tool in the latter is Monte-Carlo lattice QCD numerical
simulations. However, this division of QCD into subdomains is largely artificial and illegitimate.
The interconnections between perturbative and non-perturbative physics are omnipresent. As
already stated, hard processes start and end with hadrons. Deep inelastic scattering incorporates
perturbative coefficients and non-perturbative structure functions, etc.

* E-mail: pene@qcd.th.u-psud.fr
Unite Mixte de Recherche - UMR 862

QCD is one domain. We would like to explore this unity the other way round. Instead of
starting from perturbative processes and wondering how one can reinsert hadrons, we would like
to start from hadrons and overview the methods which have been used, from quark models, effective
Lagrangians and symmetries to the lattice. We would like to discuss in what sense the lattice is exactly QCD,
and then consider how bridges are thrown from lattice QCD towards other fields.


2 Quark models 

QCD originated from quark models [1]: a “lego” game describing the hadron spectrum in terms
of quarks. Very quickly it was proposed to describe hadrons as bound states of quarks satisfying
some kind of Schrodinger equation, and dynamical properties, such as transition matrix elements
of hadrons, were computed from these wave functions [2]. It was later realised that
the “partons” seen in deep inelastic scattering were indeed quarks, and finally [3] the discovery
of the asymptotic freedom of the Lagrangian (1) started up real QCD.

But quark models continued, mainly because there existed no other method to compute hadronic 
matrix elements. The discovery of charmonium opened a field where quark models can be in part 
derived from QCD. Many attempts [4] towards relativistic quark models have existed with good 
phenomenological properties. But none can be rigorously derived from QCD in a systematic way. 
Quark models remain models even though they are still indispensable today since it remains im- 
possible to study, with any other method, processes which involve orbitally or radially excited 
resonances. Quark models are also the only ones to provide a complete description of the hadron
spectrum in a semi-quantitative way.


3 Symmetries and effective Lagrangians 

3.1 Flavor symmetries 

In view of the difficulty of deriving hadron properties from QCD, the first move is to determine on
general grounds how much can be told simply from symmetry principles. The three light quarks
u, d and s have mass differences $\lesssim 150$ MeV, appreciably smaller than the QCD scale$^1$. Flavor-SU(3)
symmetry results. Even better, the very tiny mass difference $m_d - m_u$ generates isospin symmetry
which is satisfied up to 1 %. These symmetries are realised in the so-called “Wigner mode”, i.e.
the vacuum is invariant under isospin or flavor-SU(3) rotations and the hadrons are organised in
multiplets of SU(2) (resp. SU(3)). This reduces the number of unknown parameters. To a good
accuracy, it is enough to know the properties of the multiplets rather than those of each individual
member of these multiplets. As an example, it is enough to know one invariant $\rho \to \pi\pi$ amplitude
to know all strong decay amplitudes. Of course these symmetries do not provide any means
to compute the invariant amplitude $\rho \to \pi\pi$.

3.2 Chiral symmetries

Chiral symmetry is not a symmetry of hadronic states due to spontaneous symmetry breaking.
The vacuum is not invariant under chiral symmetry. A chiral rotation generates (annihilates) coherent
Goldstone boson states, i.e. pions. As a consequence the symmetry tells us almost everything about
the soft pion dynamics and yields the effective chiral Lagrangian which fully describes this sector:

$$\mathcal{L}^{(2)} = \frac{F^2}{4}\,\mathrm{Tr}\left(\partial_\mu U\,\partial^\mu U^\dagger\right), \qquad (2)$$

where $U$ is an $SU(N_f)$ unitary matrix field

$$U = \exp\left(\frac{i\,\pi^a T_a}{F}\right), \qquad (3)$$

with $T_a$ the generators of $SU(N_f)$, normalized as $\mathrm{Tr}(T_a T_b) = 2\delta_{ab}$ and $[T_a, T_b] = 2 i f_{abc} T_c$ ($f_{abc}$
being the structure constants of $SU(N_f)$).

As can be seen from (2), all the non-perturbative uncertainty is encoded in the parameter $F$,
which is furthermore experimentally very well known.

$^1$ One can discuss endlessly about what one calls the QCD scale, which we shall generically call $\Lambda_{QCD}$. Let us
arbitrarily take $\Lambda_{QCD} \simeq 1$ GeV, the scale of the nucleon mass.

This beautiful situation is unhappily limited to small pion energies. When the latter increases 
one needs higher terms in the chiral expansion, the number of operators and of unknown constants 
increases dramatically, not to speak of the renormalisation problems. 


3.3 Heavy quark effective symmetry 

The heavy quark symmetry applies when one quark is heavy (say a b) while all other quanta are
light (light quarks, gluons, etc). We are then in an atom-like situation in which the heavy quark
is almost fixed up to a small recoil motion:

$$p_b^\mu = m_b v^\mu + k^\mu \qquad (4)$$

where $m_b$ is the heavy mass, $v^\mu = P_B^\mu/M_B$ is the hadron's four-velocity, $k^\mu$ represents the residual
motion of the heavy quark relative to the hadron rest frame and $k^\mu = O(\Lambda_{QCD})$.
The heavy quark propagator simplifies:

$$i\,\frac{m_Q \not v + \not k + m_Q}{(m_Q v + k)^2 - m_Q^2} \simeq i\,\frac{\not v + 1}{2\, v\cdot k} \qquad (5)$$


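as well as the quark-gluon vertex:

$$\gamma^\mu \to v^\mu \qquad (6)$$

This can be done formally by a change of field variables:

$$h_v(x) = \frac{1+\not v}{2}\, e^{i m_b v\cdot x}\, b(x) \qquad (7)$$

and a systematic expansion of the QCD Lagrangian in powers of $\Lambda_{QCD}/m_b$ leads to

$$\bar b(x)\,(i\not D - m_b)\,b(x) = \bar h_v(x)\,(i v\cdot D)\,h_v(x) + \frac{1}{2m_b}\,\bar h_v(x)\,(iD_\perp)^2\,h_v(x) + \frac{g_s}{4m_b}\,\bar h_v(x)\,\sigma^{\mu\nu}G_{\mu\nu}\,h_v(x) \qquad (8)$$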

However, life is not exactly so easy. First, the heavy quark symmetry assumes that only soft
gluons are exchanged by the heavy quark, which is only approximately true. Second, the systematics
of this two-scale approach needs some rigorous definition of the scale $\mu$ which separates small from
large energy scales. Before considering this question further, let us mention a last effective theory
of QCD.
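The size of the approximation made in the heavy quark propagator can be checked with a few lines of arithmetic. The sketch below compares the exact denominator $(m v + k)^2 - m^2$ with its leading term $2 m\, v\cdot k$; the two differ by $k^2$, so the relative difference falls as the heavy mass grows. The function names and the numerical values of $m$, $v$ and $k$ are illustrative assumptions, not taken from the text.

```python
# Minkowski metric (+, -, -, -); all quantities in GeV.
# The numerical values below are illustrative assumptions, not taken from the text.

def dot(a, b):
    """Minkowski product of two four-vectors given as (t, x, y, z)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def propagator_denominators(m, v, k):
    """Return ((m v + k)^2 - m^2, 2 m v.k) for heavy mass m, velocity v, residual k."""
    p = tuple(m * vi + ki for vi, ki in zip(v, k))
    exact = dot(p, p) - m * m
    leading = 2.0 * m * dot(v, k)
    return exact, leading


v = (1.0, 0.0, 0.0, 0.0)      # hadron rest frame, v^2 = 1
k = (0.3, 0.1, -0.2, 0.05)    # residual momentum, a few hundred MeV

for m in (1.5, 4.8, 50.0):
    exact, leading = propagator_denominators(m, v, k)
    rel = abs(exact - leading) / abs(exact)
    print(f"m = {m:5.1f} GeV: exact = {exact:8.4f}, 2 m v.k = {leading:8.4f}, "
          f"relative difference = {rel:.4f}")
```

With these inputs the difference between the two denominators is exactly $k^2$, independent of the mass, which is why the heavy quark limit becomes better as $m$ increases.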





3.4 Large energy effective theory 

Consider for example $B \to \pi \ell \nu$ decay. The final pion has a very large momentum in the B rest
frame, and the active final u quark, which originates from the $b \to u \ell \nu$ decay, has most of the pion
energy, the spectator quark being soft. The Large energy effective theory (LEET) [6]-[7] expands
the active light quark momentum analogously to the HQET case:

$$p^\mu = E n^\mu + k^\mu \qquad (9)$$

where $k^\mu = O(\Lambda_{QCD})$ as in (4), $E$ is the pion energy in the B rest frame, i.e. $E = v\cdot p_\pi$ where
$v$ is the B four-velocity, and $n^\mu$ is defined so that $p_\pi^\mu = E n^\mu$ with $v\cdot n = 1$. Notice that $n^2 = m_\pi^2/E^2 \simeq 0$.

There exists an SU(2) symmetry associated with LEET, but it is not a symmetry of the hadronic
states. The reason for that is different from the case of chiral symmetry: it results from the
fact that, even if the active quark takes almost all the momentum of the final pion when it is
produced, the strong interaction, after a time $\sim \Lambda_{QCD}^{-1}$, will share the pion momentum between
the pion constituents. LEET is then lost, i.e. eq. (9) is no longer valid. Still LEET produces very
useful relations between $B \to \pi(\rho) \ell \nu$ form factors [7].


4 Wilson operator expansion 

4.1 QCD versus HQET 

We have left the HQET with the unanswered question: what is the scale $\mu$ which separates
small from large energy scales?

Let us consider the example of heavy to light current corrected to one QCD loop. 

(Mqcd ^ ^ 

where $\lambda$ is an infrared regulator, a “gluon mass”. Notice that eq. (10) depends on the heavy quark
mass $m_b$. Since the HQET Lagrangian, i.e. the first term in the rhs of (8), does not depend on
the heavy mass, the one-loop gluon correction in HQET must differ from the one in QCD. Indeed:

{57/i*r)HQET 2? 

To reconcile (10) and (11) one expresses the QCD matrix element as the following Wilson [8] 
operator expansion: 

(^7m^)qcd ~ ^b) (?7/iAii)jjQgrp + C2(/z, TTii,) (12) 

with 

Ci(//,m6) = l + ^ (13) 

and 

= ^ + (14) 

As apparent from (12), the Wilson expansion has separated the two scales in this process: the
dependence on the large momentum scale, $m_b$, is in the coefficients $C_1$, $C_2$, while the dependence on
the QCD scale is in the HQET operators. While the coefficients can be computed in perturbation
theory, the HQET matrix elements need a non-perturbative calculation.





4.2 Wilson expansion for inclusive processes 

In the preceding subsection we have shown the usefulness of Wilson expansion in exclusive processes,
since HQET applies mainly to them, but it has also been fruitfully applied to inclusive
B decays [9]. Let us consider for example the inclusive decay $B \to X \ell \nu$. As in deep inelastic
scattering, we need to compute the imaginary part of the T-product of two currents. It may be
Wilson expanded


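$$\Gamma(B \to X \ell \nu) \propto \mathrm{Im}\left[\int d^4x\, \langle B|\,T\left(J^\mu(x)\,J_\mu^\dagger(0)\right)|B\rangle\right]$$

whence

$$\Gamma(B \to X \ell \nu) = \frac{G_F^2 m_b^5}{192\pi^3}\,|V_{CKM}|^2 \times \left[ c_3(\mu)\,\frac{\langle B|\bar b b(0)|B\rangle^{(\mu)}}{2M_B} + \frac{c_5(\mu)}{m_b^2}\,\frac{\langle B|\bar b\,\sigma_{\mu\nu}G^{\mu\nu} b(0)|B\rangle^{(\mu)}}{2M_B} + \ldots \right] \qquad (15)$$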


Wilson expansion also has some drawbacks: it introduces order by order a fast increasing 
number of operators which have to be computed non-perturbatively and often raise renormalisation 
problems. 


5 Lattice Monte-Carlo simulations 

5.1 The principle 

We have seen in the preceding sections how symmetries, effective Lagrangians and Wilson expansion
can simplify the problem of computing non-perturbative quantities in QCD, reduce the
number of non-perturbative unknowns and separate the large energy scales, amenable to perturbative
treatment, from the QCD energy scale.

But we are still left with the problem of computing from first principles, i.e. from QCD and 
only QCD, the non perturbative matrix elements of QCD or of the effective theories considered 
above. 

The answer is well known, but unhappily it is only a numerical solution: Lattice Monte-Carlo.
The principle is to discretize space and time [10], using lattice spacings typically of the order of
$a \approx 0.03 - 0.1$ fm. Numerical calculations also need finite volumes: typically $L \approx 1.5 - 3$ fm.

Last but not least, one performs an analytic continuation to imaginary time (Euclidean metric).
As a result, the field theory reduces to a computable 4-D statistical system to which standard
thermalisation algorithms can be applied [11]. The standard lattice QCD action is [10]


s\u\ — — 2 

-lE [^^GU^)Gr{x) 


where 


0,(1) 




Jago 


$U_\mu(x) \in SU(3)$ and the gauge transformation reads $U_\mu(x) \to g(x)\,U_\mu(x)\,g^{-1}(x + a\hat\mu)$.
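To make the idea of a "computable statistical system" concrete, here is a minimal sketch of a Metropolis thermalisation on a much simpler toy model than QCD: compact U(1) gauge theory on a small two-dimensional periodic lattice, with the plaquette action $S = -\beta \sum_P \cos\theta_P$. The model, the function name `run_u1_metropolis` and all parameter values are assumptions chosen for illustration; real lattice QCD codes use SU(3) links in four dimensions and more elaborate update algorithms.

```python
import math
import random


def run_u1_metropolis(n=8, beta=2.0, sweeps=200, step=1.0, seed=1):
    """Metropolis updates for 2-D compact U(1) gauge theory (toy model).

    Returns the average of cos(theta_P) over all plaquettes, measured
    during the second half of the sweeps.
    """
    rng = random.Random(seed)
    # theta[mu][x][y]: angle of the link leaving site (x, y) in direction mu
    theta = [[[0.0] * n for _ in range(n)] for _ in range(2)]

    def plaq_angle(x, y):
        # oriented sum of the four link angles around the plaquette at (x, y)
        return (theta[0][x][y] + theta[1][(x + 1) % n][y]
                - theta[0][x][(y + 1) % n] - theta[1][x][y])

    def local_action(mu, x, y):
        # only the two plaquettes containing link (mu, x, y) depend on it
        if mu == 0:
            plaqs = ((x, y), (x, (y - 1) % n))
        else:
            plaqs = ((x, y), ((x - 1) % n, y))
        return -beta * sum(math.cos(plaq_angle(px, py)) for px, py in plaqs)

    measurements = []
    for sweep in range(sweeps):
        for mu in range(2):
            for x in range(n):
                for y in range(n):
                    old = theta[mu][x][y]
                    s_old = local_action(mu, x, y)
                    theta[mu][x][y] = old + rng.uniform(-step, step)
                    delta = local_action(mu, x, y) - s_old
                    # accept with probability min(1, exp(-delta))
                    if delta > 0.0 and rng.random() >= math.exp(-delta):
                        theta[mu][x][y] = old  # reject the proposal
        if sweep >= sweeps // 2:
            total = sum(math.cos(plaq_angle(x, y))
                        for x in range(n) for y in range(n))
            measurements.append(total / (n * n))
    return sum(measurements) / len(measurements)


print("average plaquette at beta = 2.0:", run_u1_metropolis(beta=2.0))
print("average plaquette at beta = 0.2:", run_u1_metropolis(beta=0.2))
```

The gauge links play the role of the $U_\mu(x)$ above, and the average plaquette is the simplest gauge-invariant observable; it grows with $\beta$ as the configurations become more ordered.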

Let us skip the methods to introduce quark fields [12]. It is to be noted that the calculation
of the quark determinant is very lengthy and computer-time consuming. Therefore the quark
determinant is often omitted. This is called the “quenched approximation”. It is equivalent
to neglecting inner quark loops in Feynman diagrams. It is somehow equivalent to considering only
constituent quarks in hadrons, neglecting sea quarks. This approximation introduces a systematic
uncertainty difficult to estimate with full confidence except by comparing with the full lattice QCD
calculation, which includes the quark determinant. The latter type of calculation is usually referred
to as using “dynamical quarks”.

5.2 Some very relevant results 

After 25 years of lattice activity the number of results is overwhelming. A huge amount of phe- 
nomenological results have been obtained in the quenched approximation. Let us quote some 
salient results from the section C in the BABAR physics book [13]. These are updated world 
averages. The heavy meson decay constants are quoted in table 1 to which one should add the 
phenomenologically crucial result 


=2\QmMeV. (16) 

The renormalisation group invariant $K - \bar K$ mixing parameter:

$$\hat B_K = 0.84 \pm 0.07 \pm 0.12 \qquad (17)$$

the last error being the quenching error. In table 2 the D meson decay form factors are given. We 
do not quote the B meson decay form factors as the systematic error is still too large. We will 
return to this shortcoming later. 

One must still be aware of the limitations. The masses and momenta in these calculations need
to be smaller than the ultraviolet cut-off (the inverse lattice spacing) and larger than the infrared
cut-off (the inverse length of the lattice): $L^{-1} \ll m_q \ll a^{-1}$, i.e. 50 MeV $\ll m_q \ll$ 4 GeV.
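The two cut-offs follow from the lattice spacing and the box length through the conversion constant $\hbar c \approx 197.327$ MeV fm. The short sketch below does this conversion; the function name and the sample values $a = 0.05$ fm and $L = 3$ fm are illustrative choices within the ranges quoted in this section.

```python
HBARC_MEV_FM = 197.327  # hbar * c in MeV * fm


def lattice_window(a_fm, length_fm):
    """Return (infrared cut-off, ultraviolet cut-off) in MeV.

    a_fm: lattice spacing in fm; length_fm: linear size of the box in fm.
    """
    ultraviolet = HBARC_MEV_FM / a_fm
    infrared = HBARC_MEV_FM / length_fm
    return infrared, ultraviolet


infrared, ultraviolet = lattice_window(a_fm=0.05, length_fm=3.0)
print(f"infrared cut-off    1/L = {infrared:7.1f} MeV")
print(f"ultraviolet cut-off 1/a = {ultraviolet:7.1f} MeV")
```

For these sample values the window runs from roughly 66 MeV to roughly 3.9 GeV, in line with the orders of magnitude quoted above.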

This is why Wilson expansion is extensively used. It is nowadays difficult to implement two
different energy scales on the lattice. One computes on the lattice the hadronic matrix elements at
a scale of the order of $\Lambda_{QCD}$, and the bridging with the large energy or mass scale is performed
perturbatively when computing the Wilson coefficients.

The price to pay is that higher terms in Wilson expansion introduce a large number of operators, 
the matrix elements of which have to be computed after they have been appropriately renormalised, 
sometimes a formidable task. 

Alternatively, one can try to extrapolate towards the large energy scale. For example the B meson decay
properties are extrapolated from quarks with a mass of the order of 2 to 3 GeV. This method
induces large uncertainties and does not allow one to reach large momenta of the decay products.

Another weakness of the present situation is that most of these results are quenched. Some 
preliminary results using dynamical quarks exist, but they still have large errors and are not yet 
very useful for phenomenology. 

Notwithstanding these limitations, it should be strongly stressed that these are only technical
limitations. We mean that, had we unlimited computing power and memory space, we could
predict non-perturbative matrix elements within QCD to any accuracy, at least as concerns matrix
elements with no more than one hadron in the initial and final state: when several hadrons exist
together, final state interactions take place, and this makes the analytic continuation back from
imaginary time to real time rather difficult.

Let us only mention in passing, although it is a very active and interesting field, the calculations 
of QCD at finite temperature, as well as the studies of say, chromo-electric flux tubes, and other 
quantities which give an insight into the confinement mechanism. 


5.3 Bridging the lattices with perturbative QCD 

In section C of [13] one also finds some lattice calculations of fundamental QCD parameters such 
as quark mass^: 

fn,(2GeV) = 110(2) MeV from mK m^(2GeV) = 133(6) MeV from 


(18) 





f_D            f_{D_s}        f_B            f_{B_s}
200 ± 30 MeV   220 ± 30 MeV   170 ± 35 MeV   195 ± 35 MeV

Table 1: Heavy meson decay constants from lattices (world average).



           D → K(K*)   D → K(K*)   D → π, ρ
           lattice     expt        lattice
f_+(0)     0.73(7)     0.76(3)     0.65(10)
V(0)       1.2(2)      1.07(9)     1.1(2)
A_1(0)     0.70(7)     0.58(3)     0.65(7)
A_2(0)     0.6()       0.45(5)     0.55(10)

Table 2: Heavy meson decay form factors from lattices (world average).


$$m_b = 4.15 \pm 0.05 \pm 0.20\ \mathrm{GeV} \qquad (19)$$

In [14] and [15] the strong coupling constant of QCD is computed from lattices for zero quark
flavors. In [14] the Schrodinger functional is used while in [15] this is done from the Green functions.
The lattice spacing is given from the $\rho$ meson mass, i.e. from a non-perturbative quantity, while
$\alpha_s$ is computed at scales up to 5 GeV in [15] or even 100 GeV in [14]. The results, converted into the
$\overline{MS}$ scheme for comparison with other estimates, are

$$\Lambda^{(n_f=0)}_{\overline{MS}} = 238(19)\ \mathrm{MeV} \qquad (20)$$

from the Schrodinger functional, and

$$\Lambda^{(n_f=0)}_{\overline{MS}} = 295(5)(15)\ \mathrm{MeV} \qquad (21)$$

from Green functions.

Figure 1 shows the resulting $\alpha_s(\mu)$ from [15] compared with the three-loop perturbative result.
The agreement down to 2.1 GeV is startling. Below that scale non-perturbative physics sets
in. As expected, $\alpha_s$ is finite everywhere, i.e. the Landau pole disappears as it is an artifact of
the perturbative expansion. Here lattices achieve something new and important: they prove to be
able to relate perturbative and non-perturbative physics, and they deeply confirm the unity of both
domains.
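The Landau pole mentioned here is easy to see in the simplest perturbative formula. The sketch below evaluates the standard one-loop running coupling, $\alpha_s(\mu) = 4\pi / [\beta_0 \ln(\mu^2/\Lambda^2)]$ with $\beta_0 = 11 - 2 n_f/3$, for zero flavors. It is only an illustration: the comparison in the text uses the three-loop expression, and feeding the $\overline{MS}$ value $\Lambda = 238$ MeV into a one-loop formula is an assumption made here for convenience.

```python
import math


def alpha_s_one_loop(mu_gev, lambda_gev=0.238, n_f=0):
    """One-loop running coupling alpha_s(mu) = 4 pi / (beta0 ln(mu^2 / Lambda^2)).

    Raises ValueError at or below mu = Lambda, where the one-loop
    expression diverges (Landau pole) or turns negative.
    """
    if mu_gev <= lambda_gev:
        raise ValueError("one-loop formula is not usable for mu <= Lambda")
    beta0 = 11.0 - 2.0 * n_f / 3.0
    return 4.0 * math.pi / (beta0 * math.log(mu_gev ** 2 / lambda_gev ** 2))


for mu in (0.25, 0.3, 0.5, 1.0, 2.1, 5.0, 100.0):
    print(f"mu = {mu:6.2f} GeV   alpha_s(one loop) = {alpha_s_one_loop(mu):.4f}")
```

The printed values decrease slowly at large $\mu$ (asymptotic freedom) and blow up as $\mu$ approaches $\Lambda$, which is the perturbative artifact that the lattice determination does not show.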

Let us now consider the coupling constant for momenta smaller than 2.0 GeV. It is not possible
to compare this to well established knowledge. It is however bewildering that the Milan group [16],
studying the phenomenology of multi-particle observables, finds non-perturbative observables which
depend on the strong coupling constant in the domain below 2 GeV, and that this coupling turns out
to agree quite well with the result in figure 1. More work has to be done in this field but this remark
opens new and interesting perspectives.

5.4 Some future prospects 

The lattice estimates of B decay parameters such as $f_{B_s}$ and $B \to \pi, \rho$ form factors, of $K - \bar K$
mixing, etc., have been intensively used simultaneously with experimental data to constrain the
standard model parameters, and in particular the geometry of the so-called unitarity triangle.

Figure 1: $\alpha_s$ computed from Green functions on lattices in the MOM scheme at $\beta = 6.2$ for a
$24^4$ lattice. The solid curve corresponds to the three-loop perturbative $\alpha_s$ computed with
$\Lambda_{\overline{MS}} = 314$

However, the present theoretical accuracy hardly matches the current experimental precision.
The latter will quickly improve thanks to the experiments now starting: BABAR, BELLE,
not to speak of the Tevatron and the future B physics at LHC.

Such improved experimental accuracy will be useless if theory does not match it. Beauty
phenomenology uses extrapolations up to the b mass, since the latter exceeds the present ultraviolet
cut-offs ($a^{-1}$). As a result some additional uncertainties arise and the accessible domain
is limited. To measure $V_{ub}$ we need real b quarks, i.e. $a \lesssim 0.04$ fm, i.e. $100^4$ lattices if we want
to keep a total length of a few fermis. In other words we need to be able to implement directly
two scales on the lattice: $O(m_b)$ and $O(\Lambda_{QCD})$.
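The jump in cost implied by such lattices can be estimated with simple counting. The sketch below takes $a = 0.04$ fm and a box of 4 fm as illustrative values and counts sites, links and the memory needed to hold one gauge configuration. The storage model (every SU(3) link kept as a full 3x3 complex matrix in double precision) and the function name are assumptions made for this estimate; production codes may compress links or use single precision.

```python
def lattice_requirements(a_fm, length_fm, bytes_per_real=8):
    """Return (sites per side, total sites, bytes for one gauge configuration)."""
    n_side = round(length_fm / a_fm)
    sites = n_side ** 4              # four-dimensional hypercubic lattice
    links = 4 * sites                # one link per site and direction
    reals_per_link = 18              # 3x3 complex matrix = 9 complex = 18 reals
    gauge_bytes = links * reals_per_link * bytes_per_real
    return n_side, sites, gauge_bytes


n_side, sites, gauge_bytes = lattice_requirements(a_fm=0.04, length_fm=4.0)
print(f"{n_side}^4 lattice: {sites:.3e} sites, "
      f"{gauge_bytes / 1e9:.1f} GB per gauge configuration")

# Same box with a coarser spacing, for comparison.
n_coarse, sites_coarse, bytes_coarse = lattice_requirements(a_fm=0.1, length_fm=4.0)
print(f"{n_coarse}^4 lattice: {sites_coarse:.3e} sites, "
      f"{bytes_coarse / 1e9:.2f} GB per gauge configuration")
```

Under these assumptions a single $100^4$ configuration already occupies tens of gigabytes, before any quark propagators are stored, which gives a feeling for why the spacing cannot simply be reduced at fixed volume.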

On the other hand we would like to reach with dynamical quarks an accuracy comparable to 
today’s quenched results. 

This demands of the order of 10 teraflops.

As a bonus, using two scales on the lattice, new attempts could be made. For example we might
also try to compute on the lattice inclusive processes, like $B \to X \ell \nu$ or $e + N \to e + X$, i.e. structure
functions, shape functions, etc.


References 

[1] M. Gell-Mann, Phys. Lett. 8 (1964) 214; G. Zweig, CERN preprint TH401 (1964).

[2] A.N. Mitra and M. Ross, Phys. Rev. 158 (1967) 1630.

[3] D.J. Gross and F. Wilczek, Phys. Rev. Lett. 30 (1973) 1343; H.D. Politzer, Phys. Rev. Lett. 30
(1973) 1346.

[4] A.N. Mitra, S. Bhatnagar, I. Santhanam (Delhi U.), Aug 1991, in *College Park 1991,
Hadron '91* 302-308; A. Le Yaouanc, L. Oliver, O. Pene and J.-C. Raynal, Phys. Lett. 365
(1996) 319.

[5] For recent reviews see M. Neubert, Lectures given at International School of Subnuclear
Physics; 34th Course: Effective Theories and Fundamental Interactions, Erice, Italy, 3-12 Jul
1996, hep-ph/9610266; Invited talk at International Europhysics Conference on High-Energy
Physics (HEP 97), Jerusalem, Israel, 19-26 Aug 1997, hep-ph/9801269.

[6] M. J. Dugan and B. Grinstein, Phys. Lett. B255, 583 (1991).

[7] J. Charles, A. Le Yaouanc, L. Oliver, O. Pene, J.C. Raynal, Phys. Rev. D60 (1999) 014001;
Phys. Lett. B451 (1999) 187.

[8] K. Wilson, Phys. Rev. 179 (1969) 1499; D3 (1971) 1818; W. Zimmermann, Ann. Phys. 77
(1973) 536 and 570.

[9] See for example I. Bigi, M. Shifman, N. Uraltsev, Ann. Rev. Nucl. Part. Sci. 47 (1997) 591.

[10] K.G. Wilson, in “New Phenomena in Sub-nuclear Physics”, ed. A. Zichichi, Plenum, New York
(1977).

[11] See M. Creutz, Quarks, Gluons and Lattices, (Cambridge Univ. Press 1983) and references
therein.

[12] J. B. Kogut and L. Susskind, Phys. Rev. D11 (1975) 395; J.L. Alonso, Ph. Boucaud, J.L.
Cortes and E. Rivas, Mod. Phys. Lett. A5 275 (1990); D.B. Kaplan, Phys. Lett. B288 (1992)
342. For a review see J. Smit, Nucl. Phys. B (Proc. Suppl.) 4 (1988) 451.

[13] THE BABAR PHYSICS BOOK: Physics at an asymmetric B factory: BaBar collaboration.
Edited by P.F. Harrison and H.R. Quinn, 1998. (SLAC-R-504) QCD201:B3:1998.

[14] ALPHA Collaboration (Stefano Capitani et al.), Nucl. Phys. B544 (1999) 669.

[15] P. Boucaud, J.P. Leroy, J. Micheli, O. Pene, C. Roiesnel, JHEP 9810 (1998) 017; JHEP 9812
(1998) 004; D. Becirevic, P. Boucaud, J.P. Leroy, J. Micheli, O. Pene, J. Rodriguez-Quintero,
C. Roiesnel: hep-ph/9903364.

[16] Yu.L. Dokshitzer, G. Marchesini, B.R. Webber: hep-ph/9905339.



29. QCD Sum Rules in Hadronic and Nuclear 

Physics 

Leonard S. Kisslinger * 

Department of Physics, 

Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A 


Abstract 

The Method of QCD sum rules was formulated about two decades ago. It has been a 
valuable tool for the study of nonperturbative QCD in a large variety of applications. We 
first review the method as formulated for the treatment of hadronic masses using two-point 
correlators, with the introduction of the O.P.E and various vacuum condensates. Recent 
work on glueballs and hybrid hadrons, and the proposed low energy scalar glueball/sigma is 
discussed. We then review the use of the sum rules for hadronic couplings and form factors, 
which require three-point functions. Vacuum susceptibilities of the external field method 
and the use of three-point functions to evaluate them are discussed. The role of nonlocal 
condensates for form factors and reactions, and light cone sum rules for form factors and wave 
functions are reviewed. The application of the Dyson-Schwinger formalism for self-consistent 
studies and its usefulness for sum rule calculations is discussed. The use of instanton solutions 
for the quark propagator in QCD sum rules is reviewed and the present status of the need for 
confining gluonic effects along with instantons is discussed. Hadrons in nuclear matter and the
present status of the attempts to use QCD sum rules to predict mesonic properties in matter
at finite temperature as the chiral phase transition is approached and exceeded are reviewed.

It is seen that new information on four-quark condensates is necessary for both programs. 

Contents 

1 Introduction to Sum Rules

1.1 Dispersion Relations

1.2 Quantum Chromodynamics

1.3 Quark and Gluon Propagators, Vacuum Condensates

1.4 QCD Sum Rules

1.4.1 Finding Solutions to QCD Sum Rules

2 Sum Rules For Masses: Two-Point Functions

2.1 Light-Quark Meson and Baryon Masses

2.1.1 Meson Masses

2.1.2 Factorization of Four-Quark Condensates

2.1.3 Heavy-Light Quark Mesons

2.1.4 Light-Quark Baryons

2.2 Glueballs, Scalar Glueball/Sigma

2.2.1 Glueballs

2.2.2 Scalar Glueball/Sigma

2.3 Mixed Scalar Glueballs and Mesons


* Email: kissling@andrew.cmu.edu





2.4 Hybrid Hadrons

2.5 Isospin Mass Splittings; Gauge Invariant Electromagnetic Effects; Electromagnetic Penguins

2.5.1 Isospin Splittings of Charm and Bottom Mesons: Gauge Invariance

2.5.2 QED Penguins and (B,D) I-spin Mass Splittings

3 Sum Rules for Coupling Constants: Three-Point Functions

3.1 External Field Method: Two-Point Correlator

3.1.1 Electromagnetic Field: Nucleon Magnetic Dipole Moments

3.1.2 Other Fields and Coupling Constants

3.2 Three-Point Function Method for Vacuum Susceptibilities

3.3 Pion-Nucleon Strong and Weak Coupling

3.3.1 Pion-Vacuum Correlator for

3.3.2 External Pion Field for g^^r

4 Form Factors and Reactions

4.1 Form Factors and Nonlocal Condensates

4.1.1 Pion Wave Function

4.2 Light-Cone Sum Rules

4.2.1 Pion Form Factor and Light-Cone Sum Rules

4.2.2 Pion Wave and Light-Cone Sum Rules

4.3 Deep Inelastic Scattering; Quark, Sea-Quark Distributions

4.3.1 Sea-Quark Anisotropy

5 Dyson-Schwinger, B-S Formalism and Sum Rules

6 Instantons and QCD Sum Rules

6.1 Instanton Quark Propagator and QCD Sum Rules

6.2 D-S Study of Instanton Quark Propagator

7 Hadrons in Nuclear Matter

7.1 Nucleons in Nuclear Matter

7.2 Other Baryons in Nuclear Matter

7.3 Mesons in Nuclear Matter

8 Mesons in Finite Temperature Matter

9 The Pomeron

10 Outlook

11 Acknowledgements





1 Introduction to Sum Rules 


The first sum rules were relations associated with the scattering of photons from electrons. 
These are the Kramers-Kronig[l, 2] dispersion relations, which relate the real to the imaginary 
parts of the dielectric constant. Such relationships apply to physical scattering amplitudes. Since 
the imaginary part of a scattering amplitude is related to a total cross section by the optical 
theorem, a dispersion relation for a scattering amplitude is a sum rule: it relates a scattering 
amplitude at one energy to well-defined integrals of total cross sections. 

In field theories there are dispersion relations for many important quantities, such as propagators
and vertex functions. These relations are sum rules. If one can use the field theory to evaluate
appropriate quantities in the dispersion relation one has very useful sum rules. In this section we
review the general ideas for QCD sum rules.

1.1 Dispersion Relations


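The dispersion relation for the dielectric constant, $\epsilon(\omega)$, where $\omega$ is the frequency, is derived
by recognizing that $\epsilon$ is analytic in the upper half of the complex $\omega$ plane and satisfies $\epsilon(-\omega) = \epsilon^*(\omega^*)$.
It is

$$\mathrm{Re}[\epsilon(\omega)] = 1 + \frac{2}{\pi}\, P\int_0^\infty \frac{\omega'\,\mathrm{Im}[\epsilon(\omega')]}{\omega'^2 - \omega^2}\, d\omega'$$

$$\mathrm{Im}[\epsilon(\omega)] = -\frac{2\omega}{\pi}\, P\int_0^\infty \frac{\mathrm{Re}[\epsilon(\omega')] - 1}{\omega'^2 - \omega^2}\, d\omega' \qquad (1)$$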


where P represents a principal part integral. 
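A dispersion relation of this kind can be checked numerically on any causal model. The sketch below uses a single Lorentz oscillator, $\epsilon(\omega) = 1 + \omega_p^2/(\omega_0^2 - \omega^2 - i\gamma\omega)$, which is an illustrative choice and not a model discussed in the text, and rebuilds $\mathrm{Re}[\epsilon]$ from $\mathrm{Im}[\epsilon]$ through the first Kramers-Kronig relation, $\mathrm{Re}[\epsilon(\omega)] = 1 + (2/\pi)\,P\int_0^\infty \omega'\,\mathrm{Im}[\epsilon(\omega')]/(\omega'^2-\omega^2)\,d\omega'$. The principal value is handled by subtracting the value of the numerator at $\omega' = \omega$, which is allowed because $P\int_0^\infty d\omega'/(\omega'^2-\omega^2) = 0$. All parameter values, the cut-off and the function names are assumptions.

```python
import math


def eps_lorentz(w, wp=1.0, w0=1.0, gamma=0.2):
    """Lorentz-oscillator dielectric function (illustrative causal model)."""
    return 1.0 + wp ** 2 / complex(w0 ** 2 - w ** 2, -gamma * w)


def re_eps_from_kk(w, wmax=100.0, n=200000):
    """Re eps(w) rebuilt from Im eps via the Kramers-Kronig integral.

    The integrand is regularised by subtracting w * Im eps(w), and the
    integral is truncated at wmax with the subtraction tail added back
    analytically.
    """
    im_w = eps_lorentz(w).imag
    h = wmax / n
    total = 0.0
    for i in range(n):
        wpr = (i + 0.5) * h  # midpoint rule; w itself never sits on a midpoint here
        total += (wpr * eps_lorentz(wpr).imag - w * im_w) / (wpr ** 2 - w ** 2)
    total *= h
    # contribution of the subtraction term between wmax and infinity
    total -= 0.5 * im_w * math.log((wmax + w) / (wmax - w))
    return 1.0 + 2.0 / math.pi * total


for w in (0.5, 1.5):
    print(f"w = {w}: direct Re eps = {eps_lorentz(w).real:.5f}, "
          f"from dispersion integral = {re_eps_from_kk(w):.5f}")
```

The two columns should agree up to the small discretisation and truncation errors of the numerical integral, which illustrates how knowledge of the absorptive part over all frequencies fixes the dispersive part at a single frequency.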

For scattering amplitudes, causality ensures the analyticity in the upper half plane and one can
derive dispersion relations for scattering amplitudes. One of the early applications of this causal
principle for scattering amplitudes [3] showed that for photons scattering from protons, neglecting
spin, the forward scattering amplitude, $f(\omega)$, satisfies the relation

$$\mathrm{Re}[f(\omega)] = -\frac{e^2}{M^2} + \frac{\omega^2}{2\pi^2}\, P\int_0^\infty \frac{\sigma(\omega')\,d\omega'}{(\omega')^2 - \omega^2} , \qquad (2)$$

where one has made use of the optical theorem, $\mathrm{Im}[f(\omega)] = (\omega/4\pi)\,\sigma(\omega)$, with $\sigma$ the total cross
section, and the relation $\mathrm{Re}[f(0)] = -e^2/M^2$.

For field theories the dispersion relation for the forward scattering amplitude is of the form

«“) = + (3) 

where n is the number of subtractions and $R(\omega)$ is a polynomial of order n in the $\omega$ variable. For
example, the forward photon amplitude, Eq. (2), is a once-subtracted dispersion relation. The
Im[f] is the spectral function. As an example, if there is a stable bound state at energy $\omega = M_B$
that is in the physical energy region of the scattering, the spectral function would be given by a
form such as that shown in Fig. 1. Note that in this case at $M_B$ the spectral function would have
a delta function and the forward scattering amplitude a pole.


Im[f(E}] = m{E-MB)+Im[fiE)] 

- +lf 

ir{MB-E) kJ 


m = 


dE' 


cont 

(E' - E) ’ 


where $\mathrm{Im}[f^{cont}]$ is the continuum part of the spectrum.


(4) 



Fig. 1 Spectral function Im[f(ω)], including the continuum threshold



Fig. 2 Elements of Feynman Diagrams for QCD 


1.2 Quantum Chromodynamics 


Here we give a very brief review of Quantum Chromodynamics (QCD), a nonabelian gauge
field theory. The fields are the quark fields, $q^a_f(x)$ with color $a$ and flavor $f$, and the gluon field,
$A_\mu(x) = \sum_{a=1}^{8} A^a_\mu \lambda^a/2$, where $\lambda^a$ is an SU(3) color matrix. The Lagrangian density is

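$$\mathcal{L}^{QCD} = -\frac{1}{2}\,\mathrm{Tr}\left[G_{\mu\nu}G^{\mu\nu}\right] + \bar q\,(i\gamma^\mu D_\mu - m)\,q , \qquad D_\mu = \partial_\mu - i g A_\mu ,$$

$$G_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i g\,[A_\mu, A_\nu]. \qquad (5)$$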

The elements of Feynman diagrams used in perturbation treatments of QCD are shown in Fig. 2. 

Although QCD is a gauge theory, like QED, it has the property of “antiscreening”, which leads
to asymptotic freedom: at one loop the running coupling is $\alpha_s(Q^2) = 12\pi/[(33-2n_f)\ln(Q^2/\Lambda^2)]$,
with $n_f$ the number of flavors. In this one-loop calculation the coupling constant becomes infinite at $Q^2 = \Lambda^2$.
QCD at low and medium energies is nonperturbative. This makes the subject most interesting
and leads to new fundamental phenomena.
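
The growth of the coupling toward low momenta can be illustrated numerically. The following is a minimal sketch, assuming the standard one-loop formula $\alpha_s(Q^2) = 12\pi/[(33-2n_f)\ln(Q^2/\Lambda^2)]$; the values $\Lambda = 0.2$ GeV and $n_f = 3$ are illustrative assumptions and are not taken from the text.

```python
import math


def alpha_s_one_loop(q2, lambda_qcd=0.2, n_f=3):
    """One-loop running coupling alpha_s(Q^2).

    q2         : momentum transfer squared in GeV^2
    lambda_qcd : QCD scale in GeV (assumed illustrative value)
    n_f        : number of active flavors (assumed illustrative value)
    """
    if q2 <= lambda_qcd ** 2:
        # The one-loop expression diverges at Q^2 = Lambda^2 and is
        # meaningless below it: this is the nonperturbative region.
        raise ValueError("one-loop formula requires Q^2 > Lambda^2")
    b0 = 33.0 - 2.0 * n_f
    return 12.0 * math.pi / (b0 * math.log(q2 / lambda_qcd ** 2))


# The coupling grows as Q decreases toward Lambda.
for q in (0.5, 1.0, 2.0, 10.0, 100.0):
    print(f"Q = {q:6.1f} GeV   alpha_s = {alpha_s_one_loop(q * q):.3f}")
```

The rapid rise of the printed values at the low end of the range is the reason perturbation theory cannot be used directly for hadron structure.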

The two systematic ways of studying the structure of hadrons at the present time are numerical
lattice gauge calculations and the method of QCD sum rules. In the method of QCD sum rules
most of the calculations have used the vacuum condensates in a phenomenological treatment of
nonperturbative QCD effects. Another approach to nonperturbative QCD is instanton models,
which are consistent with the lowest-dimension condensates, and recently instanton effects have
been included in some QCD sum rule calculations, which we discuss in Sec. 6. Next we discuss
the nonperturbative quark and gluon propagators.






Fig. 3 Vacuum condensates: quark condensate $\langle\bar qq\rangle$, mixed condensate $\langle g\bar q\,\sigma\cdot G\,q\rangle$, gluon condensate $\langle G\cdot G\rangle$


1.3 Quark and Gluon Propagators, Vacuum Condensates 


If $\psi(x)$ is a Dirac field for an elementary particle with mass m, the perturbative Dirac propagator
is given as

$$\frac{i}{\not p - m} = \int d^4x\, e^{ip\cdot x}\,\langle\tilde 0|T[\psi(x)\bar\psi(0)]|\tilde 0\rangle, \qquad (6)$$

where T is the time-ordering operator, $\not p = \sum_\mu \gamma^\mu p_\mu$, and by $|\tilde 0\rangle$ we mean the perturbative
vacuum, in which all normal ordered products vanish. Using the notation $|0\rangle$ for the physical,
nonperturbative vacuum, the quark propagator is

$$S_q(p) = \int d^4x\, e^{ip\cdot x}\,S_q(x),$$
$$S_q(x) = \langle 0|T[q(x)\bar q(0)]|0\rangle, \qquad (7)$$

where q(x) is the quark field (with some flavor and color). For the physical vacuum the quark
propagator $S_q(x)$ has a perturbative part, $S_q^{PT}(x)$, and a nonperturbative part, $S_q^{NP}(x)$:

$$S_q(x) = S_q^{PT}(x) + S_q^{NP}(x). \qquad (8)$$

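In the case of vanishing current quark masses ($m_q = 0$) one can write

$$S_q^{PT}(x) = \frac{1}{2\pi^2}\frac{\gamma\cdot x}{x^4},$$
$$S_q^{NP}(x) = \left(-\frac{1}{12}\right)\left(\langle :\bar q(x)q(0):\rangle + x_\mu\langle :\bar q(x)\gamma^\mu q(0):\rangle\right). \qquad (9)$$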


It should be stressed that normal-ordered products do not vanish in the nonperturbative vacuum,
and therefore $S_q^{NP}(x)$ must be considered. For short distances, the O.P.E. for the scalar
part of $S_q^{NP}(x)$ gives

$$\langle :\bar q(x)q(0):\rangle = \langle :\bar q(0)q(0):\rangle - \frac{x^2}{4}\langle 0|:\bar q(0)\sigma\cdot G(0)q(0):|0\rangle + \ldots, \qquad (10)$$

in which the local operators of the expansion are the quark condensate, the mixed condensate, and
so forth. These terms in the quark propagator are illustrated in Fig. 3. The sum rules use Feynman
diagrams for the perturbative calculations and Feynman-like diagrams for the nonperturbative
processes. Fig. 3 gives the notation for the elements of the nonperturbative processes up to
dimension D=5 for use in the Feynman-like diagrams. Since the vacuum condensates are vacuum
matrix elements of local operators they are gauge-independent and their values can be determined.
Fits to experiments have given the following phenomenological values for the vacuum condensates:
$a = -(2\pi)^2\langle :\bar q(0)q(0):\rangle \simeq 0.55$ GeV$^3$, $(2\pi)^2\langle 0|:\bar q(0)\sigma\cdot G(0)q(0):|0\rangle = m_0^2 a$, with $m_0^2 \simeq
0.8$ GeV$^2$. The quark condensate was evaluated using PCAC[4], and the Gell-Mann-Oakes-Renner
relation[4],


$$\langle 0|\bar uu + \bar dd|0\rangle = -\frac{2 f_\pi^2 m_\pi^2}{m_u + m_d}, \qquad (11)$$

is still an important relation for and consistent with QCD sum rule phenomenology. The systematic 
treatment of higher-dimension condensates through the operator product expansion[5] is one of the 
basic elements of the QCD sum rule method, which we discuss in the next subsection. 
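
As a rough numerical cross-check of the condensate value against the Gell-Mann-Oakes-Renner relation, Eq.(11), one can convert the phenomenological parameter $a \simeq 0.55$ GeV$^3$ into $\langle\bar qq\rangle$ and ask which sum of current quark masses the relation then implies. This is a sketch only: the inputs $f_\pi = 93$ MeV and $m_\pi = 138$ MeV, and the assumption $\langle\bar uu\rangle = \langle\bar dd\rangle$, are ours and are not given in the text.

```python
import math

A_COND = 0.55   # GeV^3, a = -(2 pi)^2 <qbar q>, value quoted in the text
F_PI = 0.093    # GeV, pion decay constant (assumed input)
M_PI = 0.138    # GeV, pion mass (assumed input)


def quark_condensate(a=A_COND):
    """Quark condensate <qbar q> in GeV^3 from a = -(2 pi)^2 <qbar q>."""
    return -a / (2.0 * math.pi) ** 2


def gmor_quark_mass_sum(qq, f_pi=F_PI, m_pi=M_PI):
    """m_u + m_d in GeV implied by the GMOR relation,
    <uu + dd> = -2 f_pi^2 m_pi^2 / (m_u + m_d), taking <uu> = <dd> = qq."""
    return -2.0 * f_pi ** 2 * m_pi ** 2 / (2.0 * qq)


qq = quark_condensate()
print(f"<qbar q>   = {qq:.5f} GeV^3 = -({(-qq) ** (1.0 / 3.0) * 1000:.0f} MeV)^3")
print(f"m_u + m_d = {gmor_quark_mass_sum(qq) * 1000:.1f} MeV")
```

With these inputs the implied mass sum comes out at roughly ten MeV, the scale usually quoted for light current quark masses, which is the sense in which the relation is consistent with the sum rule phenomenology.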

In a similar way the gluon propagator is defined by the two-point function of the gluon field.
The lowest-dimension vacuum condensate for pure glue is the gluon condensate, $\langle g^2 G\cdot G\rangle = \langle : g^2 G^a_{\mu\nu}G^{a\mu\nu} : \rangle$.
The phenomenological value of the gluon condensate is $\langle g^2 G\cdot G\rangle \simeq 0.47$ GeV$^4$.

1.4 QCD Sum Rules 

In this subsection we give a brief overview of the QCD sum rule method for obtaining hadronic
masses by two-point correlators. Extensions of the method for coupling constants, form factors,
hadrons in nuclei and so forth will be given in later sections.

A QCD sum rule is a relation between a dispersion relation expression for a correlator and a
microscopic QCD evaluation of the correlator. A correlator is defined as a propagator, Eq.(6), in
which a composite field operator, $\eta(x)$, is used for a hadron in contrast to $\psi(x)$, the field function
for a basic particle. The composite field operator, usually called a current, must have the property
that it can create the hadron under study from the vacuum; i.e., the current $\eta_H$ for hadron H
must satisfy

$$\langle 0|\eta_H|H\rangle = \lambda_H v_H, \qquad (12)$$

where $\lambda_H$ corresponds to the wave function at the origin in a quark/gluon model and, e.g., $v_H$
is a Dirac spinor for a spin 1/2 baryon. Clearly, the current does not allow one to obtain all
the information about a hadron, as a Bethe-Salpeter amplitude or a wave function in a quantum
mechanical system would, but the fact that one starts with local operators for complex systems is
an enormous simplification. The correlator is defined by

$$\Pi(p) = i\int d^4x\, e^{ip\cdot x}\,\langle 0|T[\eta(x)\bar\eta(0)]|0\rangle. \qquad (13)$$

If the correlator is expressed in terms of a dispersion relation we refer to it as the phenomenological
correlator, since the dispersion relation gives it in terms of physical states and properties. With
no subtractions the dispersion relation for the correlator is

$$\Pi_H(p^2) = \frac{1}{\pi}\int ds\,\frac{\mathrm{Im}[\Pi_H(s)]}{s - p^2}. \qquad (14)$$

If there is a bound state in the spectrum of hadrons with the quantum numbers of the H of interest,
the spectral function will have the form

$$\mathrm{Im}[\Pi_H(s)] = \pi\lambda_H^2\, v_H\bar v_H\,\delta(s - M_H^2) + \mathrm{Im}[\Pi_H^{cont}(s)]. \qquad (15)$$

For example, let us consider a scalar meson for which $v_H$ would just be a plane wave in momentum
space. Then the form of Eq.(15) leads to the phenomenological expression for the correlation
function

$$\Pi_H^{phen}(p^2) = \frac{\lambda_H^2}{M_H^2 - p^2} + \Pi(p)^{cont}, \qquad (16)$$

where $\Pi(p)^{cont}$ is the continuum contribution to the dispersion relation from $\mathrm{Im}[\Pi_H(s)]$.





The QCD calculation is carried out by explicitly using the current $\eta_H$ in the expression for the
correlator, Eq.(13), and calculating the Feynman-like diagrams. We shall see many below. This
gives the microscopic QCD correlator, $\Pi^{QCD}(p^2)$. One would naively expect the sum rules to arise
from

$$\Pi^{phen}(p^2) = \Pi^{QCD}(p^2), \qquad (17)$$


but this is not useful, since the QCD series will not converge at low momenta, where the method is 
being used here. The breakthrough idea of SVZ was to make use of the analytic properties of the 
correlator to evaluate it at large Euclidean momentum, so that the operator product expansion 
can be used and rapidly converges. This is done by taking the Borel transform of both sides of 
Eq.(17). The Borel transform is defined as[5] 


$$B = \lim_{Q^2,\,n\to\infty;\ Q^2/n = M_B^2}\ \frac{1}{n!}\,(Q^2)^{n+1}\left(-\frac{d}{dQ^2}\right)^n, \qquad (18)$$

with $Q^2 = -p^2$. A detailed discussion of this transformation and how effective it is for giving
convergence is given in Ref. [6]. Note that the Borel transform of a finite polynomial vanishes, so
only terms singular at small $p^2$ contribute. Two important Borel transforms are

$$B\,\frac{1}{(Q^2)^k} = \frac{1}{(k-1)!\,(M_B^2)^{k-1}},$$
$$B\,(Q^2)^k\ln(Q^2) = (-1)^{k+1}\,k!\,(M_B^2)^{k+1}. \qquad (19)$$

Taking the Borel transform of the correlator one arrives at

$$\Pi^{phen}(M_B^2) = \Pi^{QCD}(M_B^2). \qquad (20)$$

This is a QCD sum rule. Eq.(19) gives the key to the method. First note that for the phenomenological
side the polynomial falloff of the dispersion relation becomes an exponential falloff. This
dramatically changes the weight to the states with mass near the Borel mass, which is near the
[usually] lowest state being studied, and reduces the effect of the continuum. Referring to Fig.
1 and Eq.(16), note that the part of the spectral function, $\rho(s) = \mathrm{Im}[\Pi(s)]$, at $s = M_H^2$
becomes more important relative to the continuum part, for which $s > M_H^2$. This greatly improves
the possibility of finding $M_H$ accurately in the sum rule. The Borel transform also improves the
convergence of the QCD side.
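
The change from power to exponential weighting can be checked numerically for a single dispersion-relation term $1/(s+Q^2)$. The sketch below assumes the Borel operator in the normalization $\frac{1}{n!}(Q^2)^{n+1}(-d/dQ^2)^n$ with $Q^2 = nM_B^2$; for this term the derivatives can be done in closed form, giving $(Q^2/(s+Q^2))^{n+1}$ at finite n, which tends to $\exp(-s/M_B^2)$. The sample values of s and $M_B^2$ are illustrative assumptions.

```python
import math


def borel_of_pole_at_finite_n(s, borel_mass_sq, n):
    """Finite-n Borel operator applied to 1/(s + Q^2).

    Uses (-d/dQ^2)^n [1/(s+Q^2)] = n!/(s+Q^2)^(n+1), so that
    (Q^2)^(n+1)/n! times this equals (Q^2/(s+Q^2))^(n+1),
    evaluated at Q^2 = n * borel_mass_sq.
    """
    q2 = n * borel_mass_sq
    # log1p keeps precision when s/q2 is tiny (large n).
    return math.exp(-(n + 1) * math.log1p(s / q2))


def exponential_weight(s, borel_mass_sq):
    """The n -> infinity limit of the expression above."""
    return math.exp(-s / borel_mass_sq)


m2 = 1.0                      # Borel mass squared in GeV^2 (assumed)
s_pole, s_cont = 1.0, 3.0     # GeV^2: a low-lying state and a continuum point (assumed)

for n in (1, 10, 100, 10000):
    print(n, borel_of_pole_at_finite_n(s_pole, m2, n))
print("limit", exponential_weight(s_pole, m2))

# Relative weight of the continuum point compared with the pole,
# before (power weight at Q^2 = m2) and after the Borel transform.
power_ratio = (1.0 / (s_cont + m2)) / (1.0 / (s_pole + m2))
borel_ratio = exponential_weight(s_cont, m2) / exponential_weight(s_pole, m2)
print("continuum/pole weight: power", power_ratio, " Borel", borel_ratio)
```

For these sample values the relative weight of the higher-mass point drops from one half to $e^{-2}$, which illustrates why the lowest state dominates after the transform.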

We shall see many examples below, and discuss the techniques used to take into account the 
spin properties. 


1.4.1 Finding Solutions to QCD Sum Rules


Given a current for a hadron, one can always find a sum rule of the form shown in Eq.(20).
The question of finding a solution is basic to the method. On the QCD side one calculates the
processes up to high enough dimension so that the last calculated term is small, say about 1%
of the largest term, and hopes that the higher dimensional terms will not be important. On the
phenomenological side one must start with a “good” current, which means that the current for the
hadron H must couple the one-H state to the vacuum so that the structure constant $\lambda_H$ in Eq.(12)
is not too small. With this good beginning the main error is in the treatment of the continuum.
Often the main uncertainty is the value of the threshold parameter for the continuum, called $s_0$.
The main criteria for solutions are 1) that one has obtained a region of the Borel mass (the plateau)
in which the sum rule Eq.(20) is satisfied with $M_H$ well within the plateau region, 2) that the
continuum contribution is not too large, usually less than 50%, and 3) that the highest dimension
terms on the QCD side are very small. This is discussed in many publications. A good discussion
of the Borel transformation and its significance in finding solutions is found in Refs.[5, 6].





2 Sum Rules For Masses: Two-Point Functions 


The study of hadronic masses with the QCD sum rule method is a direct application of the
treatment of the two-point correlator outlined in the previous section. One picks an appropriate
current for the hadron, $\eta_H(x)$, and defines the correlator as in Eq.(13). In this section we review
the work on masses with particular emphasis on gluonic hadrons, since these are of greatest interest
for current experiments.

2.1 Light-Quark Meson and Baryon Masses 

The earliest applications of the QCD sum rule methods were for meson and baryon masses, 
and these have been reviewed in a number of works. Therefore we only give a brief review here. 
However, the study of mesons in nuclear matter and at finite temperature is of great current 
interest, and we give some detail for use in later sections. Also, the early work on scalar mesons 
was not accurate as the mixing with scalar glueballs must be considered. This is discussed in some 
detail in the next subsection. 

2.1.1 Meson Masses


The currents for the mesons are given in Ref. [7], with a detailed discussion of the tensor mesons
in Ref.[8]. The currents for the scalar (s) ($0^{++}$), pseudoscalar (ps) ($0^{-+}$), vector (v) ($1^{--}$), and axial
vector (av) mesons are of the form

$$\eta^\Gamma(x) = \bar q(x)\Gamma q(x), \qquad (21)$$

where q(x) are the u,d-quark fields contracted over their color labels and the $\Gamma$’s are the Dirac
operators $\Gamma = 1, \gamma_5, \gamma_\mu$, and $\gamma_5\gamma_\mu$, respectively. The fifth Dirac operator, $\sigma_{\mu\nu}$, is not used directly.

The axial and tensor currents are more complicated: the axial (a), $\eta^{a} \sim \bar q\,(\cdots)\,q$; the tensor (t),
$\eta^{t}_{\mu\nu} = i\bar q(\gamma_\mu\partial_\nu + \gamma_\nu\partial_\mu)q$; and the axial tensor (at) $2^{-+}$, $\eta^{at}_{\mu\nu} = i\bar q(\gamma_\mu\gamma_5\partial_\nu + \gamma_\nu\gamma_5\partial_\mu)q$.

The correlator for each meson, $\Pi^\Gamma(p)$, is defined in Eq.(13) with a current $\eta^\Gamma(x)$ as defined
above. For the vector, axial vector and axial mesons the correlator is defined as

(22)

while for the tensor and axial tensor mesons the correlator is defined[8]

(23)

The QCD calculation of the correlators for the light-quark mesons, with the assumption of
vanishing quark masses ($m_q = 0$), consists of evaluating three processes shown in Fig. 4, the
perturbative diagram (plus a perturbative gluon exchange two-loop diagram not shown), the gluon
condensate diagram and the four-quark diagrams.

We now give the contribution of the four-quark diagram for each correlator, since this is of considerable
interest in later sections of this review. Let us define $\tilde Q^\Gamma$ as the four-quark matrix element
for each meson correlator, $\Pi^\Gamma$. The $\tilde Q^\Gamma$ are defined in terms of the five four-quark condensates

$$Q^\Gamma = \langle 0|:\bar q\,\Gamma\lambda^a q\,\bar q\,\Gamma\lambda^a q:|0\rangle, \qquad (24)$$

where $\lambda^a$ are the SU(3) Gell-Mann color matrices and $\Gamma$ are the five Dirac operators, and the
four-quark condensate

$$|J_0|^2 = \langle 0|:\bar q\gamma_\mu\lambda^a q\sum_f\bar q_f\gamma^\mu\lambda^a q_f:|0\rangle. \qquad (25)$$






Fig. 4 O.P.E. for meson correlator: (a) perturbative, (b) gluon condensate, (c) four-quark condensate


The Q have the following form: 


- s 

Q 

= 

Q‘ + ||.7o|^ 

Q 


0“‘+|iJoP. 

'XV 

Q 



xav 

Q 



^ a 

Q 

= 


'Xt 

Q 

= 

Q\ 

XQ>t 

Q 


( 3 “”. 


(26) 


With the phenomenological dispersion relation for the correlator given by Eq.(16), one finds the
form for the sum rules in $Q^2 = -p^2$ space


Q‘^ + M^ 


+ n,^ 


contiQ^) = + ( 27 ) 


with $g^2$ a structure constant to be determined and the other constants known. After the Borel
transform, with M the Borel mass, this gives sum rules for the s, v and t mesons




3_ . lla^ 

87r2^ 37r 


)M^ 


as<GG> 

Btt 


+ +cont. 


1 /-I . \ « j-2 . ois <G G > iros 

+ 12xM= + 


6 /■, 8a„<G-G> , , 

+ 4^asQ +cmt. 


(28) 

(29) 


The ps, av and at sum rules are the same as the s, v and t sum rules with substitution of the
superscripts. By dividing the sum rules by their derivatives with respect to $1/M^2$ one can eliminate
the unknown phenomenological structure constant, $g^2$. Satisfactory agreement with experiment is
found for all the mesons except the ps, which are pseudo-Goldstone bosons, and the scalar mesons,
which we discuss below. A recent discussion of methods of obtaining solutions to the sum rules is
given in Ref. [9].
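
The elimination of the structure constant by taking a ratio can be illustrated with a toy model. This is a sketch under stated assumptions, not the procedure of any cited reference: the Borel-transformed pole term is taken as $g^2\exp(-m^2\tau)$ with $\tau = 1/M^2$, and a flat continuum above a threshold $s_0$ contributes $c\,\exp(-s_0\tau)/\tau$; all numerical values are illustrative. Minus the logarithmic derivative with respect to $\tau$ returns $m^2$ exactly for the pure pole, and drifts away from it at large Borel mass when the continuum is not subtracted.

```python
import math

M2_TRUE = 0.6   # GeV^2, toy hadron mass squared (assumed)
G2 = 1.0        # toy structure constant squared (assumed)
C_CONT = 0.05   # toy continuum strength (assumed)
S0 = 1.5        # GeV^2, toy continuum threshold (assumed)


def pole_only(tau):
    """Borel-transformed pole term, tau = 1/M^2."""
    return G2 * math.exp(-M2_TRUE * tau)


def pole_plus_continuum(tau):
    """Pole term plus the integral of c*exp(-s*tau) over s > S0."""
    return pole_only(tau) + C_CONT * math.exp(-S0 * tau) / tau


def mass_sq_from_ratio(func, tau, step=1e-5):
    """Estimate m^2 = -(d func / d tau) / func by a central difference."""
    derivative = (func(tau + step) - func(tau - step)) / (2.0 * step)
    return -derivative / func(tau)


for borel_mass_sq in (0.5, 1.0, 2.0):
    tau = 1.0 / borel_mass_sq
    print(f"M^2 = {borel_mass_sq:.1f} GeV^2:"
          f"  pure pole -> {mass_sq_from_ratio(pole_only, tau):.4f}"
          f"  with continuum -> {mass_sq_from_ratio(pole_plus_continuum, tau):.4f}")
```

The structure constant drops out of the ratio, and the estimate is closest to the input mass at small Borel mass, where the continuum is most strongly suppressed.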






2.1.2 Factorization of Four-Quark Condensates


In the treatment of meson masses, which we have just discussed, a vacuum saturation hypothesis
has been used for the evaluation of the four-quark condensates in order to avoid introducing new
parameters. In Ref. [5] this approximation is derived. The factorization assumption gives

$$\langle 0|:\bar q\,\Gamma q\,\bar q\,\Gamma q:|0\rangle \simeq -\frac{\langle\bar qq\rangle^2}{144}\left(\mathrm{Tr}[\Gamma\Gamma] - (\mathrm{Tr}[\Gamma])^2\right). \qquad (30)$$

As discussed in Sec. 8 on the treatment of mesons at finite temperature, the approximation
cannot be used for hadrons in nuclear systems or in finite-T matter.

2.1.3 Heavy-Light Quark Mesons 


The heavy-light (Q-q) quark mesons are the charm-light D mesons and the bottom-light B
mesons. In the early days of the use of the QCD sum rule method these mesons were studied[10, 11].
After the concept of heavy quark symmetry was introduced[12] there has been a great deal of work
on these systems. For the B systems the masses and decay constants have been calculated using the
static heavy-quark approximation and the $1/m_Q$ expansion[25, 13, 14]; however, for the D systems
the charm quark mass might not be heavy enough for these approximations to be accurate.

The QCD sum rule method was used in a detailed study of the vector and pseudoscalar B and
D mesons without using the heavy-quark effective field theory assumptions[15]. The form of the
QCD expression, carrying out the O.P.E. to dimension 8, which should give very good accuracy, is

$$\Pi(q^2) = C_0 + C_3\langle\bar qq\rangle + C_5\langle\bar q(\sigma\cdot G)q\rangle + C_4\langle\alpha_s G^2\rangle + C_6\,\alpha_s\langle\bar qq\rangle^2 + C_7\langle\alpha_s G^2\rangle\langle\bar qq\rangle. \qquad (31)$$

The coefficients, labelled by the dimension of the local operators, are given in Ref. [15]. The
heavy-quark condensates are assumed to be very small and are neglected. Satisfactory values for
the vector and pseudoscalar masses were found. The leptonic decay constants, obtained from the
structure constants on the phenomenological side of the sum rules, were found to be $f_B$ = 95 MeV,
$f_{B^*}$ = 103 MeV, $f_D$ = 130 MeV and $f_{D^*}$ = 150 MeV for the ps and v B and D mesons, respectively.

2.1.4 Light-Quark Baryons 


In the early days of the QCD sum rule method the light-quark baryons were studied[16, 17, 7].
Here we give a very brief review. See Ref.[17] for the sum rules and a detailed description of the
results. Details about various aspects of the method are given in Ref. [18]. Since we shall discuss
the nucleon and the $\Delta(1232)$ resonance in nuclear matter in Chapter 7, and discuss the nucleon
in the subsection on isospin violations below in this chapter, we briefly review the use of QCD
sum rules for these baryons. Applications to the [almost] stable baryon octet and other baryon
resonances are very similar for the lightest state with each set of quantum numbers. The current
most widely used for the proton is

$$\eta_p(x) = \epsilon^{abc}[u^{aT}(x)C\gamma_\mu u^b(x)]\gamma^\mu\gamma^5 d^c(x), \qquad (32)$$

where the colors a,b,c are summed over and $C = -C^T$ is the charge conjugation operator. The
choice for the proton current is not unique, but arguments have been given[16, 19] for this to be
the best current for use in the sum rules. Keeping all processes through dimension nine, the two
sum rules obtained as coefficients of $\not p$ and 1 are

M® MH c? a^ml mdaM^ nidaml 
M,/M ^continuum = + + 4 24 

mparn^ 

12 


( 33 ) 





aM'^ ab 34 rridM^ rridbM^ 

“4 n'^ ^7rM2 4 ~ 32 

TTldCL TTliiCL"' 

— + — . ( 34 ) 

where $a = -(2\pi)^2\langle\bar qq\rangle$, $b = \langle g^2G^2\rangle$, $a m_0^2 = (2\pi)^2\langle g\bar q\sigma\cdot Gq\rangle$, $\beta^2 = (2\pi)^4\lambda^2/4$, with the
structure constant defined by $\langle 0|\eta_p|p\rangle = \lambda v$, where v is the proton Dirac spinor. The quantity
M in Eqs. (33,34) is the Borel mass. We have included the quark mass terms, with the u and
d current quark masses for the discussion of isospin splitting below, but have not distinguished
between the u and d quark condensates for reasons given below. The continuum is treated by
ensuring regularity at large M, with an important parameter in the fit being $s_0$, the threshold
cm energy squared for the continuum[5]. From these sum rules, corrected to account for the
continuum and for the anomalous dimensions of the operators as shown in the $\Delta(1232)$ sum rules
below, the proton mass can be determined to an accuracy of about 10 per cent. The structure
constant, $\lambda$, agrees with lattice gauge calculations to about a factor of two.

The current with no derivatives for the $\Delta^{++}$ is unique:

$$\eta^\Delta_\mu(x) = \epsilon^{abc}[u^{aT}(x)C\gamma_\mu u^b(x)]u^c(x). \qquad (35)$$

Defining the correlator with this current as $\Pi^\Delta_{\mu\nu}$, one finds two useful sum rules[20] by using $\mathrm{Tr}[\Pi^\Delta]$
and $\mathrm{Tr}[\not p\,\Pi^\Delta]$. They are, respectively,

(36)

and

(37)

In Eqs.(36,37) the quantities a, b, $m_0$ and M are as defined above. The functions $E_0 = 1-\exp(-x)$,
$E_1 = E_0 - x\exp(-x)$, $E_2 = E_1 - (x^2/2)\exp(-x)$ are introduced to regulate the large M behavior of the
continuum, with $x = s_0/M^2$ and $s_0$ the continuum threshold parameter; and $L = 0.621\ln(10M)$ takes
into account the anomalous dimensions of the various operators. In the analysis, described in
detail in Ref.[20], it was found that $M_\Delta \simeq 1.35$ GeV $\pm$ 10%.
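
The continuum factors can be evaluated directly. The sketch below assumes the usual truncated-exponential form $E_n(x) = 1 - e^{-x}\sum_{k=0}^{n}x^k/k!$, in which $E_2$ carries the factor $x^2/2$, and checks it against the integral $\frac{1}{n!}\int_0^x t^n e^{-t}dt$ that arises when the spectral integral is cut off at $s_0$. The function names and the sample values of $s_0$ and M are ours, chosen for illustration.

```python
import math


def continuum_factor(n, x):
    """E_n(x) = 1 - exp(-x) * sum_{k=0..n} x^k / k!  with x = s0 / M^2."""
    partial_sum = sum(x ** k / math.factorial(k) for k in range(n + 1))
    return 1.0 - math.exp(-x) * partial_sum


def truncated_gamma_integral(n, x, steps=2000):
    """(1/n!) * integral_0^x t^n exp(-t) dt by Simpson's rule (steps must be even)."""
    def integrand(t):
        return t ** n * math.exp(-t)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / math.factorial(n)


def anomalous_dimension_log(borel_mass):
    """L = 0.621 ln(10 M), with the Borel mass M in GeV, as quoted in the text."""
    return 0.621 * math.log(10.0 * borel_mass)


s0, borel_mass = 2.3, 1.2          # GeV^2 and GeV, illustrative values only
x = s0 / borel_mass ** 2
for n in range(3):
    print(f"E_{n}({x:.3f}) = {continuum_factor(n, x):.5f}"
          f"   integral form = {truncated_gamma_integral(n, x):.5f}")
print(f"L = {anomalous_dimension_log(borel_mass):.4f}")
```

Each factor tends to 1 when $s_0 \gg M^2$ and to 0 when $s_0 \ll M^2$, so the perturbative terms they multiply are cut off above the continuum threshold.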

These are typical examples of the use of QCD sum rules to determine the lightest baryon
mass for each set of quantum numbers. Recently there have been studies of the negative-parity nucleon
resonance[21] and other $1/2^-$ baryons[22] using the same formalism as in, e.g., Ref.[17].

We now turn to gluonic hadrons. 


2.2 Glueballs, Scalar Glueball/Sigma 


Glueballs, hadrons with gluons as valence particles, are of great interest for the study of the
nature of QCD. We expect that the study of glueballs and hybrids, hadrons with both gluons and
quarks as valence particles, will dominate hadron spectroscopy in the near future. There have
been many theoretical studies of glueballs using the QCD sum rule method[5, 23, 6, 24, 25, 26,
28, 29, 30, 31]. For spectroscopy the scalar glueballs are important not only for their own sake
but also because scalar mesons cannot be studied without considering scalar glueballs, owing to
the important scalar meson-glueball mixing[29]. This is a very hot topic, which we discuss in this
subsection.





2.2.1 Glueballs 


The formalism for treating scalar glueballs can be found in Ref. [23]. The scalar glueball current
is taken to be

$$J_G(x) = \alpha_s G(x)\cdot G(x). \qquad (38)$$

One uses the once-subtracted dispersion relation, which means that instead of writing the sum
rule for the correlator $\Pi^G(Q^2)$ (Eq.(13) with the current $J_G$), one uses $(\Pi^G(Q^2) - \Pi^G(0))/Q^2$. After
the Borel transform the sum rule is

-1.0) + ^£;i(so) = 2^M* + 4a‘l{G-G) (39) 

M2 M^ ’ 

with r(®) = {gsfabcGl^Gl^G^^) and = (14(/a6cG^^<3tp)=^ - ifabcGl^Gl^f). The values of 
the parameters have been discussed in Refs. [25, 26, 28]. A low energy theorem[6] is used for the 
phenomenological residue, IIgCO) 3.5 < >. 

The remarkable result of the recent calculations of scalar glueballs[25, 26, 28, 27, 29] is the 
prediction of a light scalar glueball. The greatest uncertainty in the calculation is the value of 
In Ref.[29] the sum rule was solved for a range -.01 < < .OSGeV^, with the solutions 

predicting a scalar glueball in the range 300 MeV < $M_{gb}$ < 600 MeV. It seems that if one uses the
low energy theorem for the gluonic correlator one always finds a good sum rule solution for such a
light scalar glueball. No such glueball has been found in lattice gauge calculations[34]; however, the
dominant nonperturbative effects in the QCD calculations include quark loops that have not yet
been included in glueball calculations. We discuss our own conjecture of a scalar glueball/sigma
system in the following subsection.

The $f_0(1500)$, discovered in the Crystal Barrel experiment [35], is a scalar glueball candidate.
From the two-meson branching ratios[36] it is evident that this resonance is not a pure glueball,
but is likely to be a mixed scalar glueball-meson system. We return to this from the point of view
of QCD sum rules in the subsection below. It should also be noted that the dominant decay of
the $f_0(1500)$ is into four pions (two sigmas). We return to this in the next subsection on the scalar
glueball/sigma.

The tensor glueball is also of great current interest. The $\xi(2230)$ discovered in the MARK III[37]
and BES collaboration[38] experiments is a candidate for a tensor glueball, although its spin-parity
has not yet been determined. It also shows a strong four-pion (two-sigma) decay, which might be
a signal for glueballs, as we discuss below. The gauge invariant current for the tensor glueball
which has been used in QCD sum rule calculations is [6]

Solutions using the QCD sum rule method are found in the range of 2.0-2.7 GeV. These are values
found in lattice gauge calculations. See Ref. [31] for a recent review of calculations of the tensor
glueball.

An important consideration for the study of glueballs is the mixing with mesons, and this is
particularly important for the scalar case, where there is a low energy theorem[6] that is most
important for sum rule calculations, as we shall discuss below. Estimates of the scalar glueball-meson
mixing[32] and of the tensor glueball-meson mixing[33], without the use of the low-energy
theorems of Ref.[6], have found very small mixing.





2.2.2 Scalar Glueball/Sigma


The sigma has been a part of nuclear physics for decades in various forms. When it was found
that meson exchange potentials using the known mesons could not produce enough attraction to
explain the nuclear force, a scalar meson, the $\sigma$, was introduced. It often has a mass in the range of
600-800 MeV. In effective chiral field theories the $\sigma$ is introduced as a chiral partner to the pion.
More recently, the sigma has been identified as a pi-pi low-energy resonance. That is what we refer
to as the sigma in the present paper.

In a K-matrix analysis of the I=0, L=0 channel of the $\pi$-$\pi$ scattering amplitude[39] a
low-lying resonance was found. In this analysis the $f_0(980)$ and higher resonances were subtracted
out. The remaining amplitude shows an almost elastic Breit-Wigner form:

$$A^{(L=0)}_{\pi\pi} \simeq \frac{M_\sigma\Gamma_\sigma}{M_\sigma^2 - s - iM_\sigma\Gamma_\sigma}, \qquad (41)$$

with $M_\sigma \simeq 400$ MeV. This resonance is what we refer to as the sigma.

Moreover, in an analysis of the dominant 4-$\pi$ branching ratios of all of the current candidates
for glueballs the BES group found[38] that this decay channel is completely dominated by the
two-$\sigma$ mechanism. GLUE IS STRONGLY COUPLED TO THE SIGMA! Furthermore, the mass
of the sigma is in the range of our scalar glueball solution.

From these observations we have made our scalar glueball/sigma ansatz: The sigma phenomenon
is a two-channel glueball, two-$\pi$ system. With this ansatz we can extract the sigma-glueball
coupling constant:

$$g_\sigma \simeq \Gamma_\sigma \simeq 260\ \mathrm{MeV}. \qquad (42)$$

If this picture is valid it will provide a signal for gluonic hadrons. 

2.3 Mixed Scalar Glueballs and Mesons 


It was pointed out in the early study of scalar glueballs[6] that there is a low-energy theorem 
for the coupling of the scalar glueball and meson currents 

$$\int d^4x\,\langle 0|T[J_G(x)J_m(0)]|0\rangle \propto \langle\bar qq\rangle. \qquad (43)$$

This indicates that there might be important mixing between the scalar glueballs and mesons.
Indeed this is the case. In Ref.[29] the current for a mixed scalar meson-glueball

$$J_{0^{++}} = \beta M_0 J_m + (1 - |\beta|)J_G, \qquad (44)$$

with $M_0$ = 1 GeV, was used. This leads to a QCD sum rule with a microscopic side after the Borel
transform

+ + (45) 

Solving the sum rule for $\beta = 0$ gives the light scalar glueball solution discussed above. The unmixed
meson solution found with $\beta = 1$ was shown[40] to give a solution with a mass of about 1 GeV,
which was interpreted as a successful fit to the $f_0(980)$. Most people working in meson spectroscopy
now believe that the $f_0(980)$ is not a meson (meaning a $q\bar q$) state, but is more likely to be a more
complicated state.

The solutions found in Ref.[29] were an almost meson solution with $|\beta| \simeq 0.8$ at about 1400
MeV, corresponding to the $f_0(1370)$ meson-like resonance, and an almost glueball solution with
$|\beta| \simeq 0.2$ corresponding to the $f_0(1500)$ found in the Crystal Barrel experiment[35]. Therefore we see






that in the channel with vacuum quantum numbers ($0^{++}$) meson-glueball mixing must be treated
for an accurate theoretical description.

The summary of the QCD sum rule studies of scalar hadrons at the present time is shown in
Fig. 5. The $f_0(980)$ is not shown, as discussed above. There are two established $f_0$ hadrons, the
$f_0(1370)$ and the $f_0(1500)$, which we have just discussed. If the $f_J(1710)$ proves to be another $f_0$
(i.e., J=0) then it is likely that the three $f_0$ states are admixtures of light-quark, strange quark
and glueball scalar systems, which would be consistent with lattice gauge calculations[34]. Such
admixtures have been discussed in Ref.[31]. It is now most important to seek experimental signals
for such states.

2.4 Hybrid Hadrons 

Hybrid hadrons have valence glue. The QCD sum rule method is an excellent tool for studying
hybrids as they can be quite precisely defined, which is not the case for many quark models. For
example, a current of the form

is a hybrid meson, since the quark-antiquark system carries net color which combines with the valence
glue to give a color = 0 hybrid. Mesons are excellent systems for hunting hybrids since there are
exotic quantum numbers which cannot exist in $q\bar q$ systems. An example is $J^{PC} = 1^{-+}$. Early
in the QCD sum rule days calculations of the $1^{-+}$ were carried out[41, 42, 43], with the authors
predicting this exotic state at about 1.5 GeV. Recently the Crystal Ball group has reported a $1^{-+}$
state at about 1.4 GeV[44] seen in the $\eta$-$\pi$ channel, in nice agreement with theory. Flux-tube
models predict[45, 46] the state at about 1.8 GeV. The nature of the state as a four-quark or hybrid
is not certain, and a study of the $\eta$-$\pi$ decay[47] gives valuable information about the nature of
this exotic system.

Since baryons do not have exotic quantum numbers, it is more difficult to determine that one
has found a hybrid baryon. There have been a number of QCD sum rule calculations for the $J^P =
1/2^+$ hybrid, with the same quantum numbers as the nucleon[48, 49, 50]. In Ref.[50] the current
used for the hybrid was

flHix) = (47) 

which is conveniently renormalized. Note that the form of Eq.(47) with the properties of the SU(3)
color generator, $\lambda^a$, ensures that the three quarks are not in a color zero configuration and that the
gluon is a valence particle.







Fig. 6 Lowest dimension correlator for baryon and hybrid baryon: (a) baryon, (b) hybrid baryon


The correlators of the nucleon and the hybrid are illustrated in Fig. 6. A very stable solution
was found at $M_H \simeq 500$ MeV above the nucleon mass. In other words it is predicted that the
$P_{11}(1440)$, the so-called Roper resonance, is a hybrid, or at least that there is a hybrid at that
mass. Subsequently [30], it was shown that the nucleon has very little hybrid admixture. Using
the idea of the glueball/sigma discussed above, it was shown[51] that the decay of the Roper into a
nucleon and a two-pi (sigma) could be a test of the hybrid nature of this baryon. The same applies
to the decays of many hybrid hadrons.

2.5 Isospin Mass Splittings; Gauge Invariant Electromagnetic Effects;
Electromagnetic Penguins

In QCD the only sources of isospin violations are the quark mass differences, and the mass
splittings of hadronic isospin multiplets in principle give important information about the current
quark masses in the QCD Lagrangian. In quark models the sources of isospin mass splittings are
electromagnetic effects and the constituent quark mass differences. For example, for the neutron-proton
mass splitting the d quark must be heavier than the u quark since electromagnetic effects
would make the proton heavier than the neutron. In the QCD sum rule method it is much less
straightforward. As one can see from the QCD correlator for the proton, shown in Eqs. (33,34),
there are three sources of isospin splitting

quark mass difference = $m_d - m_u$

condensate difference = $\langle\bar uu\rangle - \langle\bar dd\rangle$ (48)

electromagnetic effects

In a study of the neutron-proton mass differences Eqs. (33,34) were modified to treat the resonances,
the continuum and the anomalous dimensions, and the I-spin splittings of the quark and mixed
condensates[52]. Although it was not pointed out in that paper, one can show that if one neglects
electromagnetic and condensate effects the sum rules predict

$$m_d > m_u \Longrightarrow M_p > M_n. \qquad (49)$$

This is certainly counter-intuitive, since one expects that the neutron, with two d quarks and one
u quark, would be more massive than the proton, with one d quark and two u quarks, if the d-quark
is more massive than the u-quark. This implies that the I-spin splittings of the condensates, which
in turn must arise from the d-u mass differences, give a larger effect than the quark mass difference
itself. Since we are dealing with nonperturbative phenomena, we can expect such surprises. We shall
not give a review of the QCD sum rule calculations of the baryon I-spin mass differences, since there
is a problem in all of the calculations, including our own, in treating the electromagnetic corrections.
We now discuss this.






Fig. 7 Photon loops, usual QED diagrams 




Fig. 8 Additional photon loop diagrams for electrically charged meson 


2.5.1 Isospin Splittings of Charm and Bottom Mesons: Gauge Invariance 


The heavy-light quark mesons are excellent systems in which to study isospin violations since
the nonperturbative QCD effects are smaller than in light-quark systems. As discussed above the
QCD sum rule method gives satisfactory agreement with experiment for masses of the vector and
pseudoscalar B and D mesons. For calculations of isospin mass splittings one must include the
effects shown in Eq.(48). This was done in a calculation[53] of B (ps), B* (vector), D and D* mass
splittings. The startling thing was that the standard two-loop photon diagrams expected to give
the correct QED effects to order $\alpha_{QED}$, shown in Fig. 7, are not gauge invariant. I.e., if one makes
a gauge transformation

$$A_\mu(x) \to A_\mu(x) - \partial_\mu\Lambda(x),$$
$$q(x) \to e^{ie_q\Lambda(x)}q(x), \qquad (50)$$
$$Q(x) \to e^{ie_Q\Lambda(x)}Q(x),$$

where q and Q are the light-quark and heavy-quark fields with charges $e_q$ and $e_Q$, $A_\mu$ is the electromagnetic field and $\Lambda$
is the gauge function, then the correlator (say for a scalar meson)

$$\Pi(q^2) = i\int d^4x\, e^{iq\cdot x}\,\langle T[J(x)J(0)]\rangle, \qquad (51)$$

with $J(x) = \bar q(x)Q(x)$, is not gauge invariant. In order to obtain a gauge invariant solution,
consider the correlator with the QED link operator

$$\Pi(q^2) = i\int d^4x\, e^{iq\cdot x}\,\langle T[J(x)\exp[iQ_{op}\int_y^x dv\cdot A(v)]J(y)]\rangle, \qquad (52)$$

where $Q_{op}$ is the charge operator. By expanding the link operator it was shown[54] that the
additional vertex diagrams shown in Fig. 8 must be included, and that with all of the two loop
diagrams shown in Figs. 7,8 the calculation is gauge invariant. If one uses the quark mass difference
$m_d - m_u$ = 3.8 MeV, estimated from broken chiral symmetry[55], and takes values for the parameter





Fig. 9 QED penguin diagram 


$\gamma$ for the quark condensate isospin splitting 

    $\gamma \equiv \frac{\langle \bar d d \rangle}{\langle \bar u u \rangle} - 1$    (53) 


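in the range -0.002 to -0.0079, as estimated in fits to the neutron-proton mass splitting[52, 56], then 
one finds[54] 

    $M_{D^\pm} - M_{D^0}$ = 4.3 to 4.7 MeV 
    $M_{D^{*\pm}} - M_{D^{*0}}$ = 2.6 to 3.0 MeV 
    $M_{B^\pm} - M_{B^0}$ = -3.2 to -3.0 MeV 
    $M_{B^{*\pm}} - M_{B^{*0}}$ = -3.6 to -3.5 MeV    (54) 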


for the pseudoscalar and vector, D and B splittings, respectively. These results are quite satisfactory 
for the charmed (D) mesons, but the mass splittings for the bottom mesons are much too large 
in comparison with experiment[57]. As expected, the nonperturbative quark condensate I-spin 
violation effect is small, so to the extent that the QED effects have been taken care of correctly, 
these mass splittings should be a good source of information on the current quark mass splittings. 
Clearly there is a problem with the B mass splittings. We believe that the source of the error is a 
missing process, the QED penguin diagrams. We now discuss this. 


2.5.2 QED Penguins and (B,D) I-spin Mass Splittings 


Several years ago it was pointed out that there is a QED effect that modifies the quark-gluon 
vertex in a manner analogous to the so-called penguin modification of the quark-gluon vertex by the weak 
fields[58]. For the heavy-light mesons these QED penguin modifications are illustrated in Fig. 9, 
which shows the penguin modification of the two-loop perturbative gluon diagram. This diagram 
and penguins for the mixed quark condensate processes were added to the calculation of Ref. [54] 
in Ref. [59]. It was shown that with the penguins the theory is consistent with the very small B 
isospin splitting, but there were inaccuracies in the calculation. The effects in the D systems were 
small. 





3 Sum Rules for Coupling Constants: Three-Point Functions 


The coupling of a field to a hadron involves a three-point function. Consider a current 
$J^\Gamma(y) = \bar q(y) \Gamma q(y)$ coupling to a hadron $\alpha$ with a composite field operator $\eta_\alpha(x)$ to form another 
hadron $\beta$ with a composite field operator $\eta_\beta(x)$. In field theory this is treated by the three-point 
function 

    $V_{\beta\alpha}(p,q) = \int d^4x \int d^4y\, e^{ix\cdot p} e^{-iy\cdot q} \langle 0|T[\eta_\beta(x) J^\Gamma(y) \bar\eta_\alpha(0)]|0\rangle$.    (55) 

The treatment of three-point functions is much more complicated than two-point functions. More- 
over, for coupling constants, in which there is very little or zero momentum transfer at the vertex, 
the O.P.E. used for the two-point correlator via the Borel transform cannot be used directly as the 
point y in Eq.(55) is not at short distance from points x or 0 for small q. To solve this problem 
the external field method was introduced into QCD sum rules, which we now discuss. 

3.1 External Field Method: Two-Point Correlator 

In the study of the magnetic dipole moments of nucleons the problem of avoiding the O.P.E. 
in the y variable was discussed at length[60] in a three-point calculation using the general ideas 
of Ref. [5]. To avoid these difficulties a two-point correlator for a nucleon in an external field was 
introduced [61] for the case of electromagnetism. For an external current $J^\Gamma$ coupled to a proton 
the correlator is 

    $\Pi^\Gamma(p) = i \int d^4x\, e^{ix\cdot p} \langle 0|T[\eta_p(x) \bar\eta_p(0)]|0\rangle_{J^\Gamma}$,    (56) 

where $\eta_p$ is the nucleon current given in Eq.(32). As can be seen from Eq.(56) the microscopic 
evaluation of $\Pi^\Gamma(p)$ can be done using the operator product expansion, since the variable x is at 
short distance from the origin. This is done by an O.P.E. of the quark propagator in the presence 
of the current $J^\Gamma$, 

    $S_q^\Gamma(x) = \langle 0|T[q(x) \bar q(0)]|0\rangle_{J^\Gamma}$ 
    $\qquad\;\; = S_q^{\Gamma,PT}(x) + S_q^{\Gamma,NP}(x)$,    (57) 

where $S_q^{\Gamma,PT}$ is the quark propagator coupled perturbatively to the current and $S_q^{\Gamma,NP}$ is the 
nonperturbative quark propagator in the presence of the external current, $J^\Gamma$. 

For the two-point treatment at low momentum transfer the O.P.E. for $S_q^{\Gamma,NP}(x)$ is justified as 
in the ordinary two-point function, giving 

-V x^r 

= ■^<0\:qTq:\0>jr+j^<0\:qa-Grq:\0>jr+... (58) 

This is illustrated in Fig. 10. This method has been applied to the couplings of a number of fields 
to the nucleon. 

3.1.1 Electromagnetic Field: Nucleon Magnetic Dipole Moments 


For the electromagnetic field the field-quark coupling is given by the quark charge, $e_q$, as shown 
in Fig. 10(a). The field polarizes the quark condensate, as shown in Fig. 10(b). The corresponding 
nonperturbative constant, called the magnetic quark susceptibility, $\chi$, is defined by 

    $\langle \bar q \sigma_{\mu\nu} q \rangle_F = e_q \chi F_{\mu\nu} \langle \bar q q \rangle$,    (59) 






Fig. 10 (a) field-quark coupling (b) magnetic susceptibility (c) mixed susceptibility 


where $F_{\mu\nu}$ is the QED field tensor. Two other susceptibilities are associated with the polarization 
of the mixed quark condensate[61], illustrated in Fig. 10(c). 

There are three covariants in the expression for the correlator in the external electromagnetic 
field, so one obtains three sum rules for the proton and neutron, two of which are used. By manipulating 
these equations one can eliminate the susceptibilities and obtain approximate expressions 
for the neutron and proton moments in terms of the quark condensate: 


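    $\mu_p \simeq \frac{8}{3}\left(1 + \frac{a}{6m^3}\right)$, 
    $\mu_n \simeq -\frac{4}{3}\left(1 + \frac{2a}{3m^3}\right)$,    (60) 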


where $a = -(2\pi)^2 \langle \bar q q \rangle \simeq 0.55$ GeV$^3$. With an estimated 10% error the method is in agreement 
with the experimental values of $\mu_p^{exp}$ = 2.79 nm, $\mu_n^{exp}$ = -1.91 nm. Estimates of the magnetic quark 
susceptibility give 

XaMp ~ -(3.4 ^ 6.0)GeF2. (61) 

Estimates of $\chi$ have been made in Refs. [60, 62] using rho-meson dominance models. We shall return 
to the determination of vacuum condensate susceptibilities using a three-point method below. 
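
As a rough numerical check of the 10% statement above, the following sketch evaluates approximate condensate expressions for the moments. It assumes the commonly quoted forms $\mu_p \simeq (8/3)(1 + a/6m^3)$ and $\mu_n \simeq -(4/3)(1 + 2a/3m^3)$, with $a$ = 0.55 GeV$^3$ from the text; the nucleon mass value m = 0.938 GeV and the function name `nucleon_moments` are illustrative assumptions, not taken from the text.

```python
# Sketch only: the formulas and the nucleon mass below are assumptions
# stated in the lead-in, not a reproduction of the original calculation.
A_COND = 0.55       # a = -(2*pi)^2 <qbar q>, in GeV^3 (value quoted in the text)
M_NUCLEON = 0.938   # nucleon mass in GeV (assumed)

MU_P_EXP = 2.79     # experimental proton moment, nuclear magnetons
MU_N_EXP = -1.91    # experimental neutron moment, nuclear magnetons


def nucleon_moments(a=A_COND, m=M_NUCLEON):
    """Return (mu_p, mu_n) in nuclear magnetons from the assumed expressions."""
    r = a / m**3
    mu_p = (8.0 / 3.0) * (1.0 + r / 6.0)
    mu_n = -(4.0 / 3.0) * (1.0 + 2.0 * r / 3.0)
    return mu_p, mu_n


mu_p, mu_n = nucleon_moments()
for label, theory, exp in (("proton", mu_p, MU_P_EXP), ("neutron", mu_n, MU_N_EXP)):
    deviation = abs(theory - exp) / abs(exp)
    print(f"{label}: sum rule {theory:+.2f} nm, experiment {exp:+.2f} nm, "
          f"relative deviation {deviation:.1%}")
```

Under these assumptions both moments land within the quoted 10% of experiment; the result is sensitive to the value taken for $a$.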


3.1.2 Other Fields and Coupling Constants 


The external field method has been used for the calculation of the coupling constants for a 
number of field-nucleon vertices. The method is similar to the calculation of the nucleon magnetic 
dipole moments reviewed in the previous subsection, so we shall not give details. There have been a 
number of applications of this two-point method for the calculation of the axial coupling constant 
($g_A$) [63, 64, 65] and for the isoscalar axial coupling constant ($g_A^S$)[65]. From the appropriate 
sum rules the axial vector susceptibility has been estimated. There has also been an estimate of 
the vacuum tensor susceptibility[66] in work on the tensor charge of the nucleon; however, it has 
been pointed out[67] that the treatment of the nucleon tensor charge is subtle and that different 
theoretical treatments can give very different results. In all of these calculations there are new 
unknown susceptibilities. They are treated in a variety of ways, often manipulating the sum rule 
equations to eliminate them. We shall show that in an approximation to the basic three-point 
method there is an almost universal value for these susceptibilities. We shall discuss the strong 
and weak pion-nucleon coupling separately. 

3.2 Three-point Function Method for Vacuum Susceptibilities 


Let us now return to the three-point function formulation, Eq.(55), for the current $J^\Gamma$ coupling 
to the nucleon, 

    $V^\Gamma(x,y) = \langle 0|T[\eta(x) J^\Gamma(y) \bar\eta(0)]|0\rangle$,    (62) 






with $\eta$ given in Eq.(32). Note that $V^\Gamma(x,y)$ has terms with vacuum matrix elements involving 
two, four, six and eight quark fields. It is the four-quark term that is of interest to us for the 
study of the vacuum susceptibilities. This is illustrated in Fig. 11. The process of Fig. 11(a) is the 
coupling of the field to the perturbative quark, while the process of Fig. 11(b) corresponds to the 
polarization of the quark condensate by the $\Gamma$ field in the external field method. In Ref.[68] the 
pion current with the form $\Gamma = \gamma_5$ was considered. The four-quark term of the three-point function 
corresponding to this pion current is 

Tr[Sf (x))’’C'7l. ( 63 ) 


The quark $\Gamma$ susceptibility, corresponding to Fig. 11(c), is given by 


= -i jd^yKOl: q%x)^{y)Tq%yW'{Q ) : |0 > ( 64 ) 


in the limit of zero momentum transfer. It is evaluated using factorization and a 
nonlocal quark condensate, 

    $\langle 0|:\bar q(0) q(y):|0\rangle = g(y^2)\, \langle 0|:\bar q(0) q(0):|0\rangle$.    (65) 

The form for the quark condensate nonlocality in Ref. [68] was taken to be 

    $g(y^2) = \frac{1}{(1 + \kappa^2 y^2/8)^2}$,    (66) 

with the parameter $\kappa$ evaluated using the sea-quark distributions obtained with the QCD sum rule 
method of deep inelastic scattering of Ref. [69]. The resulting expression for the nonperturbative 
quark propagator in the $\Gamma$ field is 

    $S_q^{\Gamma,NP}(x) \simeq \Gamma\, G(x) \left(\langle 0|:\bar q(0) q(0):|0\rangle / 12\right)^2$, 
    $G(x) = (-i) \int d^4y\, g(y^2)\, g((x-y)^2)$.    (67) 

For the case of the pion susceptibility 


< 0| : gir37®g : |0 >T = < Oj : 35 : |0 >, 

(68) 


one finds that 

    $\chi_\pi a \simeq -(1.7 \text{ to } 3.0)$ GeV$^2$,    (69) 





in agreement with the value $\chi_\pi a \simeq -1.88$ GeV$^2$ found in Ref.[70], which we discuss in the next 
subsection. Here we have changed the sign of $\chi_\pi$ from that used in Ref. [68] in order to agree with 
the sign choice in Ref. [70]. Note that this value is more than an order of magnitude different from 
the value one obtains using a two-point method with PCAC[71]. There has also been an estimate 
of $\chi_\pi$ using effective chiral quark theory[72]. 

In subsequent work[73] this formalism was shown to be in agreement with all known sus- 
ceptibilities. Therefore the expression of Eq.(67) is an approximate general relationship for all 
susceptibilities. 

3.3 Pion-Nucleon Strong and Weak Coupling 

The pion-nucleon coupling constant was determined many years ago from an extrapolation of 
$\pi$-N scattering, as well as two-nucleon properties: $g_{\pi N} \simeq 13.4$. It has been estimated with QCD 
sum rules using two different two-point methods. 

3.3.1 Pion-Vacuum Correlator for $g_{\pi N}$ 

Before the external field method was published, $g_{\pi N}$ was estimated using a two-point correlator 
but with a low-energy pion in the initial state [7] rather than an external pion field as in Eq.(57): 

    $\Pi^\pi(p) = i \int d^4x\, e^{ix\cdot p} \langle 0|T[\eta(x) \bar\eta(0)]|\pi\rangle$.    (70) 

Later a much more detailed calculation of $g_{\pi N}$ was carried out using this method[75]. Since 
there is no external field, in this method there is no pion susceptibility. The pion quark vacuum 
susceptibility is replaced by the matrix element 

<01i57®9k> = — f— , (71) 

Jw 

which has been evaluated using a soft pion theorem, where $f_\pi$ is the pion decay constant. In Ref. [75] 
the difficulty in getting a reliable value for $g_{\pi N}$ is discussed in detail. 

3.3.2 External Pion Field for $g_{\pi N}$ 


The external field method was also used for determining the $\pi$-N strong and weak coupling 
constants, with the nucleon correlator in an external pion field: 

    $\Pi^\pi(p) = i \int d^4x\, e^{ix\cdot p} \langle 0|T[\eta_p(x) \bar\eta_p(0)]|0\rangle_\pi$.    (72) 

The weak parity-violating (PV) pion-nucleon coupling constant, $f_{\pi N}$, is of particular interest because 
of its sensitivity to the neutral currents contribution of weak nonleptonic processes at low 
energies[76]. The theoretical prediction of $f_{\pi N}$ is an important and challenging problem. The most 
accurate PV experiments have only shown[77, 76] that the upper limit for the magnitude of this 
coupling constant is 3-5 times smaller than the “best value” predicted by DDH [78] on the basis of a 
quark model and somewhat smaller than that in a similar calculation carried out more recently[79]. 
Since that time others have tried to estimate $f_{\pi N}$ by means of chiral soliton models[80, 81] and 
QCD sum rules[82]. 

Recently the QCD sum rule external field method was used for the simultaneous calculation of 
both $g_{\pi N}$ and $f_{\pi N}$, making use of an important observation illustrated in Fig. 12. In comparison of 
the lowest-dimensional diagrams for $g_{\pi N}$ [Figs. 12a,b,c,d] and those for $f_{\pi N}$ [Figs. 12e,f,g,h], note 
that all diagrams in which the weak gauge boson, $Z^0$, is exchanged between the quark coupled to 
the pion and one of the other quarks vanish! That is, the $Z^0$ is exchanged only between spectator 
quarks. The weak $W^\pm$ exchanges vanish as expected from known symmetries[78, 83]. This enables 






Fig. 12 Lowest dimension diagrams for strong and weak $\pi$-N coupling 


one to use the known value of $g_{\pi N} \simeq 13.4$ to estimate the most important unknown in the problem, 
the pion quark condensate susceptibility, $\chi_\pi$. The result is that $\chi_\pi a \simeq -1.88$ GeV$^2$, in agreement with the 
three-point estimate of Ref. [68]. The predicted value of the weak parity violating $\pi$-N coupling constant 
is $f_{\pi N}$ = (3.04 ± .01) X lO”*^. A value close to this was obtained using a chiral quark model[72]. 

4 Form Factors and Reactions 


To treat the elastic form factor of a hadron one must consider three-point functions. Since there 
is an extra variable, the momentum transfer, in comparison to the two-point correlators used to 
obtain masses, in the QCD sum rule approach one must deal with double dispersion relations, 

    $F(Q_1^2, Q_2^2, Q^2) = \int ds_1 \int ds_2\, \frac{\rho(s_1, s_2, Q^2)}{(s_1 + Q_1^2)(s_2 + Q_2^2)}$,    (73) 


where $\rho(s_1, s_2, Q^2)$ is the spectral function and the variables used are $(Q_1^2, Q_2^2) = -(p_1^2, p_2^2)$, the 
external momenta, and $Q^2 = -q^2$. For the phenomenological side of the sum rule, the elastic form 
factor is obtained by placing the external momenta on shell, so that $s_1 = s_2 = M^2$. For the study 
of nonperturbative as well as perturbative QCD one wishes to study $F(Q^2)$ for all values $0 \le Q^2 < \infty$, 
and the challenge is to treat the low and medium values of $Q^2$. 


4.1 Form Factors and Nonlocal Condensates 


One approach to treating the low and medium momentum transfer form factors is to introduce 
nonlocal condensates [84], as was done in a study of the pion form factor [85]. The nonlocal 
condensate, defined in Eq.(65), replaces the OPE for the nonperturbative quark propagator, as 
shown in Eq.(10). This enables one in principle to treat low $Q^2$. In effect the condensates carry 
momentum, as has been discussed in several publications[86, 87, 88]. 

The phenomenological side of $F(Q_1^2, Q_2^2, Q^2)$ includes the pion elastic form factor by a term 

    $\rho_\pi(s_1, s_2, Q^2) = \pi^2 f_\pi^2 F_\pi(Q^2)\, \delta(s_1 - m_\pi^2)\, \delta(s_2 - m_\pi^2)$,    (74) 

which enables one to obtain sum rule expressions for $F_\pi(Q^2)$. The study in Ref. [85] shows that the 
soft part of the form factor (i.e., the triangle diagram) dominates at about 2 GeV$^2$, and must 
be considered at least until $Q^2$ is about 10 GeV$^2$. The program needed to carry out a detailed 
calculation is discussed there. 

4.1.1 Pion Wave Function 

The sum rule methods that we have discussed above start with a local composite field operator 
for each hadron, the current of the hadron. For example, for a $\Gamma$ meson the current is $\eta^\Gamma(x)$ = 





$\bar q(x) \Gamma q(x)$, with $\langle 0|\eta^\Gamma|H\rangle$ given by the product of a spinor form and a structure function $\lambda_\Gamma$ 
depending on the wave function at the origin. In this form there is no direct information about the 
spatial or momentum form of the wave function. In Ref. [89] it was pointed out that by considering 
the gauge-invariant form 

    $\langle 0|\bar q(x)\, \Gamma \exp[ig \int_0^x d\sigma_\mu A^\mu(\sigma)]\, q(0)|\Gamma(p)\rangle$    (75) 

one can study the wave function. For the pion the axial operator $\Gamma = \gamma_\mu \gamma_5$ is used, and 
in Ref.[89] the $\pi^+$ was studied, so that the low-twist $\pi^+$ wave function can be extracted from the 
matrix element 


    $\langle 0|\bar d(x) \gamma_\mu \gamma_5 u(x)|\pi^+\rangle$.    (76) 

By expanding in the gauge field, $A_\mu$, expressions for the moments of the wave functions 

<r> ^ (77) 

where $\mu$ is the renormalization point, were obtained. In Ref. [90] the sum rule expressions were 
changed to make use of the nonlocal condensates. The moments that were obtained are very 
different from Ref.[89] and are similar to those obtained using the asymptotic wave function. This 
again suggests that for medium distances/medium momentum transfers nonperturbative effects must 
be considered. We return to the pion form factor and wave function in the next section. 

4.2 Light-Cone Sum Rules 

It has long been known that for the relativistic study of composite states a light-cone repre- 
sentation of the field theory (or of Hamiltonian quantum mechanics) has many advantages[91, 92]. 
A Lorentz boost can be made with kinematic operators in a light-cone representation, while in 
the standard space-time representation the interactions are involved in all boosts. The light-cone 
QCD sum rules approach is an expansion in powers of the deviation from the light cone, rather 
than the O.P.E. of the original QCD sum rules[5]. The method is reviewed in Ref.[93]. The early 
applications were in heavy-quark physics[94, 95, 96]. Here we review the applications to the pion 
form factor and pion wave function. 

4.2.1 Pion Form Factor and Light-Cone Sum Rules 


The starting point of the light-cone sum rule method for the pion elastic form factor is similar 
to what we have referred to as the pion-vacuum correlator for the pion coupling constant, Eq.(70), 
in that instead of starting with a vacuum matrix element one starts with a matrix element between 
the vacuum and one-pion state. In obtaining $g_{\pi N}$ the pion is at zero momentum[75], while for the 
form factor one must consider finite pion momentum. This method allows one to treat the form factor as a 
two-point function rather than as a three-point function. The starting point[97] for the $\pi^+$ form 
factor is 

    $T_{\mu\nu}(p,q) = i \int d^4x\, e^{iq\cdot x} \langle 0|J^5_\mu(0)\, J^{em}_\nu(x)|\pi^+\rangle$,    (78) 

where $J^5_\mu$ is the axial vector current and $J^{em}_\nu$ is the electromagnetic current. The pion 
form factor, $F_\pi(Q^2)$, is obtained from the pole in the dispersion relation for the phenomenological 
side of $T_{\mu\nu}$, 




2i 




(P - 


(79) 






Fig. 13 D.I.S. (a) high-Q quark distributions, (b) sea-quark distributions 


In Ref.[97] the form is expressed in terms of the pion wave function of Ref.[89] to leading order. The 
results depend on the model of the pion wave function, but the authors are in general agreement 
with Ref. [85] that soft physics must be considered for the pion form factor up to at least 10 GeV^. 

4.2.2 Pion Wave Function and Light-Cone Sum Rules 


The light-cone sum rule method has also been used for a study of the pion wave function[98]. 
Starting with the expression for the low-twist pion wave function, Eq.(76), as in Ref.[89], 

< 0\d{x)jfj,'ysu{x)\7r'^{p) > = iUPfj, [ + twist four+ O(x^), (80) 

Jo 

where $\varphi_\pi$ is the twist-two wave function. Going onto the light cone and treating the $\gamma_+$ component, 
the left-hand side is expressed in terms of a model light-cone Gaussian wave function, allowing one 
to extract $\varphi_\pi$ and the twist-four light-cone wave function. The results of Ref. [98] are that the wave 
function, including the twist-four part, is similar to the asymptotic wave function, which is different 
from the original conclusion of Ref. [89]. 

4.3 Deep Inelastic Scattering: Quark, Sea-Quark Distributions 


Deep inelastic scattering (DIS) on nucleons is given by the lowest-twist mechanism, the “handbag 
diagram” illustrated in Fig. 13a. Electromagnetic DIS is treated by a four-point hadronic 
tensor 

    $T_{\mu\nu}(p,q) = i \int d^4x\, d^4y\, d^4z\, e^{iq\cdot x} e^{ip\cdot(y-z)} \langle 0|T[\eta(y) J^{em}_\mu(x) J^{em}_\nu(0) \bar\eta(z)]|0\rangle$.    (81) 

Such a four-point study of DIS to obtain quark distribution functions was carried out in the QCD 
sum rule framework for lepton nucleon DIS[88] in a calculation of valence quark distributions. 

DIS can also be treated with a three-point function, which is less complicated. This was done 
in Ref.[69]. The starting point is the twist-2 spin-averaged quark distribution, $q(x,\mu^2)$[99], 

y* d(T < p|q( 0)7 • nexp[ig j dijn ■ A{ri)]q{<T)\p >^2 . (82) 

Through an OPE one can show that 

q{x,p^) = 13 I (^) < ^9 >1^^’ (®^) 

t=l ' ^ ' 


with 

    $O_i = \bar q\, \gamma\cdot n\, (iD\cdot n)^{i-1} q$,    (84) 





where $D_\mu = \partial_\mu - ig A_\mu$ is the covariant derivative. 

Using the QCD sum rule method consider the three point correlator 

T>) = -J < 0\T[nix)Oi{0)v{y)]\0 > . 


The double pole term in the phenomenological dispersion relation is 




2 P + Mn 


< 0 \>. 


(85) 


(86) 


Using the typical sum rule methods, with the light-cone integrals discussed in Ref. [69], one can 
estimate $\langle O_i \rangle$ and from this the quark distribution functions. 

It was observed in Ref. [69] that the DIS on the quark condensate itself, shown in Fig. 13b, 
gives an estimate of the sea-quark distributions. By using nonlocal quark condensates defined in 
Eq.(65) with $g(y^2)$ of monopole form 

    $g(y^2) = \frac{1}{1 + \lambda^2 y^2/8}$,    (87) 

one finds a sea-quark distribution in reasonable agreement with the empirical value[100]. Subsequently 
it was shown[68] that the monopole form does not have satisfactory analytic properties 
and a similar fit was made with a dipole form (Eq.(66)). 
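
The difference between the two nonlocality shapes can be seen numerically. The sketch below assumes the monopole form $1/(1 + \lambda^2 y^2/8)$ and the dipole form $1/(1 + \kappa^2 y^2/8)^2$, with $y^2$ Euclidean and in GeV$^{-2}$. The parameter value is purely hypothetical (it is not the fitted value of Refs. [68, 69]), and the choice $\lambda^2 = 2\kappa^2$ is made only so that the two forms share the same slope at the origin.

```python
def g_monopole(y2, lam2):
    """Monopole nonlocality, 1/(1 + lam2*y2/8); y2 in GeV^-2, lam2 in GeV^2."""
    return 1.0 / (1.0 + lam2 * y2 / 8.0)


def g_dipole(y2, kappa2):
    """Dipole nonlocality, 1/(1 + kappa2*y2/8)^2; same units as above."""
    return 1.0 / (1.0 + kappa2 * y2 / 8.0) ** 2


KAPPA2 = 0.5          # GeV^2, hypothetical illustrative value
LAM2 = 2.0 * KAPPA2   # chosen so both forms start with the same slope

for y2 in (0.0, 1.0, 10.0, 100.0):
    print(f"y^2 = {y2:6.1f} GeV^-2: monopole {g_monopole(y2, LAM2):.4f}, "
          f"dipole {g_dipole(y2, KAPPA2):.4f}")
```

Both forms are normalized to one at $y^2 = 0$ and agree to first order in $y^2$ for this parameter choice, but the dipole falls off like $1/y^4$ at large separation while the monopole falls only like $1/y^2$.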


4.3.1 Sea-Quark Anisotropy 


The NMC/CERN experiments[101] gave evidence for the violation of the Gottfried sum rule 
and the recent Fermilab/E866[102, 103] Drell-Yan measurements show that for small Bjorken x the 
ratio $\bar d(x)/\bar u(x)$, down to up sea quark distributions, is considerably larger than 1.0, while both 
perturbative[104] and nonperturbative QCD calculations [69] find $\bar d(x) \simeq \bar u(x)$. In order to solve 
this problem there have been many theoretical calculations of DIS using the concept of a meson 
cloud based on the one-pion interaction of nucleons[105] (see Ref. [106] for a review). It is obvious 
that if one only considers the $\pi^+$-neutron system as an additional component of the proton, 
there are more $\bar d$ than $\bar u$ sea quarks, but the $\pi^- \Delta$ components tend to cancel this effect. A related 
model is the chiral quark model which has been used for flavor and spin properties of nucleons[107]. 
Within the QCD sum rule method a model correlator with a meson cloud has been proposed[108] 
and used for the study of nucleon magnetic dipole moments, but has not yet been applied to DIS. 
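
The link between the Gottfried sum and the sea-quark asymmetry can be made concrete with a short calculation. The sketch assumes the standard parton-model relation $S_G = 1/3 - (2/3)\int_0^1 dx\,[\bar d(x) - \bar u(x)]$, which holds for isospin-symmetric nucleons. The measured value $S_G = 0.235 \pm 0.026$ is the commonly quoted NMC number and is an input assumed here, not a value given in this text.

```python
def dbar_minus_ubar_integral(gottfried_sum):
    """Integrated sea asymmetry implied by a measured Gottfried sum.

    Inverts S_G = 1/3 - (2/3) * Int[dbar - ubar] (isospin symmetry assumed).
    """
    return 1.5 * (1.0 / 3.0 - gottfried_sum)


S_G_MEASURED = 0.235   # assumed input (commonly quoted NMC value)
S_G_ERROR = 0.026      # assumed uncertainty on that value

asymmetry = dbar_minus_ubar_integral(S_G_MEASURED)
asymmetry_error = 1.5 * S_G_ERROR   # linear error propagation

print(f"Int[dbar - ubar] dx = {asymmetry:.3f} +/- {asymmetry_error:.3f}")
```

A flavor-symmetric sea would give $S_G = 1/3$ and a vanishing integral, so any measured value below 1/3 translates directly into an excess of $\bar d$ over $\bar u$.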


5 Dyson-Schwinger, B-S Formalism and Sum Rules 


The Dyson-Schwinger [D-S] equation for a fermion propagator [109] is an integral equation for the propagator 
in terms of the field with which the fermion interacts and the fermion-field vertex, all of 
which are dressed. For QCD, if we write the dressed (perturbative plus nonperturbative) inverse 
quark propagator as 

    $S_q(p)^{-1} = i\not{p} + m_q + \Sigma(p)$,    (88) 

then the D-S equation is 

    $\Sigma(p) = \int \frac{d^4q}{(2\pi)^4}\, g_s^2\, D^{ab}_{\mu\nu}(p-q)\, \gamma_\mu \frac{\lambda^a}{2}\, S_q(q)\, \Gamma^b_\nu(q,p)$,    (89) 

with $D^{ab}_{\mu\nu}$ the dressed gluon propagator, $\lambda^a$ the color SU(3) matrix and $\Gamma^b_\nu$ the dressed gluon-quark 
vertex. D-S really consists of a coupled set of complicated integral equations involving the 




quark and gluon propagators and vertex. This formalism has been widely used for the study of 
nonperturbative QCD in recent years[110, 111, 112]. The calculations have used models for $D^{ab}_{\mu\nu}$. 
This enables one to use known properties of the quark propagator, such as the condensates and 
the space-time structure of the nonlocal condensate, to constrain the gluon propagator. For the 
study of the condensates the “rainbow” approximation, with $\Gamma^b_\nu(q,p) \to \gamma_\nu \lambda^b/2$, is often used. Working in a 
Feynman-like gauge, the gluon propagator has the form 

    $D^{ab}_{\mu\nu}(q) = \delta^{ab}\, \delta_{\mu\nu}\, D(q)$,    (90) 

with D(q) a function chosen to give both long-distance confinement and ultra-violet behavior 
of the gluon propagator. An example is given below in the subsection on D-S and instanton quark 
propagators. Rewriting Eq.(88) as 

    $S_q(p)^{-1} = i\not{p}\, A(p^2) + B(p^2)$,    (91) 


the D-S equation can be written as a coupled set of integral equations 

[A{p^)-l\p'^ = -qI j (^-^^ “ ?M 2 ( 92 ) VB2(g2)^ ■ ^ 

B ( p ^) = -jgl J - q ) ^2^2 (^2) + p2 (^2) • ( 92 ) 
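
The step from the single matrix equation to the coupled pair can be made explicit. The lines below are only a sketch, assuming Euclidean conventions with Hermitian gamma matrices ($\not{p}\not{p} = p^2$) and a Dirac trace normalized to $\mathrm{Tr}\,1 = 4$. They identify the two scalar components of the self-energy, writing the inverse propagator as $i\not{p} + m_q + \Sigma(p)$ and as $i\not{p} A(p^2) + B(p^2)$, and do not fix the kernels on the right-hand sides of Eq.(92).

```latex
\begin{align}
\Sigma(p) &= S_q(p)^{-1} - i\not{p} - m_q
           = i\not{p}\,\bigl[A(p^2) - 1\bigr] + \bigl[B(p^2) - m_q\bigr], \\
\tfrac{1}{4}\,\mathrm{Tr}\bigl[-i\not{p}\,\Sigma(p)\bigr] &= \bigl[A(p^2) - 1\bigr]\,p^2, \\
\tfrac{1}{4}\,\mathrm{Tr}\bigl[\Sigma(p)\bigr] &= B(p^2) - m_q .
\end{align}
```

Applying the same two projections to the integral expression for $\Sigma(p)$ gives one equation for $[A(p^2) - 1]p^2$ and one for $B(p^2)$; in the chiral limit $m_q = 0$ the second reduces to an equation for $B(p^2)$ alone, which matches the left-hand sides appearing in Eq.(92).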

From the solutions for $A(q^2)$ and $B(q^2)$ one can obtain the quark condensate[113], the mixed 
condensate[114] and the quark condensate nonlocality [115] defined in Eq.(65): 


<: ?(0)g(0) ;> = 

<0|:g(0)5(T-G(0)g(0):|0> = 


5(y^) = 


^ 

47r2 

_9_ 

47r2 


J dss 

J dss[s 


Bis) 


sA^s) + B^is) 
B(s)(2-A(s)) 


sA^(s) + BHs) 
81B(s)[2sA(s)(A(s) - 1) + ^^(s)] 




16(5^2 (s) + 52 (s)) 

B(s) 


dss 


sA2(s) + B2(s) 


Jl(\/si2) 


\/sx^ 


(93) 


With suitable choices of the function D(q^) one can fit the condensates and the nonlocal condensate. 
This provides important constraints on the nonperturbative gluon propagator. 

From Eqs. (92,93) one can see how the results of QCD sum rules can provide information for 
detailed model calculations of hadronic structure. The fit to the condensate information constrains 
the nonperturbative part of the gluon propagator, enabling one to carry out improved Bethe- 
Salpeter models of the hadrons. From this one can attempt much more detailed predictions of 
hadronic form factors, transition amplitudes, and so forth, than can be done with the QCD sum 
rules themselves. We anticipate that the productive interplay between the QCD sum rule and 
Bethe-Salpeter/Dyson-Schwinger studies will enable theorists to learn more about QCD from 
medium-energy experiments. 


6 Instantons and QCD Sum Rules 

A theory of the nonperturbative quark propagator is obtained by using the instanton solutions to 
the gluon field $A^a_\mu$ and the solutions for a quark propagating in such an instanton medium[116]. 
In Refs. [5, 6] the instanton solutions for the gluon field were discussed in relation to the gluon 
condensate of the QCD sum rules; and the use of the quark propagator in the instanton field 
in the application of the QCD sum rule method for hadronic properties was considered in the early 
days of the method [117, 118]. In these references very detailed discussions of the use of 't Hooft's 







(a) quark/instanton correlator (b) correlations 

Fig. 14 Meson correlator with quark propagator in instanton medium 


zero mode solutions were shown to be inconsistent with the known applicability of a dilute instanton 
gas model, and it was concluded that interactions modifying the instantons must be included for 
meaningful hadronic physics. In Ref.[117] the idea of an instanton liquid model was discussed, but 
as pointed out there, even in classical physics liquids are very complicated. Consequently, except 
for the possibility of large numerical lattice calculations, one must rely on models for attempting 
to include instanton effects into realistic hadronic and nuclear calculations. We discuss some work 
on microscopic QCD models below. 

If one includes only the zero-mode contributions, the space-time form of quark propagator in 
the instanton medium is[116, 117] 


Sq{x,y) 

'ipoix) 


^PT I 


7rx{x^ 


px 

-l-p2)~3/2 


(l-Tsk, 


(94) 


where $\rho$ is the instanton size, the zero-mode spinor is normalized to unity, and the instanton is assumed to be at the origin. The effective 
mass, $m_{eff}$, corresponds to the constituent quark mass of quark models. With this picture the 
QCD sum rule processes for a meson correlator would correspond to the diagrams shown in Fig. 
14. Comparing Figs. 4 and 14, Fig. 14a is the correlator corresponding to Fig. 4a with the 
perturbative quark propagator replaced by the quark propagator in the instanton medium. Fig. 
14b corresponds to correlations. If instantons provide all of the nonperturbative QCD effects the 
gluon condensate processes, Fig. 4b, would be included in the processes of Fig. 14. We shall return 
to this below. 

A self-consistent model of the quark propagator including modifications of the instanton medium 
by the light quarks has been developed[119], and consistency with the known value of the quark 
condensate, $\langle \bar q q \rangle$, and the constituent quark mass (about 345 MeV, consistent with quark models) 
is obtained. Meson correlators were treated using this model[120] and included correlations 
shown schematically in Fig. 14b. An improved model for the quark propagator in the instanton 
vacuum that goes beyond the approximation of Eq.(94) was developed[121] and an explicit form 
for $S_q(p)$ was found. With the notation used above, $S_q(p)^{-1} = i\not{p}\, A(p^2) + B(p^2)$, the instanton 
solution obtained for the inverse quark propagator is 


A/(p)-L0 = 0.0 

B,(p) = Kp-^fi^p) 

m = |-(3Jo(p)+/2(p))xi^i(p), 


(95) 


where K — 0.29 GeV ^ , p is in units of GeV and li, Ki are standard Bessel-type functions. This 
model predicts that the effective constituent quark mass at low momentum is $B_I(0) \simeq 420$ MeV with 
a $1/p^6$ falloff. We return to this in a test of self-consistency using the D-S formalism. 





6.1 Instanton Quark Propagator and QCD Sum Rules 


Recently models of quark propagators in the instanton medium have been used in QCD sum 
rules. In a study of the nucleon correlator using the standard current of Eq.(32) but with a quark 
propagator of the form of Eq.(94) with $m_{eff}$ taken as 


^e// 


^ 0 o 


(96) 


the two QCD sum rules for the coefficients of $\not{p}$ and $1$, Eqs.(33,34), were derived[122]. Although 
the instanton contributions to the sum rules were much smaller than the OPE contributions, the 
authors found a marked improvement in the stability of the sum rules. This is quite important 
as only with stable sum rules can one be assured of a good solution. In a related calculation[123] 
of the magnetic dipole moments of nucleons it was shown that the stability of the sum rules was 
improved, and that a third sum rule considered unreliable in the original effective field calculation 
of the moments[16] became stable. 


6.2 D-S Study of Instanton Quark Propagator 

As discussed in the previous section, the D-S formalism has been used to provide useful constraints 
on the nonperturbative gluon propagator from known properties of the quark condensates: the 
values of local and nonlocal quark condensates. Here we turn this around. By this formalism, 
given a form of the $D(q^2)$ function, Eq.(90), with the confining long-distance as well as ultra-violet 
short-distance form of the gluon propagator, and the gluon-quark vertex, $\Gamma^b_\nu(q,p)$, one can test the 
self-consistency of a model dressed quark propagator. The form of the D(q) function used in the 
D-S calculations[113, 114, 115] is 


9s‘^D{s) 


2 ^^ -4. 47r2d 

STT^—re ^ +Cu ■ — ; — r, 

-1- e) 


(97) 


with the parameter $C_u$ = (1.0, 0.0) to (include, neglect) the perturbative ultra-violet behavior. The 
strength parameter $\chi$ and range parameter $\Delta$ are determined by solving the coupled Dyson-Schwinger 
equations, Eq.(92), and fitting $f_\pi$, the pion decay constant, and the quark condensate 
through Eq.(93). For the Feynman gauge with $C_u$ = 0.0 these parameters are [115] $\chi$ = 1.4 GeV 
and $\Delta$ = 2.0 X lO"^ GeV^. 

In a recent study using this D-S formalism[124] the instanton quark propagator derived in 
Ref.[121] and given in Eq.(95) was tested for self-consistency using forms for D(q) that fit the 
condensates, such as the one just mentioned. The constant K in Eq.(95) was modified so that 
$B_I(0) \simeq 313$ MeV, consistent with constituent quark models. The results of this calculation are 
that although the quark condensate can sometimes be self-consistently reproduced, the mixed 
condensate is in serious error. The mixed condensate has been estimated[125] from the solution of 
Ref.[121] and found to be about a factor of two too large. In Ref.[124] the mixed condensate was 
found to be a factor of five too large. 

These results suggest that the present models for the quark propagator in the instanton medium 
do not include all of the nonperturbative QCD needed for hadronic properties. It is known that 
present instanton models are not confining. This suggests that the current instanton quark propagator 
models can account for the short-distance perturbative and the mid-range nonperturbative 
QCD physics, but for the long-range confining nonperturbative effects gluonic condensates must 
still be included. This is consistent with the original observations [6] of the difficulty of including 
instanton effects in microscopic QCD calculations of hadronic properties. 





7 Hadrons in Nuclear Matter 


A main motivation for the study of hadrons in nuclear matter is the ($\sigma$, $\omega$) model[126, 127], 
which has been widely used in nuclear physics. It has long been known from the two-nucleon 
data that the N-N force has a short-range repulsion and a mid-range attraction (in addition to 
the long-range pion exchange force). The ($\sigma$, $\omega$) model extends this general idea to complex nuclei, 
with the effective interaction picture of the $\sigma$, a scalar meson, giving attraction and the $\omega$ meson, 
a chargeless vector meson, being mainly repulsive. By picking parameters one can fit the binding 
energy of nuclear matter. In a mean field approximation the nucleon is treated as a Dirac particle 
in an effective potential, and the effective mass of the nucleon in nuclear matter, $M^*$, is found to be 
reduced by 20-40% in various models. 

The QCD sum rule method is a natural formalism for attempting to understand the physics of 
the effective nucleon mass in terms of QCD. Among other things there is a possibility of addressing 
the partial restoration of chiral symmetry at finite nuclear density, and the introduction of the 
concept of vacuum condensates at finite density is a natural approach, since the quark condensate 
is the parameter for chiral symmetry breaking. The pioneering work in using QCD sum rules at 
finite density[128] discussed the main ideas as well as the complications involved in such a program. 
Since then there has been a great deal of work by a number of authors on baryons and mesons in 
nuclear matter at finite density. 

7.1 Nucleons in Nuclear Matter 

QCD sum rule treatments of a proton in nuclear matter have started with the nucleon 
current $\eta$ of Eq.(32), and a correlator defined with the vacuum state of Eq.(13) replaced by the 
nuclear state, $|A\rangle$, 

$$\Pi(q,P) = i\int d^4x\, e^{iq\cdot x}\,\langle A|T[\eta(x)\bar\eta(0)]|A\rangle, \qquad (98)$$

with P the momentum of the entire nucleus. In addition to the work of Ref.[128], extensive studies 
have been carried out by the authors of Refs.[129] and [130]. For the vacuum correlator of the 
nucleon recall that there are two covariants, $\not{q}$ and 1, and therefore two sum rules. For the nucleon 
in the nucleus there is a third covariant, $\not{P}$, so that the in-medium correlator can be expressed as 

$$\Pi(q,P) = \Pi^1(q,\nu)\,\not{q} + \Pi^2(q,\nu) + \Pi^3(q,\nu)\,\not{P}, \qquad (99)$$

and therefore there are three sum rules, which is of great help in finding solutions. It is also 
important to note that the in-medium correlator can be written as 

$$\Pi(q,P) = \Pi(q,P)^{2q} + \Pi(q,P)^{4q} + \Pi(q,P)^{6q}, \qquad (100)$$

with the superscripts representing two-quark, four-quark and six-quark matrix elements. For 
example, 

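$$\Pi(q,P)^{2q} = 2\,\epsilon^{abc}\epsilon^{a'b'c'}\,\langle A|\gamma^\mu\gamma_5 S_d^{cc'}(x)\gamma_5\gamma^\nu\,{\rm Tr}[\gamma_\mu S_u^{aa'}(x)\gamma_\nu C S_u^{bb'T}(x) C]|A\rangle. \qquad (101)$$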

Fig. 15 Nucleon correlator in nuclear medium 

The new dynamics involved with the microscopic QCD calculation of the correlator is that the 
quarks and glue propagate in nuclear matter, so that one can think of these particles in effective 
optical potentials as depicted in Fig. 15. All of the processes shown in Fig. 15 except Fig. 15h are 
two-quark terms, needed for the evaluation of $\Pi(q,P)^{2q}$. The ellipses indicate the new physics 
involved with propagation in the medium. Fig. 15b has a perturbative quark in the medium, Fig. 
15c has an in-medium quark condensate, Fig. 15e has an in-medium gluon condensate, and Figs. 
15f,g have in-medium mixed condensate processes. The in-medium condensates are evaluated to 
first order in the nuclear density, $\rho$, and processes of higher order in $\rho$ are not included, recognizing 
that the nuclear medium is not very dense. The in-medium two-quark condensate has the form 

= -^[<qq>p + <qPq>pP/P^]. ( 102 ) 

The important observation was made in the original work[128] that an estimate of the in-medium 
scalar density had been made to first order in $\rho$ using PCAC[131], 

$$\langle\bar qq\rangle_\rho = \langle\bar qq\rangle + \frac{\sigma_N\,\rho}{2m_q}, \qquad (103)$$

where $\sigma_N$ is the pion-nucleon sigma term. All of the calculations have used this result. 
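
To get a feeling for the size of the effect in Eq.(103), the following minimal Python sketch evaluates the ratio of the in-medium to the vacuum quark condensate. It is an illustration only, not a calculation taken from the references. It assumes the Gell-Mann-Oakes-Renner relation[4] in the form $2m_q\langle\bar qq\rangle = -m_\pi^2 f_\pi^2$, so that the ratio becomes $1 - \sigma_N\rho/(m_\pi^2 f_\pi^2)$. The numerical inputs (a sigma term of 45 MeV, $m_\pi$ = 138 MeV, $f_\pi$ = 93 MeV and a nuclear matter density of 0.16 fm$^{-3}$) and the function name are assumptions chosen for illustration.

```python
# Linear-density estimate of the in-medium quark condensate, Eq.(103),
# combined with the Gell-Mann-Oakes-Renner relation (an assumption here):
#   2 m_q <qq> = - m_pi^2 f_pi^2
#   => <qq>_rho / <qq> = 1 - sigma_N * rho / (m_pi^2 * f_pi^2)

HBARC = 197.327  # MeV fm, converts fm^-3 to MeV^3


def condensate_ratio(rho_fm3, sigma_n=45.0, m_pi=138.0, f_pi=93.0):
    """Return <qq>_rho / <qq> to first order in the density.

    rho_fm3 : nucleon density in fm^-3
    sigma_n : pion-nucleon sigma term in MeV (illustrative value)
    m_pi    : pion mass in MeV (illustrative value)
    f_pi    : pion decay constant in MeV (illustrative value)
    """
    rho_mev3 = rho_fm3 * HBARC**3  # density in natural units, MeV^3
    return 1.0 - sigma_n * rho_mev3 / (m_pi**2 * f_pi**2)


RHO_0 = 0.16  # fm^-3, assumed nuclear matter density
ratio_at_saturation = condensate_ratio(RHO_0)
ratio_at_half_density = condensate_ratio(0.5 * RHO_0)

print(f"ratio at rho_0   : {ratio_at_saturation:.3f}")
print(f"ratio at rho_0/2 : {ratio_at_half_density:.3f}")
```

With these inputs the condensate is reduced by roughly one third at nuclear matter density. Because the estimate is linear in $\rho$, it should only be trusted at low density, in line with the first-order treatment described above.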

The major difficulty in carrying out this important program of research is the large role played by 
four-quark condensates. A typical process is illustrated in Fig. 15h. A typical four-quark condensate 
term is 

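$$\Pi(q,P)^{4q} = 2\,\epsilon^{abc}\epsilon^{a'b'c'}\,\langle A|\gamma^\mu\gamma_5 S_d^{cc'}(x)\gamma_5\gamma^\nu\,{\rm Tr}[\gamma_\mu u^{a}(x)\bar u^{a'}(0)\gamma_\nu C \bar u^{b'T}(0) u^{bT}(x) C]|A\rangle. \qquad (104)$$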

Generally some form of factorization is used to treat these in-medium matrix elements of four 
quark fields with various Dirac and SU(3) operators. In Ref.[128] and also in Refs.[129, 130] these 
difficulties have been discussed in some detail, but the solution has not been found. To illustrate 
how serious the problem is, one can find sum rule solutions for the mass of the nucleon increasing 
with density at low density or dropping to the values found in Walecka models depending on the 
model assumptions. For this reason we do not show results of calculations in this review. Some 
constraints on the four-quark condensates were found in a study of the $\Delta(1232)$ in the medium[20], 
which will be briefly discussed in the next subsection. 

7.2 Other Baryons in Nuclear Matter 

The methods used for the study of the in-medium nucleon in nuclear matter[129] were applied 
to the treatment of the $\Lambda(1115)$. The current can be obtained from that of the nucleon, 
Eq.(32), by a transformation[7]. One main difference from the case of the nucleon is that the 
strange quark condensate in the medium must be used, which requires a modification of Eq.(103). 
The authors conclude once more that the method does not allow one to make predictions without 
solving the problem of the four-quark condensates, and once more different factorization schemes 
give very different results. One very interesting problem in $\Lambda$ hypernuclear physics is the value of 
the spin-orbit interaction, and the authors point out that the QCD sum rule method could provide 
information. 

The $\Delta(1232)$ in nuclear matter has also been studied[132, 20]. In Ref.[132] it was once more 
concluded that one could not make predictions without new knowledge of the many four-quark 
condensates that are crucial for the in-medium QCD sum rule method. Ref.[20] used quite a 
different approach: to attempt to put constraints on the in-medium four-quark condensates from 
experimental/theoretical knowledge of the mass and width of the delta in the nucleus. From 
the isobar doorway model fits to pion-nucleus elastic scattering and other reactions[133] one has 
learned that in finite nuclei the effective $\Delta$ mass is almost the same as the free $\Delta$ mass, 
with the best fit being about $M_\Delta + 10$ MeV, while the width is broadened by about 10%. 
One must be careful in using these observations in that the $\Delta$ tends to be formed in the nuclear 
surface at perhaps 1/2 nuclear density. Using the $\Delta$ current given in Eq.(35), and defining 
the correlator in analogy with Eq.(98), one can obtain three useful sum rules from the 
correlators 


= rr[n^>:/]/4, (105) 

and two others defined in Ref.[20]. The sum rule derived from the correlator of Eq.(105), keeping 
processes as shown in Fig. 15 up to dimension seven, is 

(106) 

with Up = ~( 27 r)^ < qq >p,bp =< >p, a four quark condensate defined in Ref[20], and 

< >= with Vp a Rarita-Schwinger spinor. The other two sum rules are 

similar, but involve other in-medium condensates. 

With these three sum rules it was possible to constrain some of the four-quark condensates. It 
was observed that some of the suggested factorizations were ruled out, and that with the constraint 
on one of the four-quark condensates found from the study of the in-medium $\Delta$ the results for the 
in-medium nucleon would be significantly modified. The subject of baryons in nuclear matter is 
an important and interesting one, but is still an unsolved problem. In our section on mesons at 
finite T we return to the analysis of four-quark condensates. 


7.3 Mesons in Nuclear Matter 


Most of the theoretical and experimental research on mesons in nuclear matter, a very old 
subject in nuclear/particle physics, has been on vector mesons. Since the vector meson can couple 
directly to a photon, the decay of the in-medium $\rho$ meson into lepton pairs provides a beautiful 
experimental test. There is a very large literature on the properties of the $\rho$ meson in nuclei. 
Recently QCD sum rule calculations have been carried out[134, 135]. The factorization 
of the four-quark condensates is still a major problem for this method. A recent review of light 
vector mesons in nuclear matter discusses the sum rule work as well as some of the other work in 
this area[136]. 

In the next section we take up the important area of hadrons in nuclei at finite temperature, 
and the recent work on the problem of treating four-quark condensates. 


8 Mesons in Finite Temperature Matter 


The main motivation for studying hadrons in nuclear matter is the possibility that in 
accelerator experiments it might be possible to create the matter of the era of the early universe in 
which quarks and gluons are not confined, the quark-gluon plasma. There is evidence from experiments 
at the AGS at BNL and the SPS at CERN that a hot, dense fireball is formed in relativistic 
heavy ion collisions[137], and signals that this is deconfined quark-gluon matter have been widely 
discussed[137]. In lattice gauge calculations (see, e.g., Ref.[138]) a phase transition from hadronic 
matter to deconfined matter occurs at a critical temperature of $T_c \simeq 200$ MeV. See Ref.[118] 
for a review of this possible deconfining/chiral symmetry restoring phase transition. 







Fig. 16 Spectral functions at finite T for (a) $T < T_c$ and (b) $T > T_c$ 


As with the theory of mesons in nuclear matter, most of the research has been devoted to the 
study of vector mesons, since their decay into photons or lepton pairs makes their experimental 
study most attractive. As we shall see, from the point of view of QCD sum rule theory, all of 
the mesons must be studied simultaneously in coupled equations. As has been emphasized in this 
review, one of the crucial aspects of the successful treatment of a correlator to extract information 
about the lowest meson (or baryon) state with a given set of quantum numbers by the QCD sum 
rule method is a satisfactory treatment of the continuum, the higher states with those quantum 
numbers. This is illustrated in Fig. 1, in which there is a clear separation in the spectral function 
between the lowest mass and the higher resonances and states, the continuum. The situation at 
finite T is illustrated in Fig. 16. For $T < T_c$, as T approaches $T_c$ the resonances become 
broader and the continuum spectral density drops toward the hadron mass being investigated. 
Therefore the criteria for stable solutions of the QCD sum rule equations might not be satisfied. 
For $T > T_c$ the $(q,\bar q)$ states reach down to s=0, giving a spectral density illustrated in Fig. 16b. 
In that illustration it is assumed that mesons are beginning to form as T begins to approach $T_c$ 
from the high-T region, as found in parton cascade models[139]. One can see that it is difficult 
to treat the region of temperature $T \simeq T_c$ by the sum rule methods. 

The pioneering work on using the QCD sum rule method[140] centered on the low-T region. 
The starting point is the thermal-averaged correlator, $\Pi^\Gamma(p,T)$, which for a meson with a current 
$J^\Gamma(x) = \bar q(x)\Gamma q(x)$ is 

$$\Pi^\Gamma(p,T) = i\int d^4x\, e^{ip\cdot x}\,\theta(x_0)\,\langle\langle 0|\bar q(x)\Gamma q(x)\,\bar q(0)\Gamma q(0)|0\rangle\rangle, \qquad (107)$$

where $p = (\omega, \mathbf{p})$ and where $\langle\langle\ldots\rangle\rangle$ stands for the Gibbs average. For finite-T mesons the 
microscopic formulation differs from the T=0 sum rule formulation in two ways. First, one replaces the 
perturbative quark loop process, shown in Fig. 4a, by a finite-T quark loop using the Matsubara 
formalism, in which for thermal equilibrium there is no time evolution and time $\to -i/T$. In 
Feynman diagrams the momentum $p \to (i\omega_n, \mathbf{p})$, where 

$$\omega_n = 2\pi T n \quad \text{bosons (gluons), and}$$

$$\omega_n = 2\pi T\left(n+\tfrac{1}{2}\right) \quad \text{fermions (quarks)}, \qquad (108)$$

and n runs over positive and negative integers (including zero). For example, for the scalar case 
one finds the replacement for the first term in Eq.(27) 


Q^niQ^) (109) 
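
The discrete frequencies of Eq.(108) are easy to tabulate. The minimal Python sketch below lists the lowest bosonic and fermionic Matsubara frequencies at a given temperature. The function name and the temperature of 150 MeV are illustrative assumptions and are not taken from the references.

```python
import math


def matsubara_frequency(n, temperature, fermion):
    """Matsubara frequency for integer n at the given temperature.

    Bosons (gluons):   omega_n = 2 pi T n
    Fermions (quarks): omega_n = 2 pi T (n + 1/2)
    The result has the same units as the temperature.
    """
    offset = 0.5 if fermion else 0.0
    return 2.0 * math.pi * temperature * (n + offset)


T = 150.0  # MeV, an illustrative temperature below T_c
indices = range(-2, 3)  # n = -2, -1, 0, 1, 2

bosonic = [matsubara_frequency(n, T, fermion=False) for n in indices]
fermionic = [matsubara_frequency(n, T, fermion=True) for n in indices]

for n, w_b, w_f in zip(indices, bosonic, fermionic):
    print(f"n = {n:+d}  boson: {w_b:9.2f} MeV  fermion: {w_f:9.2f} MeV")
```

The bosonic set contains a zero mode at n = 0, while the smallest fermionic frequency has magnitude $\pi T$, so quark modes are always separated from zero at finite temperature.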

Second, the condensates are evaluated in the medium. The gluon condensate does not seem to 
have a significant T-dependence. A calculation of the perturbative gluon loop[141] gives 

For Tc 200 MeV one easily sees that > /lO. Therefore for the region 

$T < T_c$ the gluon condensate can be considered to be almost constant. This of course implies that 
the gluon condensate does not vanish as T becomes greater than $T_c$, which is most interesting. 

Therefore the most significant T-dependent nonperturbative QCD effects for mesons are the 
four-quark condensates: 

$$\langle Q^\Gamma\rangle_T = \langle\langle 0|:\bar q\,\Gamma t^a q\;\bar q\,\Gamma t^a q:|0\rangle\rangle_T. \qquad (110)$$


Bochkarev and Shaposhnikov used a model for the temperature dependence of the pertinent 
four-quark condensate, thus avoiding the factorization approximation, Eq.(30), according to which 
each four-quark condensate becomes trivially proportional to the square of the familiar quark 
condensate, assuming saturation by the vacuum. A number of other authors have also calculated 
the mass of the rho meson using QCD sum rules at finite T[142, 143], generally using factorization 
of the four-quark condensates. With the factorization the meson sum rule equations separate, 
which enables these authors to calculate the $\rho$ mass without considering the other mesons. The 
results depend on assumptions about the continuum distribution and the T-dependence of the 
quark condensate, but with reasonable assumptions the $\rho$ mass is found to drop with increasing T. 

In recent work[145] a new formulation of QCD sum rules for mesons was developed by making 
use of the striking similarity of the correlator, $\Pi^\Gamma(p,T)$, and the four-quark condensates. Note 
that the spectral function of the correlator, Eq.(107), involves a retarded propagator, so that the 
relationship to the four-quark condensates, Eq.(110), requires a model of the meson 
intermediate states, as in Ref.[140]. Details are given in Ref.[145]. In this formulation one has sum rule 
expressions for both the meson masses and the four-quark condensates, which take the form 

mliT) = (T). M^, Q (T)), (111) 

for the scalar, vector and axial vector meson masses, and 

((Oljr7>'|0))(T) = iri’(T,m(r),5o(r),MS), (112) 

for the five four-quark condensates, with $M_B$ the Borel mass for each meson. In these equations 
there are five threshold functions, $s_0(T)$, which must be modeled, and $Q^\Gamma(T)$ are the four-quark 
finite-T condensates. 

This formalism results in eight coupled equations. The models assumed for the threshold 
functions, $s_0(T)$, are described in Ref.[145]. Typical results for the meson masses are shown in 
Fig. 17. The most striking result for the masses is that the temperature dependences of the different 
mesons are quite different. Although the results depend on the spectral assumptions, the tensor 
and axial tensor mesons are found to be almost unstable as T approaches $T_c$, and this could provide 
an important experimental signal. The results for the four-quark condensates are shown in Fig. 18. 
The functions $R^\Gamma(T)$ in the figure are defined by 

$$\langle\langle 0|J^\Gamma J^\Gamma|0\rangle\rangle = R^\Gamma(T)\,\langle 0|J^\Gamma J^\Gamma|0\rangle, \qquad (113)$$






Fig. 17 Meson masses as a function of T 


where $\langle 0|J^\Gamma J^\Gamma|0\rangle$ are the four-quark condensates at T=0. This figure, giving the ratio of $R^\Gamma(T)$ 
for all the mesons relative to the scalar case, demonstrates the serious violation of factorization 
with increasing T. Vacuum saturation, the basis for factorization, implies that $R^\Gamma(T)$ is the same 
for all mesons. We see that even at a modest temperature of $T \simeq T_c/2$ there is a considerable 
violation of factorization. One should note that even at T=0 there is evidence that the factorization 
approximation of the four-quark condensates is not accurate. In a study of the ratio of isovector 
hadronic to muon pair production in $e^+e^-$ data, Ref.[144] estimated a value of the vector four-quark 
condensate that is larger than the factorized value. Using this value in Ref.[145] a satisfactory fit to the 
rho-meson mass was found. The QCD sum rule calculations that have been carried out for meson 
masses at finite T cannot be considered to be QCD predictions. They must make use of models to 
describe the physics as T approaches $T_c$. One hopes that with relativistic heavy ion experiments 
giving guidance the models will be improved and that in the future it might be possible to learn 
more about the chiral phase transition and QCD in the early universe. 


9 The Pomeron 


It has long been known that high energy elastic scattering and diffractive processes are 
dominated by the Pomeron trajectory. Since mesons are not on the Pomeron trajectory, the Pomeron 
must be a gluonic system. As we have seen in earlier sections, the QCD sum rule method is a 
valuable tool for studying gluonic hadrons, and I believe that the method will be very useful in 
understanding the structure of the Pomeron and in making predictions for testing its nature. 
Although there is some recent work in this area, it is premature to include it in a review. It will 
probably be covered in future reviews of this area. 








Fig. 18 Ratios of four-quark condensates as a function of T 


10 Outlook 


The method of QCD sum rules will continue to be a valuable tool for the treatment of 
nonperturbative QCD in hadron spectroscopy, particularly for gluonic hadrons. There has been 
a long history of the study of glueballs and hybrids using the sum rule method. There is still 
more work to be done in this area. The mixing of mesons with glueballs and the nature of the 
possible light scalar glueball/sigma must be tested by experiments, and future experiments will 
give guidance for further theoretical work. A most important and interesting study will be that of 
the Pomeron. We expect that the use of instanton solutions for the quark propagators will be 
a valuable method for incorporating some of the developments in lattice gauge calculations in the 
treatment of nonperturbative QCD. 

In all of these studies the most urgent research needed is for theory to provide reliable signals for 
gluonic systems, which requires more complicated correlators than the two-point correlators used 
for treating masses. There has been a great deal of progress in developing methods for treating 
coupling constants and other three-point functions within the QCD sum rule method. The use of 
nonlocal condensates and the light-cone sum rule method are promising for theoretical studies 
of form factors and vertex functions over a wide range of momentum transfers. One promising 
area is in the interplay between QCD and electroweak theory, and the sum rule method could be 
particularly valuable in treating decays of heavy-light hadronic systems. We expect important 
developments in the near future in these areas. 

Studies of hadrons in nuclear systems will be of increasing importance in the future. The 
QCD sum rule method is most promising for several areas of nuclear astrophysics and cosmology. We 
anticipate a great deal of research on hadrons at finite temperature, with a valuable interplay 
between experiment and theory for the study of matter near the temperature of the chiral phase 
transition and of the nature of the quark-gluon plasma. 






11 Acknowledgements 

The author thanks Mikkel Johnson for many helpful discussions, and would like to acknowledge 
the hospitality of LANL during the time that much of this review was being prepared. The work 
was supported in part by the National Science Foundation grant PHY-9722143 and in part by the 
U.S. Department of Energy. 

References 

[1] R. Kronig, J. Opt. Soc. Am. 12, 547 (1926); Physica 12, 543 (1946). 

[2] H.A. Kramers, Atti con. int. fisici, Como, 2, 545 (1927). 

[3] M. Gell-Mann, M.L. Goldberger and W.E. Thirring, Phys. Rev. 95, 1612 (1954). 

[4] M. Gell-Mann, R.J. Oakes and B. Renner, Phys. Rev. 175, 2195 (1968). 

[5] M.A. Shifman, A.I. Vainshtein and V.I. Zakharov, Nucl. Phys. B147, 385, 448 (1979). 

[6] V. Novikov, M. Shifman, A. Vainshtein, and V. Zakharov, Nucl. Phys. B191, 301 (1981). 

[7] L.J. Reinders, H.R. Rubinstein and S. Yazaki, Phys. Rep. 127, 1 (1985). 

[8] T.M. Aliev and M.A. Shifman, Phys. Lett. B112, 401 (1982). 

[9] D.B. Leinweber, Annals Phys. 254, 328 (1997), contains references to recent work. 

[10] E.V. Shuryak, Nucl. Phys. B198, 83 (1982) 

[11] T.M. Aliev and V.L. Eletskii, Sov. J. Nucl. Phys. 38, 936 (1983). 

[12] N. Isgur and M.B. Wise, Phys. Lett. B232, 113 (1989). 

[13] M. Neubert, Phys. Rev. D46, 1076 (1992). 

[14] E. Bagan, P. Ball, V.M. Braun and H.G. Dosch, Phys. Lett. B278, 457 (1992). 

[15] L.S. Kisslinger and Z. Li, Nucl. Phys. A570, 167c (1994). 

[16] B.L. Ioffe, Nucl. Phys. B188, 317; B191, 591(E) (1981). 

[17] V.M. Belyaev and B.L. Ioffe, Sov. Phys. JETP 56, 493 (1982). 

[18] D.B. Leinweber, Annals Phys. 198, 203 (1990). 

[19] B.L. Ioffe, Z. Phys. C18, 67 (1983). 

[20] M.B. Johnson and L.S. Kisslinger, Phys. Rev. C52, 1022 (1995). 

[21] D. Jido, N. Kodama and M. Oka, Phys. Rev. D54, 4532 (1996). 

[22] D. Jido, M. Oka, and A. Hosaka, Nucl. Phys. A629, 156c (1998). 

[23] V.A. Novikov, M.A. Shifman, A.I. Vainshtein and V.I. Zakharov, Nucl. Phys. B165, 67 
(1980). 

[24] P. Pascual and R. Tarrach, Phys. Lett. B113, 495 (1982). 

[25] C.A. Dominguez and N. Paver, Zeit. Phys. C 31, 591 (1986). 





[26] J. Bordes, V. Gimenez and J.A. Penarrocha, Phys. Lett. B223, 251 (1989). 

[27] E. Bagan and T.G. Steele, Phys. Lett. B243, 413 (1990). 

[28] J. Liu and D. Liu, J. Phys. G:Nucl. Part. Phys. 19, 373 (1993). 

[29] L.S. Kisslinger, J. Gardner and C. Vanderstraeten, Phys. Lett. B410, 1 (1997). 

[30] L.S. Kisslinger, Nucl. Phys. A629, 30c (1998). 

[31] S. Narison, Nucl. Phys. B509, 312 (1998). 

[32] S. Narison, N. Pak and N. Paver, Phys. Lett. B147, 162 (1984). 

[33] E. Bagan, A. Bramon and S. Narison, Phys. Lett. B196, 203 (1987). 

[34] G. Bali et al. (UKQCD), Phys. Lett. B309, 378 (1993); J. Sexton, A. Vaccarino and D. 
Weingarten, Phys. Rev. Lett. 75, 4563 (1995). 

[35] Crystal Barrel Collaboration, Phys. Lett. B 355, 425 (1995). 

[36] C. Amsler and F.E. Close, Phys. Lett. B 353, 385 (1995); Phys. Rev. D 53, 295 (1996). 

[37] R.M. Baltrusaitis et al., Phys. Rev. Lett. 56, 107 (1986). 

[38] BES collaboration, J.Z. Bai et al., Phys. Rev. Lett. 76, 3502 (1996). 

[39] B.S. Zou and D.V. Bugg, Phys. Rev. D50, 591 (1994). 

[40] L.J. Reinders, S. Yazaki and H.R. Rubinstein, Nucl. Phys. B 196, 125 (1982). 

[41] I.I. Balitsky, D.I. Dyakonov and A.V. Yung, Phys. Lett. B112, 71 (1982). 

[42] J. Govaerts, F. de Viron, D. Gusbin and J. Weyers, Nucl. Phys. B248, 1 (1984). 

[43] J.I. Latorre, S. Narison, P. Pascual and R. Tarrach, Phys. Lett. B147, 169 (1984). 

[44] D.R. Thompson et al.[BNL/E852 Collaboration], Phys. Rev. Lett. 79, 1630 (1997). 

[45] F.E. Close and P.R. Page, Nucl. Phys. B443, 233 (1995). 

[46] T. Barnes et al., Phys. Rev. D52, 5242 (1995). 

[47] J.I. Latorre, P. Pascual and S. Narison, Z. Phys. C34, 347 (1987). 

[48] A.P. Martynenko, Sov. J. Nucl. Phys. 54, 488 (1991). 

[49] V.M. Braun, P. Gornicki, L. Mankiewicz and A. Schafer, Phys. Lett. B302, 291 (1993). 

[50] L.S. Kisslinger and Z. Li, Phys. Rev. D51, R5986 (1995). 

[51] L.S. Kisslinger and Z. Li, Phys. Lett. B445, 271 (1999). 

[52] K-C. Yang, W-Y.P. Hwang, E.M. Henley and L.S. Kisslinger, Phys. Rev. D47, 3001 (1993). 

[53] L.S. Kisslinger and Z. Li, Chinese J. Phys. 32, 1213 (1994). 

[54] L.S. Kisslinger and Z. Li, Phys. Rev. Lett. 74, 2168 (1995). 

[55] J. Gasser and H. Leutwyler, Phys. Rep. 87, 77 (1982). 





[56] V.L. Eletsky and B.L. Ioffe, Phys. Rev. D48, 1441 (1993). 

[57] Particle Data Group, Phys. Rev. D45, S1 (1992). 

[58] T. Goldman, K.R. Maltman and G.J. Stephenson Jr., Phys. Lett. 228, 396 (1991). 

[59] L.S. Kisslinger, T. Goldman and Z. Li, Phys. Lett. B416, 263 (1998). 

[60] I.I. Balitsky and A. V. Yung, Phys.Lett. B 129, 328 (1983). 

[61] B.L. Ioffe and A.V. Smilga, Nucl.Phys. B 232, 109 (1984). 

[62] V. M. Belyaev and Ya. I. Kogan, Sov. J. Nucl. Phys. 40, 659 (1984). 

[63] V. M. Belyaev and Ya. I. Kogan, JETP Lett. 37 (1983) 730; Phys. Lett. 136B (1984) 273. 

[64] C. B. Chiu, J. Pasupathy and S. J. Wilson, Phys. Rev. D 32 (1985) 1786. 

[65] E. M. Henley, W-Y. P. Hwang and L. S. Kisslinger, Phys. Rev. D 46 (1992) 431. 

[66] H. He and X. Ji, Phys. Rev. D54 (1996) 6897. 

[67] V. M. Belyaev and A. Oganesian, Phys. Lett. B395 (1997) 307. 

[68] M.B. Johnson and L.S. Kisslinger, Phys. Rev. D57, 2847 (1998). 

[69] H. Jung and L.S. Kisslinger, Nucl. Phys. A586, 682 (1995). 

[70] E. M. Henley, W-Y. P. Hwang and L. S. Kisslinger, Phys. Lett. B367, 21 (1996); B440, 449 
(1998). 

[71] V. M. Belyaev and Ya. I. Kogan, Sov. J. Nucl. Phys. 40, 659 (1984). 

[72] W-Y. P. Hwang, Z. Phys. C75, 701 (1997). 

[73] L.S. Kisslinger, hep-ph/9804320, Phys. Rev. C (1999). 

[74] L.J. Reinders, H.R. Rubinstein and S. Yazaki, Nucl. Phys. B213, 109 (1983); L.J. Reinders, 
Acta Phys. Polon. B15, 329 (1984). 

[75] H. Shiomi and T. Hatsuda, Nucl. Phys. A594, 294 (1995). 

[76] E.G. Adelberger and W.C. Haxton, Ann. Rev. Nucl. Part. Sci. 35, 501 (1985); J. Lang et 
al, Phys. Rev. C34, 1545 (1986). 

[77] C.A. Barnes et al. Phys. Rev. Lett. 40, 840 (1978); H.C. Evans et al, Phys. Rev. Lett. 55, 
791 (1985); Phys. Rev. C35, 1119 (1987); M. Bini et al, Phys. Rev. Lett. 55, 795, (1985); 
Phys. Rev. C38, 1195 (1988). 

[78] B. Desplanques, J.F. Donoghue, and B.R. Holstein, Ann. Phys.(NY) 124,149 (1980). 

[79] V.M. Dubovik and S.V. Zenkin, Ann. Phys. (NY) 172, 100 (1986). 

[80] N. Kaiser and U.G. Meissner, Nucl. Phys. A499, 699 (1989); 510, 1648 (1990); U. Meissner, 
Mod. Phys. Lett. A5, 1703 (1990). 

[81] D.M. Kaplan and M.J. Savage, Nucl. Phys. A556, 653 (1993). 

[82] V.M. Khatsimovskii, Sov. J. Nucl. Phys. 42,781 (1985). 





[83] E.M. Henley, Ann. Rev. Nucl. Sci. 19, 367 (1969); Chinese J. Phys. 30, 1 (1992). 

[84] S.V. Mikhailov and A.V. Radyushkin, Sov. J. Nucl. Phys. 49, 494 (1989). 

[85] A.P. Bakulev and A.V. Radyushkin, Phys. Lett. B271, 223 (1991). 

[86] A.V. Kolesnichenko, Sov. J. Nucl. Phys. 39, 968 (1984). 

[87] V.M. Belyaev and B.Y. Blok, Z. Phys. C30, 279 (1986). 

[88] V.M. Belyaev and B.L. Ioffe, Nucl. Phys. B310, 548 (1988); B313, 647 (1989). 

[89] V.L. Chernyak and A.R. Zhitnitsky, Nucl. Phys. B201, 492 (1982); B214, 547(E) (1983). 

[90] S.V. Mikhailov and A.V. Radyushkin, Phys. Rev. D45, 1754 (1992). 

[91] P.A.M. Dirac, Rev. Mod. Phys. 21, 392 (1949) 

[92] B.D. Keister and W.N. Polyzou, “Relativistic Hamiltonian Dynamics in Nuclear and Particle 
Physics”, Adv. in Nucl. Phys. 20, p. 225. 

[93] V.M. Braun, M. Beyer, T. Mannel and H. Schroder, Proceedings of the IVth International 
Workshop on Progress in Heavy Quark Physics, p. 105. 

[94] I.I. Balitsky, V.M. Braun and A.V. Kolesnichenko, Nucl. Phys. B312, 509 (1989). 

[95] V.M. Braun and I.E. Filyanov, Z. Phys. C44, 157 (1989). 

[96] V.L. Chernyak and I.R. Zhitnitskii, Nucl. Phys. B345, 137 (1990). 

[97] V. Braun and I. Halpern, Phys. Lett. B328, 457 (1994). 

[98] V.M. Belyaev and M.B. Johnson, Phys. Lett. B423, 379 (1998); Mod. Phys. Lett. A13, 2909 
(1998). 

[99] R.L. Jaffe, Nucl. Phys. B229, 205 (1983). 

[100] D. Duke and J. Owens, Phys. Rev. D30, 49 (1984). 

[101] P. Amaudruz et al., Phys. Rev. Lett. 66, 2712 (1991); M. Arneodo et al., Phys. Rev. D 50, R1 (1994). 

[102] E. A. Hawker et. al., Phys. Rev. Lett. 80, 3715 (1998). 

[103] J. C. Peng et. al., Phys. Rev. D 58, 92004 (1998) 

[104] D. A. Ross and C. T. Sachrajda, Nucl. Phys. B149, 497 (1979). 

[105] J. D. Sullivan, Phys. Rev. D 5, 1732 (1972). 

[106] J. Speth and A. W. Thomas, Adv. Nucl. Phys. 24, 83 (1998). 

[107] T. P. Cheng and L.-F. Li, Phys. Rev. Lett. 74, 2872 (1995); “Non-Perturbative QCD Spin 
Studies”, hep-ph/9811279. 

[108] L.S. Kisslinger, hep-ph/9811497. 

[109] C. Itzykson and J-B. Zuber, “Quantum Field Theory” (McGraw-Hill Book Co., 1985). 





[110] C. D. Roberts and A. G. Williams, Prog.Part.Nucl.Phys. 33, 477 (1994), and references 
therein. 

[111] P. Tandy, Prog.Part.Nucl.Phys.39, 117 (1997) , and references therein. 

[112] A. Sharma and A.N. Mitra, hep-ph/9707503; A.N. Mitra, hep-ph/9906288; Int. J. Mod. 
Phys. A14, 4589 (1999). 

[113] M.R. Frank and T. Meissner, Phys.Rev. C 53, 2410 (1996). 

[114] T. Meissner, Phys.Lett. B 405, 8 (1997). 

[115] L.S. Kisslinger and T. Meissner, Phys. Rev. C57, 1528 (1998). 

[116] G. ’t Hooft, Phys. Rev. Lett. 37, 8 (1976); Phys. Rev. D14, 3432 (1976). 

[117] E.V. Shuryak, Nucl. Phys. B203, 93; 116 (1982). 

[118] T. Schafer and E.V. Shuryak, Rev. Mod. Phys. 70, 323 (1998). 

[119] D.I. D’yakonov and V.Yu. Petrov, Sov. Phys. JETP 62, 204 (1985). 

[120] D.I. D’yakonov and V.Yu. Petrov, Sov. Phys. JETP 62, 431 (1985). 

[121] P.V. Pobylitsa, Phys. Lett. B226, 387 (1989). 

[122] H. Forkel and M.K. Banerjee, Phys. Rev. Lett. 71, 484 (1993). 

[123] M. Aw, M.K. Banerjee and H. Forkel, hep-ph/9902458, to be published in Phys. Lett. B 
(1999). 

[124] L.S. Kisslinger, M. Aw, A. Harey and O. Linsuain, hep-ph/9906457. 

[125] M.V. Polyakov and C. Weiss, Phys. Lett. B387 (1996) 841. 

[126] J.D. Walecka, Ann. Phys. 83, 491 (1974). 

[127] B.D. Serot and J.D. Walecka, Adv. in Nucl. Phys. 16, 1 (1986). 

[128] E.G. Drukarev and E.M. Levin, ZhETF Lett. 48, 307 (1988); Sov. Phys. JETP 68, 680 
(1989); Nucl. Phys. A511, 679 (1990), A516, 715(E) (1990); Progress in Particle and Nuclear 
Physics, 27, 77 (1991). 

[129] T.D. Cohen, R.J. Furnstahl, and D.K. Griegel, Phys. Rev. Lett. 67, 961 (1991); X. Jin, M. 
Nielsen, T.D. Cohen, R.J. Furnstahl, and D.K. Griegel, Phys. Rev. C49, 464 (1994), gives 
references to other publications of this group. 

[130] E.M. Henley and J. Pasupathy, Nucl. Phys. A556, 467 (1993). 

[131] T.P. Cheng, Phys. Rev. D13, 2161 (1976). 

[132] X. Jin, Phys. Rev. C51, 2260 (1995). 

[133] L.S. Kisslinger and W. Wang, Phys. Rev. Lett. 30, 1071 (1973); Ann. Phys. (N.Y.) 99, 374 
(1976); A. Saharia, R. M. Woloshyn and L.S. Kisslinger, Phys. Rev. C23, 2141 (1981). 

[134] T. Hatsuda and S.H. Lee, Phys. Rev. C46, R34 (1992). 

[135] V.L. Eletsky and B.L. Ioffe, Phys. Rev. Lett. 78, 1010 (1997). 

[136] T. Hatsuda, H. Shiomi and H. Kuwabara, Prog. Theor. Phys. 95, 1009 (1996). 

[137] P. Braun-Munzinger and J. Stachel, Nucl. Phys. A638 (1997) 3c. 





[138] T. Blum, et al, Phys. Rev. D51, 5133 (1995). 

[139] K. Geiger and D.K. Srivastava, Phys. Rev. C56, 2718 (1997). 

[140] A. I. Bochkarev and M. E. Shaposhnikov, Nucl. Phys. B268, 220 (1986). 

[141] J.I. Kapusta, Nucl. Phys. B148, 461 (1979). 

[142] H. G. Dosch and S. Narison, Phys. Lett. B203, 155 (1988). 

[143] R. J. Furnstahl, T. Hatsuda and Su H. Lee, Phys. Rev. D42, 1744 (1990). 

[144] V. Gimenez, J. Bordes and J. Penarrocha, Nucl. Phys. B357, 3 (1991). 

[145] M.B. Johnson and L.S. Kisslinger, hep-ph/9908322 



30. Light-Front Dynamics 


V.A. Karmanov* 

Lebedev Physical Institute, Leninsky Prospekt 53, 117924 Moscow, Russia 


Abstract 

The wave function in relativity is defined, in four-dimensional space, on a space-like 
three-dimensional plane. The plane closest to the time-like region is the light-front plane 
ct + z = 0. The corresponding dynamical approach, the light-front dynamics, has considerable 
advantages. We describe, in a field-theoretical framework, the construction of light-front 
dynamics and illustrate it with some examples. 


1 Introduction 

A few centuries ago Galileo Galilei discovered that rectilinear motion is indistinguishable 
from rest. Two observers, the laboratory observer and the moving one, carrying out the same 
experiments, obtain the same results. This discovery is deeply consistent with our intuition: the 
observer in an isolated laboratory does not interact with the environment and, hence, has no 
way to learn about his motion. 

At the beginning of the twentieth century the existence of a limiting velocity was established. This 
is the speed of light c. Nothing can move faster. This discovery was also very consistent with our 
intuition. Indeed, if the limiting velocity did not exist, a very distant part of the Universe could 
have an immediate influence on us. This seems unnatural. 

According to the Galilei principle, the limiting speed should be the same in any moving system 
of reference. Otherwise, the observer would be able to notice his motion by measuring the speed of light. 
However, this seems paradoxical from the point of view of our everyday experience. Pursuing the light, 
we can accelerate our system almost up to the speed of light, but the light still runs away with the 
same speed c. 

Einstein discovered that the Galilei principle is reconciled with the existence of the limiting
velocity by a change of the properties of space and time in a moving system relative to the one at
rest. For both observers the space-time in their own systems is the same, but for the observer in the
rest system the space-time in the moving system looks different from his own. In particular, when
the speed v of the moving system approaches c, the laboratory observer notices that the clock
in this system runs slow relative to his own. In turn, the observer in the moving system sees a
similar effect: from his point of view the time in the rest system slows down and almost stops when
his speed approaches c. Not only the clock, but any physical process observed from the moving system
is slowed to a stop as well. To describe the physical phenomena, the laboratory observer can
naturally use his own clocks and space scales. However, if he chooses, he can use the clocks and
space scales of the moving system. The two systems are equivalent, but the two descriptions are
different. The dilation of time can be used, in a theoretical laboratory, to make an "instant photo"
of a fast, subnuclear physical process. "Stopping" the time, i.e., stopping the process, one gains a
big advantage in studying the fastest processes, which proceed with speeds close to c. This
dependence on the choice of the reference frame is, in other words, a dependence on the choice of
the space-time coordinates. In different coordinates the dynamical description of a system is
different. In this way we get the different forms of dynamics.


*Email: karmanov@sci.lebedev.ru





One of these forms, the light-front dynamics (LFD), is a very efficient tool for investigating field
theory and, in this framework, relativistic composite systems (hadrons in the quark models,
nuclei at relativistic relative nucleon momenta). In this article we will show how LFD is
constructed, explain its principal properties and its relations to other approaches, and give some
applications. There are also many phenomenological applications of LFD. They are beyond the
scope of the present paper.

2 Forms of relativistic dynamics 

In his famous article [1] Dirac analysed three forms of dynamics: the instant form, the point form 
and the front one. 

From the group-theoretical point of view, the transformations of the system of reference, including
the translations, rotations and the Lorentz transformations, form the Poincaré group.
Under the infinitesimal transformation $g$ of the coordinate system with the translation parameters
$a_\mu$ and with the four-dimensional rotation parameters $\epsilon_{\mu\nu}$,
the state vector is transformed as follows:

$$\phi \to U(g)\phi, \tag{1}$$

where

$$U(g) = 1 + iP_\mu a^\mu + \frac{i}{2}J_{\mu\nu}\epsilon^{\mu\nu}. \tag{2}$$

The four translation generators $P_\mu$ are the operators of the four-momentum. The six generators
$J_{\mu\nu}$ of the rotations and the Lorentz transformations are the operators of the
four-dimensional angular momentum. The commutation relations between them have the form:

$$\begin{aligned}
[P_\mu, P_\nu] &= 0, \\
\frac{1}{i}[P_\mu, J_{\kappa\rho}] &= g_{\mu\rho}P_\kappa - g_{\mu\kappa}P_\rho, \\
\frac{1}{i}[J_{\mu\nu}, J_{\rho\gamma}] &= g_{\mu\rho}J_{\nu\gamma} - g_{\nu\rho}J_{\mu\gamma} + g_{\nu\gamma}J_{\mu\rho} - g_{\mu\gamma}J_{\nu\rho}.
\end{aligned} \tag{3}$$

The total angular momentum of the system is determined by the Pauli-Lubanski vector:

$$S_\mu = \frac{1}{2}\epsilon_{\mu\nu\rho\gamma}P^\nu J^{\rho\gamma}.$$

The state vector corresponding to a system with definite four-momentum $p_\mu$, mass $M$, total
angular momentum $J$ and its projection $\lambda$ on the z-axis satisfies the following system of equations:

$$\begin{aligned}
P_\mu\phi^{J\lambda}(p) &= p_\mu\phi^{J\lambda}(p), \\
S^2\phi^{J\lambda}(p) &= -M^2J(J+1)\phi^{J\lambda}(p), \\
S_3\phi^{J\lambda}(p) &= M\lambda\phi^{J\lambda}(p).
\end{aligned} \tag{4}$$

A particular dynamical system is determined by the explicit form of these generators, i.e., by
a particular solution of the commutation relations (3). If these generators are expressed in terms
of the particle coordinates, we get a version of relativistic quantum mechanics with a fixed number
of particles. If the generators are expressed through the quantum fields, we obtain a form of
quantum field theory. As soon as the generators are known, the state vector is determined by
eqs. (4). For an interacting system some Poincaré generators contain the interaction. Namely, the
generators changing the position of the surface where the state vector is defined contain the
interaction. The generators which do not change the position of the surface do not contain the
interaction and coincide with the generators of the free system. Using this property, one can
classify the different forms of dynamics.





2.1 Instant form 

The laboratory observer studies the physical processes in the four-dimensional space-time
continuum described by the coordinates $x = (t, \vec{r})$. The three-dimensional space $\vec{r}$ is
a plane given by the equation $t = \text{const}$. The observer studies the evolution of his physical
system from one plane $t = \text{const}$ to another. The wave function of a quantum system, for a
given $t$, is defined on this (three-dimensional) plane.

This description in four-dimensional space, from one equal-time plane to another, corresponding
to the different time instants $t = \text{const}$, is called the instant form of dynamics. In our
everyday life we always use the instant form.

The time translations of the three-dimensional plane are determined by the Hamiltonian $H = P_0$.
The interaction also enters into the three generators of the Lorentz transformations $J_{i0}$,
$i = 1, 2, 3$. Indeed, two events that are simultaneous in one system of reference are not
simultaneous in a moving system. Therefore, the Lorentz transformations do not leave the plane
$t = \text{const}$ invariant: they change the orientation of this plane relative to the time axis.
This is the reason why the corresponding generators contain the interaction.

The other six generators, those of the translations and rotations inside the three-dimensional
space, namely $\vec{P}$ and $J_i = \frac{1}{2}\epsilon_{ijk}J^{jk}$, coincide with the generators of the free system.

The instant form of dynamics is widely used for relativistic generalizations of quantum
mechanics.

2.2 Point form 

In principle, one can define the wave function not only on a plane, but on any space-like surface.
No two points of this surface can be connected by a light signal and, hence, an event at
one of these points cannot be a cause of an event at another. A convenient choice is the surface of
the hyperboloid $t^2 - \vec{r}^{\,2} = \text{const}$. It is invariant under the Lorentz transformations.
With the state vector defined on the family of these hyperboloids, we obtain the point form of dynamics.

In the point form the rotations and the Lorentz transformations do not change the hyperboloid
$t^2 - \vec{r}^{\,2} = \text{const}$. Therefore none of the six generators $J_{\mu\nu}$ contains the
interaction. The translations, however, are much more complicated, and all the generators $P_\mu$
contain the interaction. This means that the total momentum of a system is not the sum of the
particle momenta. This complicates the situation, in spite of the simplification of the Lorentz boosts.


2.3 Front form 

The observer moving with the velocity $v$ along the z-axis (in the negative z direction) describes a
physical process in his coordinates, which are related to the laboratory ones by the Lorentz
transformations:

$$z' = \frac{z + vt}{\sqrt{1 - v^2/c^2}}, \quad t' = \frac{t + zv/c^2}{\sqrt{1 - v^2/c^2}}, \quad x' = x, \quad y' = y. \tag{5}$$

According to (5), the plane $t' = \text{const}$ in the moving system corresponds to
$t + zv/c^2 = \text{const}$ in the laboratory coordinates. The evolution is considered from one plane
$t + zv/c^2 = \text{const}$ to another. Since the value of const is not yet specified, the factor
$1/\sqrt{1 - v^2/c^2}$ can be absorbed by it. For the "null plane" we put $t' \propto t + zv/c^2 = 0$.
In the limiting case, when $v \to c$, we get the plane determined by the equation
$t' \propto t + z/c = 0$. The wave function is defined on this plane.
This equation coincides with the equation for the light front $z = -ct$, moving along $-z$. This is
the reason why the description in these coordinates is called the front form of dynamics, or the
light-front dynamics.

We emphasize that there are two equivalent points of view on LFD. On the one hand, we
can study the system in the instant form, i.e., at $t' = 0$, but from the point of view of the system
of reference moving with the limiting speed c. This system of reference is called the "infinite
momentum frame". One can equivalently describe the same system in the "normal", laboratory
frame, but in the light-front coordinates: here $z_+ = t + z$ plays the role of the light-front
"time", $z_- = t - z$ is a coordinate in the light-front plane, and we now choose units with
$c = 1$. The first approach is more convenient for intuition, the second one is more appropriate for
technical developments. Both differ from the instant form, $t = 0$, in the laboratory system.
Both should give, in principle, the same results as the instant form, but, as we will see, in a
simpler way.

From the group-theoretical point of view, in the front form of dynamics only three generators,
$P_-$, $J_{1-}$, $J_{2-}$, do not leave the light-front plane invariant and contain the interaction.
The other seven generators, $P_1$, $P_2$, $P_+$, $J_{12}$, $J_{+-}$, $J_{1+}$, $J_{2+}$, are the free ones.

Note also that, for a free particle, the relation between the energy and momentum, $p^2 = m^2$,
can be rewritten in the light-front coordinates as $p_+p_- - \vec{p}_\perp^{\,2} = m^2$ (with
$\vec{p}_\perp = (p_1, p_2)$). So, the light-front energy $p_-$ of a free particle is expressed through
the momentum as:

$$p_- = \frac{\vec{p}_\perp^{\,2} + m^2}{p_+}.$$

This expression does not contain any square root, in contrast to the instant form.
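
As a numerical illustration of the last formula, here is a minimal Python sketch (not part of the original text). It assumes units with c = 1 and the convention p_+ = p_0 + p_z, p_- = p_0 - p_z, mirroring z_+ = t + z and z_- = t - z. The function names and the sample momentum are hypothetical and chosen only for illustration.

```python
import math


def light_front_momenta(mass, px, py, pz):
    """Return (p_plus, p_minus) for an on-shell particle, with c = 1."""
    energy = math.sqrt(mass**2 + px**2 + py**2 + pz**2)  # instant-form energy
    return energy + pz, energy - pz


def light_front_energy(mass, px, py, p_plus):
    """Light-front energy p_minus from the square-root-free dispersion law."""
    return (px**2 + py**2 + mass**2) / p_plus


# Sample values (illustrative only).
M, PX, PY, PZ = 0.938, 0.3, -0.2, 1.5
P_PLUS, P_MINUS = light_front_momenta(M, PX, PY, PZ)
P_MINUS_RATIONAL = light_front_energy(M, PX, PY, P_PLUS)
print(P_MINUS, P_MINUS_RATIONAL)
```

Both printed numbers should agree to rounding accuracy: the square root is needed only to put the particle on its mass shell in the instant form, while the light-front energy itself is a rational function of p_+ and the transverse momentum.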

2.4 Why LFD? 

The main difficulty of quantum field theory is the very complicated structure of the state
vector describing the particles and even the state without any particles, the vacuum state. The
state vector is usually described as a superposition of bare quanta, corresponding to the
non-interacting fields. If we "switch off" the interaction between the fields, the number of
particles is conserved. As soon as we take the interaction into account, the state vector becomes a
superposition of states with different numbers of particles.

If the interaction is weak, as in the case of quantum electrodynamics, it does not change the
state vector too much. Therefore, the "dressed" electron differs from the bare one only by a small
admixture of photons.

The situation is drastically different when the interaction is strong. In this case, the structure
of the real particle is extremely complicated. For example, the proton consists of three quarks,
but these quarks are not the same quarks that appear in the initial Lagrangian of Quantum
Chromodynamics (QCD). They are the so-called constituent quarks, which, in their turn, consist
of the bare quarks and gluons. The state vector of the proton is a huge superposition of the
bare fields. It has not yet been calculated from the first principles of QCD.

One should emphasize that not only the proton state, but also the state without physical
particles, the vacuum state, is, from the point of view of the laboratory observer, a complicated
superposition of bare particles, or, in other words, of fluctuations of the bare fields. At the
same time, this description of emptiness in terms of very complicated conglomerates of particles
seems unnatural. It would be much better to work in an approach in which the vacuum is indeed
nothing but emptiness. Simplifying the vacuum wave function, we simplify not only it, but also
the wave function of the proton and of other particles, eliminating from them, as in the vacuum
wave function, the fluctuations of fields. After that one can study the real, physical structure of
particles.

The vacuum is nothing but emptiness just in the light-front dynamics. This is the principal
advantage of this approach.

Qualitatively this can be understood from the point of view of the uncertainty principle for energy
and time. Consider the fluctuation creating three particles from the vacuum. The fluctuation with the
energy $\Delta E = \varepsilon_{k_1} + \varepsilon_{k_2} + \varepsilon_{k_3}$ may occur for the time
$\Delta t \approx \hbar/\Delta E$ (here $\varepsilon_k = \sqrt{m^2 + \vec{k}^2}$). In the
infinite momentum frame the momenta $\vec{k}_i$ and the energies $\varepsilon_{k_i}$ of all the
particles increase, and $\Delta E$ tends to infinity. Therefore, the time of fluctuation $\Delta t$
tends to zero. The contribution of this fluctuation to the vacuum wave function disappears.
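
The estimate above can be illustrated by a small numerical sketch in Python (not part of the original text). It assumes units with c = 1 and hbar = 1, three particles of equal mass created with zero total momentum in the laboratory frame, and a boost along z with Lorentz factor gamma. All names and sample momenta are hypothetical.

```python
import math


def boosted_energy(mass, momentum, gamma):
    """Energy of an on-shell particle seen from a frame boosted along z."""
    kx, ky, kz = momentum
    energy = math.sqrt(mass**2 + kx**2 + ky**2 + kz**2)
    velocity = math.sqrt(1.0 - 1.0 / gamma**2)  # boost speed in units of c
    return gamma * (energy + velocity * kz)


def fluctuation_time(mass, momenta, gamma, hbar=1.0):
    """Uncertainty-principle estimate hbar / Delta E in the boosted frame."""
    delta_e = sum(boosted_energy(mass, k, gamma) for k in momenta)
    return hbar / delta_e


MASS = 1.0
# Three momenta adding up to zero, as for a fluctuation out of the vacuum.
MOMENTA = [(0.3, 0.0, 0.1), (-0.1, 0.2, -0.4), (-0.2, -0.2, 0.3)]
GAMMAS = [1.0, 10.0, 100.0, 1000.0]
TIMES = [fluctuation_time(MASS, MOMENTA, g) for g in GAMMAS]

for g, t in zip(GAMMAS, TIMES):
    print(f"gamma = {g:7.1f}   Delta t = {t:.3e}")
```

Since the total momentum of the fluctuation vanishes, Delta E simply grows like gamma, so the allowed fluctuation time falls off like 1/gamma as the frame approaches the infinite momentum frame.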





This result is quite consistent with the above-mentioned change of the space-time properties
in the moving system. Due to the time dilation, all the physical processes are slowed down, and the
fluctuation has no time to occur. This means that in the thought experiment in the infinite
momentum frame we study the particles prepared "far in advance", not spoiled by the vacuum
fluctuations.

As already emphasized above, one can directly formulate the theory in the light-front variables,
without taking any infinite momentum frame limit. This formulation includes the rules of the
graph technique, which allow one to calculate the amplitudes. In principle, they could contain
vertices corresponding to vacuum fluctuations. We will see below that in LFD these vertices do not
appear. This is the quantitative manifestation of the disappearance of the vacuum fluctuations.
In LFD, the bare vacuum state, i.e., the eigenstate of the free Hamiltonian, is also an eigenstate
of the full Hamiltonian, containing the interaction. This property manifests itself in the formalism
of LFD.


2.5 LFD and relativistic quantum mechanics 

The dynamics of a nonrelativistic quantum system is determined by the Schrödinger equation with
an appropriate interaction Hamiltonian. A similar construction is developed for relativistic
quantum mechanical models. These models are based not on field theory, but on a construction
of relativistic phenomenological Hamiltonians in terms of the particle coordinates. The difference
from the nonrelativistic case is that in the relativistic case the interaction
enters into several generators, so we get several "Hamiltonians". For example, in the front form, the
"potential" is introduced in the generators $P_-$, $J_{1-}$, $J_{2-}$. It has to be introduced in a
self-consistent way, since the generators should satisfy the proper commutation relations of the
Poincaré group. In this scheme one can fit the phenomenological potential, for example, between two
nucleons, and then describe the properties of the two-nucleon system: the deuteron wave function,
the electromagnetic form factors, etc. The approach is also generalized to the three-body case. One
can find the details in the review papers and books [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. For the
applications of the point form of dynamics to deep inelastic scattering see the paper [12].

Below in this article we concentrate on the field-theoretical approach in the framework of LFD.
Many other details can also be found in the above review papers.


2.6 Explicitly covariant LFD 

Together with the big advantage of the simple vacuum structure, the light-front dynamics with the
light-front plane $z_+ = t + z = 0$ has a disadvantage: the coordinates x, y and z appear in a
non-symmetric way. Because of that the theory loses explicit relativistic and rotational covariance.
For example, in perturbation theory, the amplitude in a given order is determined by the sum of
several time-ordered graphs which differ from each other by the relative time order of the
interaction vertices. Their sum is covariant, but any particular term in this sum is not.
So, we deal with a theory which provides, in principle, covariant final results, but not covariant
intermediate ones. Because of approximations, the covariance of the final results can also be lost.

In spite of this inconvenience, LFD is applied in many papers to QCD, to hadrons in quark
models and to relativistic nuclear physics. The applications to light-front QCD and other
references can be found, in particular, in [10, 13, 14]. Note that in the paper [15] it was shown that
the constituent quark picture with logarithmic confinement naturally appears in weak-coupling
light-front QCD. The applications to the relativistic composite systems (hadrons and nuclei) and
the corresponding references can be found in the above review papers. The rules of the graph
technique for light-front quantum electrodynamics, alternative to the Feynman ones, were
developed in [16, 17]. It has been demonstrated that light-front QED reproduces the results
obtained in the Feynman approach (such as the anomalous magnetic moment of the electron, etc.).

To avoid the inconvenience related to the absence of covariance, the explicitly covariant
version of LFD has been proposed [18] (for a review see [11]). In this version the state vector is
defined on the light-front plane of general position, given by the equation
$\omega\cdot x = \omega_0 t - \vec{\omega}\cdot\vec{r} = 0$, where $\omega = (\omega_0, \vec{\omega})$
is a four-vector with $\omega^2 = \omega_0^2 - \vec{\omega}^2 = 0$. This is a generalization of the standard
light-front approach. The latter corresponds to the particular value $\omega = (1, 0, 0, -1)$.
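
A short Python sketch (an illustration added here, not taken from the original) may help to fix the conventions. It assumes the metric signature (+, -, -, -) and units with c = 1, and checks that the standard choice of the four-vector is light-like and reproduces the plane t + z = 0. The helper names are hypothetical.

```python
def minkowski_dot(a, b):
    """Scalar product of two four-vectors with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def light_front_vector(direction):
    """Light-like four-vector (1, n) built from a spatial direction n."""
    nx, ny, nz = direction
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    return (1.0, nx / norm, ny / norm, nz / norm)


# Standard light-front orientation: omega = (1, 0, 0, -1).
OMEGA_STANDARD = light_front_vector((0.0, 0.0, -1.0))
# A light front of general position (arbitrary illustrative direction).
OMEGA_GENERAL = light_front_vector((1.0, 2.0, 2.0))

X = (2.0, 0.5, -1.0, 3.0)  # sample space-time point (t, x, y, z)
print(minkowski_dot(OMEGA_STANDARD, OMEGA_STANDARD))  # omega squared
print(minkowski_dot(OMEGA_STANDARD, X), X[0] + X[3])  # omega.x versus t + z
```

For any spatial direction the vector built this way is light-like, so a rotation of the reference frame only changes the components of the four-vector and never takes it off the light cone.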

Covariance means that, for example, any four-vector can be transformed from one system
of reference to another by a standard matrix, which depends only on the kinematical parameters
relating the two systems of reference. Therefore, this matrix is one and the same for all four-vectors.

The absence of explicit relativistic covariance in the standard version of LFD is related to
the fact that the state vector depends dynamically on the orientation of the light-front plane. As
mentioned above, the generators of the corresponding transformations contain the interaction.
Rotating the system of reference, we rotate this plane. So, in the standard approach, with the
light-front plane $t + z = 0$, there is no universal kinematical transformation law for the
light-front state vector.

In the explicitly covariant version of LFD the kinematical transformations of the system of
reference are separated from the dynamical transformations of the light-front plane. So, all the
transformations of the reference system are kinematical ones. This restores the explicit covariance.
At the same time, the dependence of the state vector on the orientation of the light front remains
dynamical. This orientation is determined by the direction of the four-vector $\omega$. The
dependence of the state vector on the light-front orientation is now nothing but the dependence on
the four-vector $\omega$. Therefore, the theory remains explicitly covariant.

In this scheme one can construct two sets of Poincaré generators: (i) the generators responsible
for transformations of the state vector under transformations of the reference system; they
are kinematical and do not contain the interaction; (ii) the generators responsible for
transformations of the state vector under translations and rotations of the light-front plane; they
are dynamical and contain the interaction. The construction of these generators is given in the
Appendix. Group-theoretical aspects of the explicitly covariant LFD are clarified in the paper [19].


3 S-matrix 

In the instant form, the S-matrix $S(-\infty, t)$ gives the time evolution of the wave function,
defined at $t = -\infty$, to the time $t$. The S-matrix $S(-\infty, +\infty)$ gives the scattering
amplitude. In LFD, this evolution takes place from one light-front plane to another, in the
direction of the light-front time.

As usual, the S-matrix is derived from the time-dependent Schrödinger equation in the
"interaction representation":

$$i\frac{\partial\psi(t)}{\partial t} = H^{int}(t)\psi(t), \tag{6}$$

where

$$H^{int}(t) = \int H^{int}(\vec{x}, t)\,d^3x \tag{7}$$

is the interaction Hamiltonian and $H^{int}(x)$ is the Hamiltonian density. We consider the
example of the self-interacting scalar field: $H^{int}(x) = -g\varphi^3(x)$. In the interaction
representation the field $\varphi(x)$ is the free field:

$$\varphi(x) = \frac{1}{(2\pi)^{3/2}}\int\left[a(\vec{k})\exp(-ik\cdot x) + a^\dagger(\vec{k})\exp(ik\cdot x)\right]\frac{d^3k}{\sqrt{2\varepsilon_k}}. \tag{8}$$

Here $a^\dagger$ and $a$ are the creation and annihilation operators satisfying the commutation
relation $[a(\vec{k}), a^\dagger(\vec{k}')] = \delta^{(3)}(\vec{k} - \vec{k}')$.
The S-matrix is obtained as the formal solution of (6):

$$S = T\exp\left[-i\int H^{int}(x)\,d^4x\right]. \tag{9}$$





The T-product orders the operators in the ordinary time $t$. The perturbation theory is obtained
by expanding (9) in a series in powers of the coupling constant. To any given term one may put in
correspondence a Feynman diagram and calculate the corresponding amplitude by the
standard Feynman rules. In this way, the Feynman propagator appears as the vacuum expectation
value of the T-product:

$$G(x - x') = i\langle 0|T(\varphi(x)\varphi(x'))|0\rangle.$$

Its Fourier transform is just the Feynman propagator:

$$\frac{1}{m^2 - p^2 - i0} = \int G(x)\exp(ip\cdot x)\,d^4x.$$

Another way to calculate the S-matrix is to develop the time-ordered perturbation theory. For
this purpose, following [20, 21] (for a review see [22]), one should replace in (9) the time-ordering
operator T by the explicit time ordering. Namely, one can represent (9) as:

$$S = 1 + \sum_n(-i)^n\int H^{int}(x_1)\theta(t_1 - t_2)H^{int}(x_2)\ldots\theta(t_{n-1} - t_n)H^{int}(x_n)\,d^4x_1\ldots d^4x_n. \tag{10}$$

In this way, the Feynman propagators are replaced by the vacuum expectation values of the product
of the operators, $\langle 0|\varphi(x)\varphi(x')|0\rangle$. There is no T-product here, since it is
taken into account by the theta-functions. In the momentum space, with

$$\tilde\varphi(k) = \frac{1}{(2\pi)^{5/2}}\int\varphi(x)\exp(-ik\cdot x)\,d^4x = \left[a(-\vec{k})\theta(-k_0) + a^\dagger(\vec{k})\theta(k_0)\right]\sqrt{2\varepsilon_k}\,\delta(k^2 - m^2),$$

this results in the contraction:

$$\overbrace{\tilde\varphi(k)\tilde\varphi(p)} = \tilde\varphi(k)\tilde\varphi(p) - {:}\tilde\varphi(k)\tilde\varphi(p){:} = \theta(p_0)\delta(p^2 - m^2)\delta^{(4)}(p + k). \tag{11}$$

We would like to emphasize that the propagator (11) contains the delta-function $\delta(p^2 - m^2)$, and
therefore in the time-ordered graph technique all particles are always on their mass shells. It
is convenient in the following to replace $\theta(p_0)$ in the propagator (11) by $\theta(\omega\cdot p)$.
This is always possible, since $p^2 = m^2 > 0$.

This method results in the so-called old-fashioned perturbation theory. The amplitudes are
represented by time-ordered graphs. Instead of the Feynman propagators, they contain in the
denominators the energy differences between the initial and intermediate states. The detailed
derivation for an arbitrary space-like plane is given in [20, 21, 22]. Namely, in the paper [20] the
state vector is considered as evolving on the family of planes $\lambda\cdot x = \sigma$, where
$\lambda = (\lambda_0, \vec{\lambda})$, $\lambda^2 = 1$. The old-fashioned perturbation theory is
obtained from the graph technique developed in [20, 21, 22] as a particular case at
$\lambda = (1, \vec{0})$. The same method is applied to the case of ordering in the light-front
time and gives the amplitudes in LFD. It is the latter case that will be considered in detail below.
Here we illustrate with a simple example the difference between the Feynman and the usual
time-ordered amplitudes. The amplitude for the exchange of a particle in the s-channel can be
represented in two different forms. The Feynman amplitude is:

$$M = \frac{g^2}{m^2 - (k + p)^2}.$$

It corresponds to two terms in the old-fashioned perturbation theory:

$$M = M_a + M_b = \frac{g^2}{2\varepsilon_{k+p}[\varepsilon_{k+p} - \varepsilon_k - \varepsilon_p]} + \frac{g^2}{2\varepsilon_{k+p}[\varepsilon_{k+p} + \varepsilon_k + \varepsilon_p]}. \tag{12}$$





The two terms in (12) correspond to two time-ordered graphs; the second one arises from the vacuum
fluctuation. It disappears in the infinite momentum frame (since
$\Delta E = \varepsilon_k + \varepsilon_p + \varepsilon_{k+p} \to \infty$) and
in the light-front dynamics (see below).
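
The decomposition of the s-channel amplitude into two time-ordered terms can be checked numerically. The following Python sketch is an illustration only: it assumes equal masses m for all three particles, the normalization M = g^2 / (m^2 - (k + p)^2) for the Feynman amplitude, and energy denominators of the form 2 e_{k+p} (e_{k+p} -/+ (e_k + e_p)) for the two orderings. Function names and sample momenta are hypothetical.

```python
import math


def energy(mass, p):
    """On-shell energy for a three-momentum p."""
    return math.sqrt(mass**2 + sum(c * c for c in p))


def feynman_amplitude(g, m, k, p):
    """s-channel exchange, g^2 / (m^2 - (k + p)^2)."""
    total = [a + b for a, b in zip(k, p)]
    w = energy(m, k) + energy(m, p)           # total initial energy
    s = w * w - sum(c * c for c in total)     # invariant (k + p)^2
    return g * g / (m * m - s)


def time_ordered_terms(g, m, k, p):
    """The two old-fashioned perturbation theory terms (M_a, M_b)."""
    total = [a + b for a, b in zip(k, p)]
    e_int = energy(m, total)                  # energy of the exchanged particle
    w = energy(m, k) + energy(m, p)
    m_a = g * g / (2.0 * e_int * (e_int - w))  # ordinary time ordering
    m_b = g * g / (2.0 * e_int * (e_int + w))  # vacuum-fluctuation ordering
    return m_a, m_b


def vacuum_term_ratio(pz, g=1.0, m=1.0):
    """|M_b / M_a| for two particles with large common momentum pz along z."""
    m_a, m_b = time_ordered_terms(g, m, (0.1, 0.0, pz), (-0.1, 0.0, pz))
    return abs(m_b / m_a)


K, P = (0.3, -0.2, 0.5), (-0.1, 0.4, 0.2)
M_FEYNMAN = feynman_amplitude(1.0, 1.0, K, P)
M_A, M_B = time_ordered_terms(1.0, 1.0, K, P)
print(M_FEYNMAN, M_A + M_B)
print(vacuum_term_ratio(1.0), vacuum_term_ratio(1000.0))
```

The first line of output compares the Feynman amplitude with the sum of the two orderings. The second shows how the relative weight of the vacuum-fluctuation term shrinks when both incoming particles carry a large momentum along z, which mimics the approach to the infinite momentum frame.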

Now consider the graph technique ordered in the light-front time. As mentioned, the
LFD Hamiltonian is defined on the light-front plane $\omega\cdot x = \sigma$, where $\sigma$ is the
light-front time. Therefore, in the case of scalar fields, the integral over $d^3x$ in (7) is
replaced by the integration over the light-front plane:

$$H^{int}(\sigma) = \int H^{int}_\omega(x)\,\delta(\omega\cdot x - \sigma)\,d^4x. \tag{13}$$

The S-matrix still has the form (9), but now the T-product orders the operators in the direction
of $\omega$:

$$S = T_\omega\exp\left[-i\int H^{int}_\omega(x)\,d^4x\right]. \tag{14}$$

The expression (14) is then explicitly represented in terms of the light-front time
$\sigma = \omega\cdot x$. Instead of (10) we get:

$$S = 1 + \sum_n(-i)^n\int H^{int}_\omega(x_1)\theta(\omega\cdot(x_1 - x_2))H^{int}_\omega(x_2)\ldots\theta(\omega\cdot(x_{n-1} - x_n))H^{int}_\omega(x_n)\,d^4x_1\ldots d^4x_n. \tag{15}$$

The index $\omega$ at $H^{int}_\omega$ indicates that $H^{int}$ and $H^{int}_\omega$ may differ from
each other in order to provide the equivalence between (9) and (14). The region where this can
happen is a line on the light cone. Indeed, if $(x_1 - x_2)^2 > 0$, the signs of
$\omega\cdot(x_1 - x_2)$ and $t_1 - t_2$ are the same and hence the orderings $T$ and $T_\omega$
coincide. If $(x_1 - x_2)^2 < 0$, the operators commute:

$$[H^{int}(x_1), H^{int}(x_2)] = 0,$$

and their relative order has no significance. On the light cone, i.e. if $(x_1 - x_2)^2 = 0$,
$\omega\cdot(x_1 - x_2)$ can be equal to zero while $t_1 - t_2$ may be different from zero. If the
integrand has no singularity at $(x_1 - x_2)^2 = 0$, this line does not contribute to the integral
over the volume $d^4x$. However, if the integrand is singular, some care is needed. To eliminate the
influence of this region on the S-matrix, we have introduced in (15) a new Hamiltonian
$H^{int}_\omega$ such that expressions (9) and (15) are equal to each other. The form of
$H^{int}_\omega$ which provides this equivalence depends on the singularity of the commutator at
$(x_1 - x_2)^2 = 0$. For scalar fields, the singularity is weak enough, and the expressions (9) and
(15) are the same, so that $H^{int}_\omega = H^{int}$. For fields with spins
1/2 and 1 or with derivative couplings, the equivalence is obtained with $H^{int}_\omega$ differing
from $H^{int}$ by an additional contribution (counter term) leading to contact terms in the
propagators (the so-called instantaneous interaction) [11].

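Introducing the Fourier transform of the Hamiltonian:

$$\tilde H_\omega(p) = \int H^{int}_\omega(x)\exp(-ip\cdot x)\,d^4x, \tag{16}$$

and using the integral representation for the $\theta$-function:

$$\theta(\omega\cdot(x_1 - x_2)) = \frac{1}{2\pi i}\int_{-\infty}^{+\infty}\frac{\exp(i\tau\omega\cdot(x_1 - x_2))}{\tau - i\epsilon}\,d\tau, \tag{17}$$

we can transform the expression (15) to the form:

$$S = 1 - i\tilde H_\omega(0) + \sum_{n\ge 2}(-i)^n\int\tilde H_\omega(-\omega\tau_1)\frac{d\tau_1}{2\pi i(\tau_1 - i\epsilon)}\tilde H_\omega(\omega\tau_1 - \omega\tau_2)\ldots\frac{d\tau_{n-1}}{2\pi i(\tau_{n-1} - i\epsilon)}\tilde H_\omega(\omega\tau_{n-1}). \tag{18}$$

The variable $\tau$ appears here as an auxiliary variable, as defined in eq. (17); $\omega\tau$ has the
dimension of a momentum.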
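
To make the passage from (15) to (18) more transparent, the second-order term can be worked out explicitly. The following LaTeX sketch is an added illustration. It assumes the Fourier transform convention of eq. (16) and the normalization 1/(2 pi i) of the theta-function representation in eq. (17), and keeps the operators in their original order.

```latex
\begin{align*}
S^{(2)} &= (-i)^2 \int H^{int}_\omega(x_1)\,
           \theta\bigl(\omega\cdot(x_1 - x_2)\bigr)\,
           H^{int}_\omega(x_2)\, d^4x_1\, d^4x_2 \\
        &= (-i)^2 \int \frac{d\tau}{2\pi i\,(\tau - i\epsilon)}
           \left[\int H^{int}_\omega(x_1)\, e^{\,i\tau\,\omega\cdot x_1}\, d^4x_1\right]
           \left[\int H^{int}_\omega(x_2)\, e^{-i\tau\,\omega\cdot x_2}\, d^4x_2\right] \\
        &= (-i)^2 \int \tilde H_\omega(-\omega\tau)\,
           \frac{d\tau}{2\pi i\,(\tau - i\epsilon)}\,
           \tilde H_\omega(\omega\tau).
\end{align*}
```

Each theta-function thus brings one auxiliary integration over tau, and each Hamiltonian receives the difference of the adjacent four-vectors omega tau as the argument of its Fourier transform.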





3.1 Spin 0 system 

Below we still restrict ourselves to the example of the simple interaction Hamiltonian of the form
$H = -g\varphi^3(x)$. The covariant light-front graph technique arises when, as usual, one represents
the expression (18) in normal form.

The four-vectors $\omega\tau_j$ in (18) are associated with a fictitious particle, called the
spurion, and the factors $1/(\tau_j - i\epsilon)$ are interpreted as the propagators of the spurion,
responsible for taking the intermediate states off the energy shell. The spurion should be
interpreted as a convenient tool for taking into account off-energy-shell effects in the covariant
formulation of LFD (in the absence of off-mass-shell effects), and not as a physical particle. It is
absent, by definition, from all asymptotic, on-energy-shell states. We shall show below with simple
examples how the spurion should be used in practical calculations.

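The general invariant amplitude $M_{nm}$ of a transition $m \to n$ is related to the S-matrix by:

$$S_{nm} = \delta_{nm} + i(2\pi)^4\delta^{(4)}\Big(\sum_{i=1}^{n}k'_i - \sum_{i=1}^{m}k_i\Big)\frac{M_{nm}}{\left[(2\pi)^32\varepsilon_{k_1}\ldots(2\pi)^32\varepsilon_{k_m}\,(2\pi)^32\varepsilon_{k'_1}\ldots(2\pi)^32\varepsilon_{k'_n}\right]^{1/2}}, \tag{19}$$

where, e.g., $\varepsilon_{k_1} = \sqrt{m_1^2 + \vec{k}_1^2}$. The cross-section of the process
$1 + 2 \to 3 + \ldots + n$ is thus expressed as:

$$d\sigma = \frac{(2\pi)^4}{4j\varepsilon_{k_1}\varepsilon_{k_2}}|M|^2\frac{d^3k_3}{(2\pi)^32\varepsilon_{k_3}}\ldots\frac{d^3k_n}{(2\pi)^32\varepsilon_{k_n}}\delta^{(4)}(k_1 + k_2 - k_3 - \ldots - k_n), \tag{20}$$

where $j$ is the flux density of the incident particles:

$$j\varepsilon_{k_1}\varepsilon_{k_2} = \frac{1}{2}[s - (m_1 + m_2)^2]^{1/2}[s - (m_1 - m_2)^2]^{1/2}, \quad s = (k_1 + k_2)^2.$$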



Fig. 1. The vacuum vertices. 


To find the matrix element M of order n one must proceed as follows [20, 21, 22, 18, 23]:

1. Arbitrarily label the vertices in the Feynman graph of order n by numbers. Orient the continuous
lines (the lines of physical particles) in the direction from the smaller to the larger
number. Initial particles are oriented as incoming into a graph, and final particles as outgoing.
Connect the vertices by directed dashed lines (the spurion lines) in the order of decreasing
numbers. Diagrams in which there are vertices with all incoming or all outgoing particle lines
(vacuum vertices, as indicated in fig. 1) can be omitted. Associate with each continuous line
a corresponding four-momentum, and with each j-th spurion line a four-momentum $\omega\tau_j$.

2. To each internal continuous line with four-momentum $k$, associate the propagator
$\theta(\omega\cdot k)\delta(k^2 - m^2)$, and to each internal dashed line with four-momentum
$\omega\tau_j$ the factor $1/(\tau_j - i\epsilon)$.

3. Associate with each vertex the coupling constant g. All the four-momenta at the vertex,
including the spurion momenta, satisfy the conservation law, i.e., the sum of incoming momenta
is equal to the sum of outgoing momenta.





4. Integrate (with $d^4k/(2\pi)^3$) over those four-momenta of the internal particles which remain
unfixed after taking into account the conservation laws, and over all $\tau_j$ for the spurion lines
from $-\infty$ to $\infty$.

5. Repeat the procedure described in 1-4 for all $n!$ possible numberings of the vertices.

We omit here the factorial factors that arise from the identity of the particles and depend on 
the particular theory. 
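
The combinatorial part of rules 1 and 5 (numbering the vertices, orienting the lines and discarding graphs with vacuum vertices) can be sketched in Python as follows. This is a hypothetical illustration, not the author's procedure: the graph is described only by the numbers of incoming and outgoing external lines at each vertex and by the list of internal lines, and spurion lines, propagators and integrations are not modelled.

```python
from itertools import permutations


def surviving_orderings(vertices, internal_lines):
    """Vertex orderings that contain no vacuum vertex.

    vertices: dict mapping a vertex name to a pair
        (number of incoming external lines, number of outgoing external lines).
    internal_lines: list of pairs of vertex names joined by an internal line.
    """
    names = list(vertices)
    kept = []
    for order in permutations(names):
        number = {name: i for i, name in enumerate(order)}
        incoming = {n: vertices[n][0] for n in names}
        outgoing = {n: vertices[n][1] for n in names}
        for a, b in internal_lines:
            # Orient the line from the smaller to the larger vertex number.
            first, second = (a, b) if number[b] > number[a] else (b, a)
            outgoing[first] += 1
            incoming[second] += 1
        # A vacuum vertex has only incoming or only outgoing particle lines.
        if all(incoming[n] > 0 and outgoing[n] > 0 for n in names):
            kept.append(order)
    return kept


# s-channel exchange: both initial particles enter vertex A, both final leave B.
S_CHANNEL = surviving_orderings({"A": (2, 0), "B": (0, 2)}, [("A", "B")])
# t-channel exchange: each vertex has one incoming and one outgoing particle.
T_CHANNEL = surviving_orderings({"A": (1, 1), "B": (1, 1)}, [("A", "B")])
print(S_CHANNEL)
print(T_CHANNEL)
```

For the s-channel graph only one of the two numberings survives, in line with the disappearance of the second term of (12), whereas for the t-channel exchange both time orderings contribute.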

The important property of LFD, the disappearance of the vacuum fluctuations, is just the
disappearance of the vacuum vertices indicated in fig. 1. In this formalism they disappear for a
trivial reason: it is impossible to satisfy the four-momentum conservation law for them. Indeed,
the conservation law for the vertex of fig. 1 has the form $k_1 + k_2 + k_3 = \omega(\tau_1 - \tau_2)$.
The four-momenta are on the mass shell, $k_{1-3}^2 = m^2 > 0$, so that the square of the left-hand
side is always strictly positive, $(k_1 + k_2 + k_3)^2 \ge 9m^2$, whereas the square of the
right-hand side is zero since $\omega^2 = 0$. However, it will
be seen that the vacuum contributions that vanish in the light-front approach leave their track in a
different way, making for the fields with spin the light-front interaction $H^{int}_\omega(x)$ in
eq. (14) different from the usual interaction $H^{int}(x)$ in (9).
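
The kinematic argument can be illustrated by a small Monte Carlo check in Python (an added sketch, not from the original). It assumes three particles of equal mass m with positive energies and uses the standard bound that the invariant mass squared of such a system is at least (3m)^2, while the spurion four-vector squared vanishes. The function names, sampling range and seed are arbitrary.

```python
import math
import random


def on_shell(mass, k3):
    """Four-momentum (E, kx, ky, kz) of a particle on its mass shell."""
    return (math.sqrt(mass**2 + sum(c * c for c in k3)), *k3)


def minkowski_square(v):
    """Invariant square of a four-vector with signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2


def min_three_particle_mass_squared(mass, trials=10000, seed=0):
    """Smallest sampled value of (k1 + k2 + k3)^2 for on-shell momenta."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(trials):
        ks = [on_shell(mass, [rng.uniform(-5.0, 5.0) for _ in range(3)])
              for _ in range(3)]
        total = [sum(c) for c in zip(*ks)]
        best = min(best, minkowski_square(total))
    return best


MASS = 1.0
SMALLEST = min_three_particle_mass_squared(MASS)
OMEGA = (1.0, 0.0, 0.0, -1.0)
print(SMALLEST, 9.0 * MASS**2)       # left-hand side squared stays above 9 m^2
print(minkowski_square(OMEGA))       # right-hand side squared is zero
```

Whatever momenta are sampled, the squared sum of the three on-shell momenta stays well above zero, so it can never be matched by a multiple of the light-like vector omega.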

The case of particles with non-zero spins is considered in [11]. In this case, the vacuum
fluctuations disappear too, but some additional (contact) vertices appear, due to the difference
between $H^{int}$ and $H^{int}_\omega$. They are also taken into account by the rules of the graph
technique.

We emphasize that despite the presence of the four-vector $\omega$ in eq. (18), the amplitudes
calculated in this way are explicitly covariant. We just obtain the theory with a separation of the
kinematical dependence of the amplitudes on the reference system from the dynamical, but covariant,
dependence on the light-front orientation. The full S-matrix and any physical amplitudes do not
depend on $\omega$, since eq. (18) gives the same S-matrix as the initial one, given by eq. (10).
However, off-shell amplitudes depend on $\omega$, and off-shell light-front amplitudes do not
coincide with the Feynman ones. We will see below that the wave functions also depend on $\omega$.

The light-front diagrams can be interpreted as time-ordered graphs. As soon as the vertices
are labelled by numbers, any deformation of a diagram changing the relative position of the vertex
projections on the "time direction" from left to right does not change the topology of the diagram
and the corresponding amplitude. Therefore it is often convenient to deform the diagram so
that the vertices with successively increasing numbers are arranged from left to right. This just
corresponds to time-ordered graphs. In addition, this graph technique is three-dimensional,
i.e., the four-momenta of the particles, even in the intermediate states, are always on the mass
shells, and all the integrations over the internal momenta are three-dimensional.

The light-front amplitudes can also be obtained from the graph technique [20, 21, 22] with
$\lambda = (\lambda_0, \vec{\lambda})$, $\lambda^2 = 1$, as follows. One should replace
$\lambda \to \lambda/\delta$ with $\lambda^2 = \delta^2$ and take the limit $\delta \to 0$. This
just corresponds to the infinite momentum frame limit of the old-fashioned perturbation theory.
The light-front amplitudes can also be obtained by direct transformation of a given Feynman
amplitude [24, 25].

By a replacement of variables [11] the covariant light-front amplitudes can be transformed to the
form of the ordinary light-front diagrams corresponding to $\omega = (1, 0, 0, -1)$, given by the
Weinberg rules [26].

3.2 Why time-ordered graphs? 

Deriving both the Feynman graph technique and the time-ordered one, we proceed from one and
the same expression (9) for the S-matrix and therefore we obtain the same amplitude in a given
order of the perturbation theory. The important difference between the two approaches appears in
describing the bound states and, in general, the state vector. In the Feynman approach the bound
states are described by the Bethe-Salpeter functions [27], which are defined as:

$$\Phi(x_1, x_2, p) = \langle 0|T(\varphi(x_1)\varphi(x_2))|p\rangle. \tag{21}$$

Here $\varphi(x)$ is the Heisenberg operator. The Bethe-Salpeter function depends on two
four-vectors $x_1, x_2$; they include two times $t_1, t_2$. In the momentum space the Bethe-Salpeter
function looks as $\Phi = \Phi(l_1, l_2, p)$. Its arguments $l_{1,2}$ are off the mass shell:
$l_{1,2}^2 \ne m^2$. Though it satisfies the normalization condition, allowing one to find the
normalization coefficient, the Bethe-Salpeter function has no probabilistic interpretation (for a
review see [28]).

The time-ordered approach describes the bound states by means of the Fock components. It
allows one to express the amplitudes in terms of the Fock components of the state vector. The
latter are the direct relativistic generalization of the non-relativistic wave functions. They
depend on the on-mass-shell four-vectors and have the same probabilistic interpretation as the
non-relativistic wave functions. The kernel of the equation for the wave function can be calculated
by the rules of the graph technique. The time-ordered graphs also give the space-time picture of
the process.

The transparent physical interpretation, the clear nonrelativistic limit and the comparatively
simple three-dimensional calculating formalism are the advantages of this approach.

The relation between the light-front wave function and the Bethe-Salpeter amplitude is given 
below in sect. 4.5. 

3.3 Simple examples 

3.3.1 Exchange in t-channel 



Fig.2. Exchange of a particle in the t-channel.

Consider the two time-ordered diagrams shown in fig. 2. They correspond to the exchange of a
scalar particle of mass $\mu$ between two scalar particles in the t-channel. These diagrams
determine, in the ladder approximation, the kernel of the equation for the calculation of the
light-front wave function. The external spurion lines indicate that the amplitude is off-energy
shell. The term "off-energy shell" is borrowed from the old-fashioned perturbation theory, where it
means that for an amplitude which is an internal part of a bigger diagram, there is no conservation
law for the energies of the incoming and outgoing particles (like in the intermediate states in the
amplitudes (12)). For the light-front amplitudes shown in fig. 2, for $\omega = (1,0,0,-1)$, there
is no conservation law for the minus-components of the particle momenta, i.e., for the
"light-front" energies. This momentum nonconservation is just taken into account by the spurion.

According to the light-front graph technique for spinless particles, the amplitude has the form: 

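$$\begin{aligned}
\mathcal{K} &= g^2\int \theta\big(\omega\cdot(k_1-k_1')\big)\,\delta\big((k_1-k_1'+\omega\tau_1-\omega\tau)^2-\mu^2\big)\,\frac{d\tau_1}{\tau_1-i\epsilon} \\
&\quad + g^2\int \theta\big(\omega\cdot(k_1'-k_1)\big)\,\delta\big((k_1'-k_1+\omega\tau_1-\omega\tau')^2-\mu^2\big)\,\frac{d\tau_1}{\tau_1-i\epsilon} \\
&= \frac{g^2\,\theta\big(\omega\cdot(k_1-k_1')\big)}{\mu^2-(k_1-k_1')^2+2\tau\,\omega\cdot(k_1-k_1')-i\epsilon}
 + \frac{g^2\,\theta\big(\omega\cdot(k_1'-k_1)\big)}{\mu^2-(k_1'-k_1)^2+2\tau'\,\omega\cdot(k_1'-k_1)-i\epsilon}\,.
\end{aligned} \tag{22}$$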





The two items in (22) correspond to the two diagrams of fig. 2. They cannot be non-zero
simultaneously. On the energy shell, i.e. for both $\tau = \tau' = 0$, the expression for the
kernel is identical to the Feynman amplitude:

$$\mathcal{K}(\tau=\tau'=0) = \frac{g^2}{\mu^2-(k_1-k_1')^2-i\epsilon}\,. \tag{23}$$


Note that the off-shell amplitude (22) depends on $\omega$.

On the energy shell, corresponding to $\tau = \tau' = 0$, the dependence of the amplitude on
$\omega$ disappears. In more complicated cases, when a Feynman diagram corresponds to the sum of a
few light-front diagrams (like in the case of the box diagrams considered in sect. 6 below), the
amplitude for a particular light-front diagram may depend on $\omega$ even on the energy shell.
This dependence disappears in the sum of all amplitudes in a given order. In this case the
singularities of different amplitudes, related to their dependence on $\omega$, cancel each other
in the sum.

The dependence of the perturbative amplitude (22) on the light-front orientation (calculated
exactly in the order $g^2$) indicates that the light-front wave function, being an off-shell object
too, also depends inevitably on the light-front orientation (see sect. 4 below).
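The on-shell and off-shell behaviour of the exchange kernel can be checked numerically. The sketch below assumes that eq. (22), after the $\tau_1$ integration, reduces to two terms selected by the sign of $\omega\cdot(k_1-k_1')$, with denominators $\mu^2-(k_1-k_1')^2+2\tau\,\omega\cdot(k_1-k_1')$ and $\mu^2-(k_1'-k_1)^2+2\tau'\,\omega\cdot(k_1'-k_1)$. The $i\epsilon$ prescription is dropped, the function names are illustrative, and the point $\omega\cdot(k_1-k_1')=0$ is assigned to the first term by convention.

```python
import math

def mdot(a, b):
    """Minkowski scalar product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def on_shell(px, py, pz, m):
    """Four-momentum of a particle of mass m with three-momentum (px, py, pz)."""
    return (math.sqrt(m*m + px*px + py*py + pz*pz), px, py, pz)

def feynman_amplitude(k1, k1p, g, mu):
    """Scalar exchange in the t-channel, eq. (23) without i*epsilon."""
    q = tuple(a - b for a, b in zip(k1, k1p))
    return g*g / (mu*mu - mdot(q, q))

def light_front_kernel(k1, k1p, omega, tau, tau_p, g, mu):
    """Assumed closed form of eq. (22): only one time ordering contributes."""
    q = tuple(a - b for a, b in zip(k1, k1p))   # k1 - k1'
    wq = mdot(omega, q)                         # omega.(k1 - k1')
    q2 = mdot(q, q)
    if wq >= 0.0:
        # first diagram: theta(omega.(k1 - k1'))
        return g*g / (mu*mu - q2 + 2.0*tau*wq)
    # second diagram: theta(omega.(k1' - k1)); note omega.(k1' - k1) = -wq
    return g*g / (mu*mu - q2 - 2.0*tau_p*wq)

if __name__ == "__main__":
    k1 = on_shell(0.3, 0.1, 0.5, 1.0)
    k1p = on_shell(-0.2, 0.4, 0.1, 1.0)
    for omega in [(1.0, 0.0, 0.0, -1.0), (1.0, -1.0, 0.0, 0.0)]:
        on = light_front_kernel(k1, k1p, omega, 0.0, 0.0, 1.0, 0.5)
        off = light_front_kernel(k1, k1p, omega, 0.3, 0.3, 1.0, 0.5)
        print(omega, on, off)
```

For $\tau=\tau'=0$ the value is the same for both orientations of $\omega$ and equals the Feynman amplitude, while the off-shell values differ between orientations.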


3.3.2 Self-energy contributions 



Fig.3. Self-energy loop.


Another simple example is the self-energy diagram shown in fig. 3 . The corresponding ampli- 
tude (equal to the self-energy up to a factor) has the form: 

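$$\frac{g^2}{(2\pi)^3}\int \theta(\omega\cdot k)\,\delta(k^2-m^2)\,\theta\big(\omega\cdot(p'+\omega\tau_1-k)\big)\,\delta\big((p'+\omega\tau_1-k)^2-m^2\big)\,d^4k\,\frac{d\tau_1}{\tau_1-i\epsilon}\,, \tag{24}$$

with $p' = p-\omega\tau$.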

Let $q = p'+\omega\tau_1$. The integral over $d^4k$ is thus reduced to the well known calculation of
the imaginary part of the Feynman amplitude, when all the propagators are replaced by the
delta-functions:

$$\int \delta(k^2-m^2)\,\delta\big((q-k)^2-m^2\big)\,d^4k = \frac{\pi}{2\sqrt{q^2}}\sqrt{q^2-4m^2}\,.$$

Inserted in (24), it gives:

$$\frac{g^2}{16\pi^2}\int_{4m^2-p'^2}^{\infty}\sqrt{\frac{p'^2-4m^2+\tau_1}{p'^2+\tau_1}}\;\frac{d\tau_1}{\tau_1-i\epsilon}\,, \tag{25}$$

where $\tau_1$ has been rescaled by the factor $2\,\omega\cdot p'$, so that $q^2 = p'^2+\tau_1$.
The logarithmic divergence is at the upper limit of the integration over $\tau_1$. One can introduce
the invariant cutoff in terms of $\tau_1$. In this way, after renormalization, the standard
expression for the self-energy amplitude is obtained.

This is a particular manifestation of a general property of the light-front amplitudes. A
peculiarity of the covariant light-front amplitudes is that they have no ultraviolet divergences
for finite values of all the spurion four-momenta. All the ultraviolet divergences in all the
light-front diagrams appear after integrations over $\tau_j$ in infinite limits [20]. Indeed, the
energy-momentum conservation (including the spurion four-momentum) is valid in any vertex. Since
all the four-momenta are on the corresponding mass shells, we have at each vertex a real physical
process as far as the kinematics is concerned. For finite initial particle energies and for finite
incoming spurion energy, the energies of the particles in the intermediate states are thus also
finite. Hence, the integrations over the particle momenta for fixed spurion momenta are constrained
by a kinematically allowed finite domain. It is the same reason that provides a finite imaginary
part of a Feynman diagram found by replacing the Feynman propagators $1/(k^2-m^2+i\epsilon)$ by the
delta-functions $-i\pi\delta(k^2-m^2)$. In both cases the internal particle lines are associated
with the delta-functions.


The only source of the ultraviolet divergences in the light-front amplitudes is the infinite
intermediate spurion energies, i.e., infinite $\tau_j$. This is the reason why divergences may
appear at the upper limit of integration over $\tau_j$. Since the $\tau_j$ are scalar quantities,
one can introduce an invariant cutoff in terms of these variables. This way of regularizing the
divergent diagrams is another advantage of the covariant formulation of LFD.
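The way a divergence arises only from the $\tau$ integration can be illustrated with the self-energy loop. The sketch below is an illustration under stated assumptions, not the author's calculation: it takes the standard two-body phase-space factor $\frac{\pi}{2\sqrt{q^2}}\sqrt{q^2-4m^2}$, which is finite for every finite $q^2$, sets $q^2 = p'^2+\tau_1$ (with $\tau_1$ rescaled to have the dimension of mass squared), and integrates it with the measure $d\tau_1/\tau_1$ up to a cutoff. The function names and the choice $p'^2 < 4m^2$ (so that $\tau_1 > 0$ over the whole range) are assumptions of the example.

```python
import math

def two_body_phase_space(q2, m):
    """pi/(2 sqrt(q2)) * sqrt(q2 - 4 m^2) above threshold, zero below it."""
    if q2 <= 4.0*m*m:
        return 0.0
    return math.pi/(2.0*math.sqrt(q2))*math.sqrt(q2 - 4.0*m*m)

def cutoff_integral(p2, m, cutoff, steps=20000):
    """Integrate phase_space(p2 + t) dt/t from threshold to `cutoff`.

    Midpoint rule in the variable log(t), so that dt/t = d(log t).
    Assumes p2 < 4 m^2, i.e. the threshold t_min = 4 m^2 - p2 is positive.
    """
    t_min = 4.0*m*m - p2
    a, b = math.log(t_min), math.log(cutoff)
    h = (b - a)/steps
    total = 0.0
    for i in range(steps):
        t = math.exp(a + (i + 0.5)*h)
        total += two_body_phase_space(p2 + t, m)*h
    return total

if __name__ == "__main__":
    for cutoff in (1e3, 1e6, 1e9):
        print(cutoff, cutoff_integral(1.0, 1.0, cutoff))
```

Each factor of $10^3$ in the cutoff adds approximately $(\pi/2)\ln 10^3$ to the integral, which is the logarithmic growth referred to in the text; the integrand itself stays bounded by $\pi/2$.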

For the massless particles, the light-front amplitudes may have infrared divergences, like in the 
case of the Feynman diagrams. 

Another peculiarity of LFD is the appearance of “zero modes”. For constituents of zero mass,
for instance, the state vector may contain components with $\omega\cdot k = 0$ for non-zero
four-momentum $k$. In the standard approach, this corresponds to the finite light-front energy
$k_- = \vec k_\perp^2/k_+$ for both $k_+$ and $\vec k_\perp^2 = 0$. Zero modes can also appear in
theories with spontaneously broken symmetry. They ensure the equivalence between LFD and the
instant form of quantization, in which nontrivial vacuum structures (condensates) appear
[10, 29, 30, 31].

The detailed discussion of these important problems is beyond the scope of the present paper. 


4 Light-front wave function 

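As already mentioned, the wave functions are the Fock components of the state vector defined on
the light-front plane $\omega\cdot x = 0$. This means that they are coefficients in an expansion of
the state vector $|p\rangle$ with respect to the basis of free fields:

$$\begin{aligned}
|p\rangle_\omega \equiv \phi_\omega(p) &= (2\pi)^{3/2}\int \psi_2(k_1,k_2,p,\omega\tau)\,a^\dagger(\vec k_1)\,a^\dagger(\vec k_2)|0\rangle \\
&\quad\times\,\delta^{(4)}(k_1+k_2-p-\omega\tau)\,2(\omega\cdot p)\,d\tau\,\frac{d^3k_1}{(2\pi)^{3/2}\sqrt{2\varepsilon_{k_1}}}\,\frac{d^3k_2}{(2\pi)^{3/2}\sqrt{2\varepsilon_{k_2}}} + \cdots
\end{aligned} \tag{26}$$

The dots $\cdots$ include the higher Fock states. For simplicity, we omit the spin indices.

We emphasize in (26) the presence of the delta-function $\delta^{(4)}(k_1+k_2-p-\omega\tau)$. This
gives the conservation law:

$$k_1+k_2 = p+\omega\tau\,. \tag{27}$$

In the particular case where $\omega = (1,0,0,-1)$, the delta-function
$\delta^{(4)}(k_1+k_2-p-\omega\tau)$ gives the standard conservation laws for the
$(\perp,+)$-components of the momenta, but does not constrain the minus-components.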

From (26) one can see that the wave function depends on $\omega\tau$, i.e., on the orientation of
the light front. This important property of any Fock component is very natural. As explained above,
any off-energy shell amplitude depends on the light-front orientation (see eq.(22)). The bound
state wave function is always an off-shell object ($\tau \neq 0$). Therefore it also depends on the
orientation of the light-front plane. This property is not a peculiarity of the covariant approach.
At the same time, the description of the off-energy shell effects in terms of the external spurion
lines allows one to parametrize this dependence explicitly.





4.1 The relativistic relative momentum 

We will mainly concentrate on the two-body wave function. Generalization to the n-body case is 
straightforward and is given in [11]. 



Fig.4. Graphical representation of the two-body wave function on the light front. The broken line 
corresponds to the spurion (see text). 


Due to the conservation law (27), the light-front wave function can be shown graphically like a 
two-body scattering amplitude as indicated in fig. 4. The broken line corresponds to the fictitious 
spurion. 

Due to this analogy, the decomposition of the wave function in independent spin structures and
their parametrization is analogous to the expansion of a two-body amplitude in terms of invariant
amplitudes. We will use this analogy below. We emphasize again that although we assign a
momentum $\omega\tau$ to the spurion, there is no fictitious particle in the physical state vector.
The basis in eq. (26) contains the particle states only.

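The relativistic relative momentum $\vec k$ has the same sense as the nonrelativistic one: it is
the momentum of one of the particles in the c.m.-system where $\vec k_1+\vec k_2 = 0$. Note that
due to the conservation law (27), the total momentum $\vec p$ of the system in this reference frame
is not zero. This definition of the relative momentum does not assume, however, that we restrict
ourselves to this particular reference frame. In an arbitrary system of reference the relative
momentum is constructed by the Lorentz transformation to the system moving with velocity
$\vec v = \vec{\mathcal P}/\mathcal P_0$, where $\mathcal P = k_1+k_2 = p+\omega\tau$. We get:

$$\vec k = L^{-1}(\mathcal P)\,\vec k_1 = \vec k_1-\frac{\vec{\mathcal P}}{\mathcal M}\left[k_{10}-\frac{\vec k_1\cdot\vec{\mathcal P}}{\mathcal M+\mathcal P_0}\right], \tag{28}$$

where $L^{-1}(\mathcal P)$ is the Lorentz boost and $\mathcal M = \sqrt{\mathcal P^2}$. Similarly we
define the unit vector $\vec n$ in the direction of $\vec\omega$ in this system:

$$\vec n = L^{-1}(\mathcal P)\,\vec\omega/|L^{-1}(\mathcal P)\,\vec\omega| = \mathcal M\,L^{-1}(\mathcal P)\,\vec\omega/\omega\cdot p\,. \tag{29}$$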

From these definitions, it follows that under a rotation and a Lorentz transformation $g$ of the
four-vectors from which $\vec k$ and $\vec n$ are formed, the vectors $\vec k$ and $\vec n$ undergo
only rotations:

$$\vec k' = R(g,\mathcal P)\,\vec k\,,\qquad \vec n' = R(g,\mathcal P)\,\vec n\,,$$

where $R$ is the rotation operator:

$$R(g,\mathcal P) = L^{-1}(g\mathcal P)\,g\,L(\mathcal P)\,. \tag{30}$$
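The construction of $\vec k$ and $\vec n$ can be sketched numerically. The example assumes that eq. (28) is the standard boost of $k_1$ to the rest frame of $\mathcal P = k_1+k_2$, that eq. (29) normalizes the boosted spatial part of $\omega$ to unit length, and that the two particles have equal mass. The function names are illustrative only.

```python
import math

def mdot(a, b):
    """Minkowski scalar product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def boost_to_rest_frame(v, P):
    """Spatial part of the four-vector v in the rest frame of the time-like P.

    Implements v_vec - P_vec/M * (v_0 - v_vec.P_vec/(M + P_0)), M = sqrt(P^2).
    """
    M = math.sqrt(mdot(P, P))
    vP = v[1]*P[1] + v[2]*P[2] + v[3]*P[3]
    factor = (v[0] - vP/(M + P[0]))/M
    return tuple(v[i] - P[i]*factor for i in (1, 2, 3))

def relative_variables(k1, k2, omega):
    """Return (k_vec, n_vec): relative momentum and light-front direction."""
    P = tuple(a + b for a, b in zip(k1, k2))
    k = boost_to_rest_frame(k1, P)
    w = boost_to_rest_frame(omega, P)
    norm = math.sqrt(sum(c*c for c in w))
    n = tuple(c/norm for c in w)
    return k, n

if __name__ == "__main__":
    m = 1.0
    k1 = (math.sqrt(m*m + 0.3**2 + 0.2**2 + 0.7**2), 0.3, -0.2, 0.7)
    k2 = (math.sqrt(m*m + 0.1**2 + 0.4**2 + 0.2**2), -0.1, 0.4, 0.2)
    omega = (1.0, 0.0, 0.0, -1.0)
    k, n = relative_variables(k1, k2, omega)
    print("k =", k)
    print("n =", n)
```

In the boosted frame the two spatial momenta add up to zero, and for equal masses $\varepsilon_k = \sqrt{\vec k^2+m^2}$ equals $\sqrt{s}/2$ with $s = \mathcal P^2$. The ratio $\omega\cdot k_1/\omega\cdot\mathcal P$ computed from the original four-vectors coincides with $\frac12(1-\vec n\cdot\vec k/\varepsilon_k)$.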

Therefore $\vec k^2$ and $\vec n\cdot\vec k$ are the rotation and Lorentz invariants. For the wave
function with zero angular momentum we thus obtain [18]:

$$\psi = \psi(\vec k,\vec n) \equiv \psi(\vec k^2,\vec n\cdot\vec k)\,. \tag{31}$$

It is seen from (31) that the relativistic light-front wave function depends not only on the
relative momentum $\vec k$ but also on another variable, the unit vector $\vec n$.

In the case of the states with non-zero angular momentum, the angular momentum is con- 
structed by means of the spherical functions depending on the arguments k and n. 

We introduce another set of variables in which the wave function can be parametrized, in 
analogy to the equal-time wave function in the infinite momentum frame. We define the variables: 


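$$x = \omega\cdot k_1/\omega\cdot p\,,\qquad R_1 = k_1-xp\,, \tag{32}$$

and represent the spatial part of the four-vector $R_1 = (R_0,\vec R)$ as
$\vec R = \vec R_\parallel+\vec R_\perp$, where $\vec R_\parallel$ is parallel to $\vec\omega$ and
$\vec R_\perp$ is orthogonal to $\vec\omega$. Since
$R\cdot\omega = R_0\omega_0-\vec R_\parallel\cdot\vec\omega = 0$ by definition of $R$, it follows
that $R_0 = |\vec R_\parallel|$ and, hence, $\vec R_\perp^2 = -R^2$ is invariant. Therefore,
$\vec R_\perp^2$ and $x$ can be chosen as the two scalar arguments of the wave function:

$$\psi = \psi(\vec R_\perp^2,x)\,. \tag{33}$$

Using the definitions of the variables $\vec R_\perp$ and $x$, we can readily relate them to
$\vec k^2$ and $\vec n\cdot\vec k$:

$$\vec R_\perp^2 = \vec k^2-(\vec n\cdot\vec k)^2\,,\qquad x = \frac12\left(1-\frac{\vec n\cdot\vec k}{\varepsilon_k}\right). \tag{34}$$

The inverse relations are

$$\vec k^2 = \frac{\vec R_\perp^2+m^2}{4x(1-x)}-m^2\,,\qquad \vec n\cdot\vec k = \left[\frac{\vec R_\perp^2+m^2}{x(1-x)}\right]^{1/2}\left(\frac12-x\right). \tag{35}$$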

The variables introduced above can be easily generalized to the case of different masses and an
arbitrary number of particles [23]. The corresponding variables $\vec q_i$, $\vec n$ are still
constructed according to eqs.(28), (29) and the variables $\vec R_{i\perp}$, $x_i$ according to (32).


4.2 Normalization


The state vector is normalized as:

$${}_\omega\langle p',\lambda'|p,\lambda\rangle_\omega = 2p_0\,\delta^{(3)}(\vec p-\vec p\,')\,\delta^{\lambda'\lambda}\,. \tag{36}$$

The Fock components are normalized so as to provide the condition (36). Substituting the state
vector (26) in the left-hand side of eq.(36), we reproduce the right-hand side if

$$\sum_n N_n = 1\,, \tag{37}$$

where $N_n$ is the contribution to the normalization integral from the n-body Fock component.
For the state with zero total angular momentum, the two-body contribution to the normalization
integral reads:

$$N_2 = \frac{1}{(2\pi)^3}\int|\psi(\vec k,\vec n)|^2\,\frac{d^3k}{\varepsilon_k} = \frac{1}{(2\pi)^3}\int|\psi(\vec R_\perp,x)|^2\,\frac{d^2R_\perp\,dx}{2x(1-x)}\,. \tag{38}$$

This normalization integral gives only the contribution of the two-body wave function to the sum
(37). The contribution of other sectors can be taken into account by the integral:

$$\frac{1}{(2\pi)^3}\int\frac{d^3k}{\varepsilon_k}\,\frac{d^3k'}{\varepsilon_{k'}}\,\psi^*(\vec k',\vec n)\left[\varepsilon_k\,\delta^{(3)}(\vec k-\vec k')-\frac{4m^2}{(2\pi)^3}\frac{\partial V(\vec k',\vec k,\vec n,M^2)}{\partial M^2}\right]\psi(\vec k,\vec n) = 1\,, \tag{39}$$

where $V(\vec k',\vec k,\vec n,M^2)$ is the kernel of the equation for the wave function. The second
term accounts for the many-body contribution to the norm, $\sum_{n>2}N_n$.





Fig.5. Equation for the two-body wave function. 



4.3 Equation for the wave function 

The equation for the wave function is obtained from the equation for the vertex part shown 
graphically in fig. 5. 

It is the analogue, for a bound state, of the Lippmann-Schwinger equation for a scattering state. 
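Let us first explain its derivation for the case of spinless particles. In accordance with the rules
given in sect. 3.1, we associate with the diagram of fig. 5 the following analytical expression:

$$\begin{aligned}
\Gamma(k_1,k_2,p,\omega\tau) &= \int\Gamma(k_1',k_2',p,\omega\tau')\,\theta(\omega\cdot k_1')\,\delta(k_1'^2-m^2)\,\theta(\omega\cdot k_2')\,\delta(k_2'^2-m^2) \\
&\quad\times\,\delta^{(4)}(k_1'+k_2'-p-\omega\tau')\,d^4k_1'\,\mathcal K(k_1',k_2',\omega\tau';k_1,k_2,\omega\tau)\,\frac{d\tau'}{\tau'-i\epsilon}\,\frac{d^4k_2'}{(2\pi)^3}\,.
\end{aligned} \tag{40}$$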


Here $\Gamma$ is the vertex function and the kernel $\mathcal K$ is an irreducible block. The latter
is calculated directly by the graph technique once the underlying dynamics is known. We should then
express the vertex $\Gamma$ through the two-body wave function. This can be done by comparing, for
example, two ways of calculating the amplitude for the breakup of a bound state by some
perturbation: 1) by means of the graph technique (the result contains $\Gamma$); 2) by calculating
the matrix element of the perturbation operator between the bound state and the free states of n
particles (the result contains $\psi$). We thus get:

$$\psi(k_1,k_2,p,\omega\tau) = \frac{\Gamma(k_1,k_2,p,\omega\tau)}{s-M^2}\,, \tag{41}$$

where $s = (k_1+k_2)^2 = (p+\omega\tau)^2$. The corresponding relation for the n-body case has the
same form. In any practical calculation of the amplitude, we associate $\Gamma$ with the vertex
shown in fig. 4 and then express $\Gamma$ in terms of $\psi$ by eq.(41).

In the simple case of scalar particles, the equation for the wave function in terms of the
variables $\vec k$, $\vec n$ has the following form:

$$\left(4(\vec k^2+m^2)-M^2\right)\psi(\vec k,\vec n) = -\frac{m^2}{2\pi^3}\int\psi(\vec k',\vec n)\,V(\vec k',\vec k,\vec n,M^2)\,\frac{d^3k'}{\varepsilon_{k'}}\,. \tag{42}$$

An equation of such a type was also considered in refs. [32, 33, 34, 35, 36, 37].

In the non-relativistic limit, equation (42) turns into the Schrodinger equation in momentum
space, the kernel $V$ being the non-relativistic potential in momentum space, and the wave function
no longer depends on $\vec n$.

We emphasize that the wave function, which is an equal-time wave function on the light front,
turns into the ordinary wave function in the non-relativistic limit where $c \to \infty$. This
reflects the fact that in the non-relativistic limit two simultaneous events in one frame are
simultaneous in all other frames.





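In the variables $\vec R_\perp$ and $x$, eq.(42) can be rewritten in the form:

$$\left(\frac{\vec R_\perp^2+m^2}{x(1-x)}-M^2\right)\psi(\vec R_\perp,x) = -\frac{m^2}{2\pi^3}\int\psi(\vec R_\perp',x')\,V(\vec R_\perp',x';\vec R_\perp,x,M^2)\,\frac{d^2R_\perp'\,dx'}{2x'(1-x')}\,. \tag{43}$$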


In this form, this equation is nothing else than the Weinberg equation [26]. 

The advantages of the equation for the wave function in the form (42) compared with (43) are 
its similarity to the non-relativistic Schrodinger equation in momentum space, and its simplicity 
in the case of particles with spin. These properties make eq.(42) very convenient for practical 
calculations. 

The kernel of eq.(42) depends on the vector variable $\vec n$. We shall see that this dependence,
especially the part which depends on $M^2$, is associated with the retardation of the interaction.
From this point of view, the dependence of the wave function $\psi(\vec k,\vec n)$ on $\vec n$ is a
consequence of retardation.


4.4 The Wick-Cutkosky model


As a simple example, we shall derive in this section the light-front wave function of a system
consisting of two scalar particles with mass $m$ interacting through the exchange of a massless
scalar particle. The kernel is calculated in the ladder approximation. This is the so-called
Wick-Cutkosky model. The diagrams that determine the kernel are shown in fig. 2. The kernel
$\mathcal K$ is given by eq.(22) with $\mu = 0$. Going over from the kernel $\mathcal K$ to
$V = -\mathcal K/(4m^2)$, introducing the constant $\alpha = g^2/(16\pi m^2)$, and expressing (22)
by means of the initial and final relative momenta $\vec k$, $\vec k'$, we obtain [38]:

$$V = -\frac{4\pi\alpha}{\vec K^2}\,, \tag{44}$$

where

$$\vec K^2 = (\vec k'-\vec k)^2-(\vec n\cdot\vec k')(\vec n\cdot\vec k)\frac{(\varepsilon_{k'}-\varepsilon_k)^2}{\varepsilon_{k'}\varepsilon_k}+\left(\varepsilon_{k'}^2+\varepsilon_k^2-\frac12M^2\right)\left|\frac{\vec n\cdot\vec k'}{\varepsilon_{k'}}-\frac{\vec n\cdot\vec k}{\varepsilon_k}\right|. \tag{45}$$


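For $k, k' \ll m$, eq.(44) turns into the Coulomb potential in momentum space

$$V(\vec k',\vec k) \approx -\frac{4\pi\alpha}{(\vec k'-\vec k)^2}\,. \tag{46}$$

For $\alpha \ll 1$, $|\varepsilon_b| = |M-2m| = m\alpha^2/4 \ll m$, the wave function is
concentrated in the non-relativistic region of momenta. The non-relativistic wave function of the
ground state in the Coulomb potential has the form:

$$\psi(\vec k) = \frac{8\sqrt{\pi m}\,\kappa^{5/2}}{(\vec k^2+\kappa^2)^2}\,, \tag{47}$$

where $\kappa = \sqrt{m|\varepsilon_b|} = m\alpha/2$. It is normalized, however, according to (38)
with $\varepsilon_k \approx m$ and $N_2 = 1$. The integral over $d^3k'$ in (42) is concentrated in
the region $k' \approx \kappa$. Therefore, at $k \gg \kappa$ the momentum $\vec k'$ in
$V(\vec k',\vec k,\vec n,M^2)$ can be ignored, and from (42) we find:

$$\psi(\vec k,\vec n) = -\frac{m\,V(0,\vec k,\vec n,M^2)}{(2\pi)^3(\vec k^2+\kappa^2)}\int\psi(\vec k')\,d^3k'\,. \tag{48}$$

Substituting in the r.h.s. of eq.(48) the expressions (44,45) for $V$ and (47) for $\psi$, we obtain

$$\psi(\vec k,\vec n) = \frac{8\sqrt{\pi m}\,\kappa^{5/2}}{(\vec k^2+\kappa^2)^2\left(1+\dfrac{|\vec n\cdot\vec k|}{\varepsilon_k}\right)}\,. \tag{49}$$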


This relativistic wave function of the ground state with zero total angular momentum is a good
approximation of a more exact one in the range $k > \kappa$. Corrections of order
$\alpha\log(\alpha)$ should be considered in the range $k < \kappa$ (see [39]). Though the kernel
(44), (45) contains the modulus
$|\vec n\cdot\vec k'/\varepsilon_{k'}-\vec n\cdot\vec k/\varepsilon_k|$, one can show that the exact
solution of (42) has no “cusp” at $\vec n\cdot\vec k = 0$. This cusp in (49) appears due to our
approximations.
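The normalization of the ground-state wave function and the size of the $\vec n$-dependence can be checked with a short numerical sketch. It assumes that eq. (47) reads $\psi(\vec k) = 8\sqrt{\pi m}\,\kappa^{5/2}/(\vec k^2+\kappa^2)^2$, that eq. (49) is the same expression divided by $1+|\vec n\cdot\vec k|/\varepsilon_k$, and that the two-body norm (38) is evaluated with $\varepsilon_k \approx m$. Function names and the parameter values are illustrative.

```python
import math

def psi_nonrel(k, m, alpha):
    """Assumed form of eq. (47): Coulomb ground state with kappa = m*alpha/2."""
    kappa = 0.5*m*alpha
    return 8.0*math.sqrt(math.pi*m)*kappa**2.5/(k*k + kappa*kappa)**2

def psi_light_front(k, cos_theta, m, alpha):
    """Assumed form of eq. (49); cos_theta is the cosine between k and n."""
    eps = math.sqrt(k*k + m*m)
    return psi_nonrel(k, m, alpha)/(1.0 + abs(cos_theta)*k/eps)

def two_body_norm_nonrel(m, alpha, steps=20000):
    """N_2 of eq. (38) for psi_nonrel with eps_k replaced by m.

    Radial integral via k = kappa*tan(theta), midpoint rule on (0, pi/2).
    """
    kappa = 0.5*m*alpha
    h = 0.5*math.pi/steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5)*h
        k = kappa*math.tan(theta)
        dk = kappa/math.cos(theta)**2*h
        total += 4.0*math.pi*k*k*psi_nonrel(k, m, alpha)**2/m*dk
    return total/(2.0*math.pi)**3

if __name__ == "__main__":
    m, alpha = 1.0, 0.1
    print("N_2 =", two_body_norm_nonrel(m, alpha))
    for k in (0.05, 0.5, 2.0):
        ratio = psi_light_front(k, 1.0, m, alpha)/psi_nonrel(k, m, alpha)
        print("k =", k, " psi(k || n)/psi(k perp n) =", ratio)
```

The norm comes out equal to one within the quadrature error, and the ratio between the wave function at $\vec k\parallel\vec n$ and at $\vec k\perp\vec n$ stays close to one for $k \ll m$ and approaches $1/2$ for $k \gg m$.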

One can check in this simple example that it is the retardation of the interaction that is the
dynamical reason for the dependence of the wave function on the variable $\vec n$. The
non-relativistic Coulomb expression for the kernel (46) does not contain retardation and does not
depend on $\vec n$, while the relativistic kernel (44) contains retardation and depends on
$\vec n$. This leads to the dependence of the wave function on the argument $\vec n$.

The retardation leads to both the $\vec n$-dependence and the presence of the carriers of the
interaction in the intermediate state, which contribute to the many-body sectors. However, these
two effects, being important in full measure in a truly relativistic system, can manifest
themselves in a different way in weakly bound systems. Neglecting the many-body sectors does not
necessarily entail neglecting the $\vec n$-dependence of the wave function at $k \sim m$. It is
necessary to take into account the $\vec n$-dependence of the wave function even when one restricts
oneself to the two-nucleon sector.

We emphasize that the dependence of the wave function (49) on $\vec n$ does not mean any violation
of the rotational invariance. As explained above, it reflects the dependence (an unavoidable one,
in the field-theoretical framework) of any off-energy shell amplitude on the orientation of the
light-front plane. At the same time, the on-shell amplitude expressed through the wave function
should not depend on $\vec n$. For the case of the electromagnetic form factor this property is
discussed below in sect. 5.

The wave function of the 2p state can be found analogously. In the system where
$\vec k_1+\vec k_2 = 0$ it has the form [38]:






V6 


x^kYix{k/k) + Y,x{ii) 




(50) 


The wave function corresponding to the angular momentum $l = 1$ contains the spherical function
$Y_{1\lambda}(\vec n)$. This is an illustration of the fact that the vector $\vec n$ participates
in the construction of the total angular momentum on the same ground as the relative momentum
$\vec k$. The dynamical difference between the solutions with $\vec k\parallel\vec n$ and
$\vec k\perp\vec n$ is obviously related to the property that some of the components of the angular
momentum $\vec J$, before using the angular condition, depend on the interaction.


4.5 Relation with the Bethe-Salpeter function 

It is instructive to compare the solution (49) with the one found using the Bethe-Salpeter function.

First we find the relation between the light-front wave function and the Bethe-Salpeter function.
We should start from the integral that restricts the variation of the arguments of the
Bethe-Salpeter function to the light-front plane:

$$I = \int d^4x_1\,d^4x_2\,\delta(\omega\cdot x_1)\,\delta(\omega\cdot x_2)\,\Phi(x_1,x_2,p)\exp(ik_1\cdot x_1+ik_2\cdot x_2)\,, \tag{51}$$

where $k_1$, $k_2$ are the on-shell momenta: $k_1^2 = k_2^2 = m^2$, and $\Phi(x_1,x_2,p)$ is the
Bethe-Salpeter function [27], eq.(21). We represent the $\delta$-functions in (51) by the integral
form

$$\delta(\omega\cdot x) = \frac{1}{2\pi}\int\exp(-i\omega\cdot x\,\alpha)\,d\alpha\,,$$

introduce the Fourier transform of the Bethe-Salpeter function

$$\Phi(x_1,x_2,p) = (2\pi)^{-3/2}\exp\left[-ip\cdot(x_1+x_2)/2\right]\tilde\Phi(x,p)\,,\qquad x = x_1-x_2\,,$$

$$\Phi(l,p) = \int\tilde\Phi(x,p)\exp(il\cdot x)\,d^4x\,,$$

where $l = (l_1-l_2)/2$, $p = l_1+l_2$, $l_1$ and $l_2$ are off-mass shell four-vectors, and make
the change of variables $\alpha_1+\alpha_2 = \tau$, $(\alpha_2-\alpha_1)/2 = \beta$.

On the other hand, the integral (51) can be expressed in terms of the two-body light-front
wave function. We assume that the light-front plane is the limit of a space-like plane, therefore
the operators $\varphi(x_1)$ and $\varphi(x_2)$ commute, and, hence, the symbol of the T product in
(21) can be omitted. In the considered representation, the Heisenberg operators $\varphi(x)$ in (21)
are identical on the light front $\omega\cdot x = 0$ to the Schrodinger operators (just as in the
ordinary formulation of field theory the Heisenberg and Schrodinger operators are identical for
$t = 0$). The Schrodinger operator $\varphi(x)$ (for the spinless case for simplicity), which for
$\omega\cdot x = 0$ is the free field operator, is given by (8). We represent the state vector
$|p\rangle \equiv \phi(p)$ in (21) in the form of the expansion (26). Since the vacuum state on the
light front is always “bare”, the creation operator applied to the vacuum state $\langle 0|$ gives
zero, and in the operators $\varphi(x)$ only the part containing the annihilation operators
survives. This cuts out the two-body Fock component in the state vector. We thus obtain:

-P-ut) dr . (52) 

Comparing (51) and (52), we find:

$$\psi(k_1,k_2,p,\omega\tau) = \frac{(\omega\cdot k_1)(\omega\cdot k_2)}{\pi(\omega\cdot p)}\int_{-\infty}^{+\infty}\Phi\left(l_1 = k_1-\omega\tau/2+\omega\beta,\; l_2 = k_2-\omega\tau/2-\omega\beta,\; p\right)d\beta\,, \tag{53}$$

where $\Phi(l_1,l_2,p)$ is the Bethe-Salpeter function parametrized in terms of the off-mass shell
momenta $l_1$, $l_2$. The argument $p$ in (53) is related to the on-shell momenta $k_1$, $k_2$ as
$p = k_1+k_2-\omega\tau$, in contrast to the off-mass shell relation $p = l_1+l_2$.

In ordinary LFD, eq.(53) corresponds to the integration over $dk_-$. This equation makes the
link between the Bethe-Salpeter function $\Phi$ and the wave function $\psi$ defined on the light
front specified by $\omega$. It should be noticed, however, that eq.(53) is not necessarily an
exact solution of eq.(42), since, as a rule, different approximations are made for the
Bethe-Salpeter kernel and for the light-front one. In the ladder approximation, for example, the
Bethe-Salpeter amplitude contains the box diagram, including the time-ordered diagram with two
exchanged particles in the intermediate state, as indicated graphically in eq. (76) in sect. 6.
This contribution is absent in the light-front ladder kernel.

Note also the interesting paper [40] (for earlier studies see [41]), where the Markov-Yukawa
transversality principle for the two-body Bethe-Salpeter kernel was formulated on the covariant
light-front plane. It allows one not only to obtain an exact three-dimensional reduction of the
Bethe-Salpeter equation, but also to make the exact reconstruction of the four-dimensional
Bethe-Salpeter equation from the three-dimensional form. The three-dimensional form is convenient
for spectroscopic calculations, while the four-dimensional form facilitates the evaluation of the
loop integrals for the form factors. In particular cases the method gives the same results as
obtained earlier by other descriptions [42, 43]. A three-quark generalization is given in [44].

The quasipotential type equations for the light-front wave function, derived by restricting the
arguments of the Bethe-Salpeter amplitude to the light-front plane $z+t = 0$, and the corresponding
electromagnetic form factors were studied in refs. [45, 46].


4.6 Solution in the Bethe-Salpeter approach 

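The exact expression for the Bethe-Salpeter function in the Wick-Cutkosky model is found in the
form of the integral representation [47, 28] and, for zero angular momentum, reads:

$$\Phi(l,p) = -\frac{i}{\sqrt{4\pi}}\int_{-1}^{+1}\frac{g(z,M)\,dz}{\left(m^2-M^2/4-l^2-z\,p\cdot l-i\epsilon\right)^3}\,. \tag{54}$$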




The spectral function $g(z,M)$ is determined by a differential equation [47, 28] and has no
singularity at $z = 0$. The approximate explicit solution found in [47] for $g(z,M)$ has the form:

$$g(z,M) = 2^6\pi\sqrt{m}\,\kappa^{5/2}(1-|z|)\,. \tag{55}$$

The discontinuity of the spectral function $g(z,M)$ at $z = 0$ is a result of approximation, since
the solution (55) corresponds to an asymptotically small binding energy. Inserting (55) in (54) and
integrating over $z$, one can recover the solution of the Bethe-Salpeter equation:

$$\Phi(l,p) = -ic\left[\left(m^2-M^2/4-l^2-i0\right)\left(m^2-(p/2+l)^2-i0\right)\left(m^2-(p/2-l)^2-i0\right)\right]^{-1}, \tag{56}$$

where $c = 2^5\sqrt{\pi m}\,\kappa^{5/2}$ with $\kappa = \sqrt{m|\varepsilon_b|} = m\alpha/2$.

To find the light-front wave function, one can substitute in eq.(53) the Bethe-Salpeter function
either in the form (54) or in (56). From (54) we find [48]:

$$\psi(\vec k,\vec n) = \frac{g(1-2x,M)}{2^5\sqrt{\pi}\,x(1-x)(\vec k^2+\kappa^2)^2}\,. \tag{57}$$

Substituting (55) in (57), we reproduce the expression (49) for the light-front wave function.
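This last step can be verified numerically. The sketch below assumes the forms $g(z,M) = 2^6\pi\sqrt{m}\,\kappa^{5/2}(1-|z|)$ for eq. (55), $\psi = g(1-2x,M)/[2^5\sqrt{\pi}\,x(1-x)(\vec k^2+\kappa^2)^2]$ for eq. (57) and $\psi = 8\sqrt{\pi m}\,\kappa^{5/2}/[(\vec k^2+\kappa^2)^2(1+|\vec n\cdot\vec k|/\varepsilon_k)]$ for eq. (49), with $x = \frac12(1-\vec n\cdot\vec k/\varepsilon_k)$. The function names are hypothetical.

```python
import math

def spectral_g(z, m, kappa):
    """Assumed form of eq. (55): weak-binding spectral function."""
    return 2.0**6*math.pi*math.sqrt(m)*kappa**2.5*(1.0 - abs(z))

def psi_from_spectral(k, nk, m, kappa):
    """Assumed form of eq. (57): light-front wave function from g(1 - 2x)."""
    eps = math.sqrt(k*k + m*m)
    x = 0.5*(1.0 - nk/eps)
    denom = 2.0**5*math.sqrt(math.pi)*x*(1.0 - x)*(k*k + kappa*kappa)**2
    return spectral_g(1.0 - 2.0*x, m, kappa)/denom

def psi_direct(k, nk, m, kappa):
    """Assumed form of eq. (49): solution of the light-front equation."""
    eps = math.sqrt(k*k + m*m)
    num = 8.0*math.sqrt(math.pi*m)*kappa**2.5
    return num/((k*k + kappa*kappa)**2*(1.0 + abs(nk)/eps))

if __name__ == "__main__":
    m, kappa = 1.0, 0.05
    for k, nk in [(0.5, 0.2), (2.0, -1.5), (0.1, 0.0)]:
        print(k, nk, psi_from_spectral(k, nk, m, kappa), psi_direct(k, nk, m, kappa))
```

The two columns agree because $1-2x = \vec n\cdot\vec k/\varepsilon_k$ and $x(1-x) = \frac14\big(1-(\vec n\cdot\vec k)^2/\varepsilon_k^2\big)$, so the factor $1-|z|$ in the numerator cancels against $1-z^2$ in the denominator and leaves $1/(1+|z|)$.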


4.7 Including spin 

As explained in sect. 2.3, in the standard version of LFD the generators of the Poincare group
corresponding to the Lorentz boosts changing the orientation of the plane $t+z = 0$ are the
dynamical ones and contain the interaction. In the explicitly covariant version of LFD the
dependence of the wave function on the light-front orientation is taken into account by means of
the variable $\omega$. Now, using kinematics (i.e., the transformation properties), we have to
ensure that this wave function corresponds to a definite total angular momentum. In the case of
zero angular momentum the four-vector $\omega$ always enters in the scalar product with the
particle four-momenta. For non-zero spins $\omega$ appears in the spin structures.

We illustrate the construction of the states with spins by two examples.

Consider a system consisting of a quark and an antiquark in the $J^\pi = 0^-$ state ("pion"). The
light-front wave function has the form:

= U(fc2) 

where u and v are the spinors, o) = Ai ^2 are the scalar functions, m is the quark mass. In 
the system of reference where 4 + = 0 this wave function obtains the form: 

, * / ia-[n X fc] \ , ^ 

I Pi -f — — ip2 j wi , (59) 

with the following relations between the invariant functions: 

4 . 771 V . 

Note that there exists a special representation (see [11]) in which the wave function has the form
(59) in an arbitrary system of reference.

From eqs.(58,59) one can see that the spin structure of the wave function indeed contains the
four-vector $\omega$ determining the light-front orientation. Due to that, it is determined by two
invariant functions. Only one of them ($g_1$) survives in the nonrelativistic limit.

Another example is the light-front wave function of a system consisting of two fermions in the
state with total angular momentum equal to 1. This can be two nucleons in the state $J^\pi = 1^+$


1 d) 
A 1 - + A 2 — 
m ij-pj 




(58) 





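(the deuteron) or the quark-antiquark pair in the state $J^\pi = 1^-$ ($\rho$-meson). This wave
function has the form:

$$\Phi^\lambda_{\sigma_2\sigma_1} = \sqrt{m}\,e^\lambda_\mu(p)\,\bar u^{\sigma_2}(k_2)\,\phi_\mu\,U_c\,\bar u^{\sigma_1}(k_1)\,, \tag{60}$$

with

$$\begin{aligned}
\phi_\mu &= \varphi_1\frac{(k_1-k_2)_\mu}{2m^2}+\varphi_2\frac{1}{m}\gamma_\mu+\varphi_3\frac{\omega_\mu}{\omega\cdot p}+\varphi_4\frac{(k_1-k_2)_\mu\,\hat\omega}{2m\,\omega\cdot p} \\
&\quad+\varphi_5\frac{i}{m^2\,\omega\cdot p}\gamma_5\,\epsilon_{\mu\nu\rho\gamma}k_{1\nu}k_{2\rho}\omega_\gamma+\varphi_6\frac{m\,\omega_\mu\,\hat\omega}{(\omega\cdot p)^2}\,.
\end{aligned} \tag{61}$$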


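It is determined by six invariant functions $\varphi_{1-6}$, depending on two scalar variables.
This number is the dimension of the matrix depending on the spin projections of the deuteron and
two nucleons, divided by the factor 2 due to the parity conservation: $N = 3\times2\times2/2 = 6$.

In the system of reference where $\vec k_1+\vec k_2 = 0$ (or in an arbitrary system, but in the
representation described in [11]) this wave function obtains the form:

$$\Psi^\lambda_{\sigma_2\sigma_1}(\vec k,\vec n) = \sqrt{m}\,w^\dagger_{\sigma_2}\,\psi^\lambda(\vec k,\vec n)\,\sigma_y\,w^\dagger_{\sigma_1}\,, \tag{62}$$

with

$$\begin{aligned}
\vec\psi(\vec k,\vec n) &= f_1\frac{1}{\sqrt2}\vec\sigma+f_2\frac12\left(\frac{3\vec k(\vec k\cdot\vec\sigma)}{\vec k^2}-\vec\sigma\right)+f_3\frac12\left(3\vec n(\vec n\cdot\vec\sigma)-\vec\sigma\right) \\
&\quad+f_4\frac{1}{2k}\left(3\vec k(\vec n\cdot\vec\sigma)+3\vec n(\vec k\cdot\vec\sigma)-2(\vec k\cdot\vec n)\vec\sigma\right) \\
&\quad+f_5\sqrt{\frac32}\,\frac{i}{k}[\vec k\times\vec n]+f_6\frac{\sqrt3}{2k}[[\vec k\times\vec n]\times\vec\sigma]\,,
\end{aligned} \tag{63}$$

where $w$ is the two-component nucleon spinor normalized to $w^\dagger w = 1$. The relations between
$\varphi$ and $f$ can be found in [11]. In the relativistic one boson exchange model this wave
function was calculated in [49]. It was found that the function $f_5$, of relativistic origin, is
very important: it dominates at $k > 500$ MeV/c. In the nonrelativistic limit the functions
$f_{3-6}$ become negligible, and only the first two structures survive, corresponding to the usual
S- and D-waves.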

This wave function was used in the paper [50] to calculate the deuteron electromagnetic form
factors. No parameters were fitted. It turned out that the calculated structure function
$A(Q^2)$ and the polarization observable $t_{20}$ coincide with the rather precise experimental
data obtained recently at CEBAF/TJNAF.


4.8 The nucleon wave function 

Many calculations of the nucleon properties (magnetic moments, form factors, etc.) are carried
out in the framework of LFD with the nucleon wave function in the 3q model containing one or
a few spin components. The total number of the spin components in the nucleon wave function
is sixteen [51]. This is related to the fact, known long ago [52], that in a many-body system the
parity conservation does not reduce the number of the spin components. This is so for a
relativistic three-body system and for any n-body system for $n \ge 4$ (both relativistic and
nonrelativistic). Hence, for the relativistic nucleon we get

$$N = (2s_1+1)(2s_2+1)(2s_3+1)(2s_N+1) = 2\times2\times2\times2 = 16.$$

These 16 components form the full basis for the nucleon wave function.

In the nonrelativistic limit the parity conservation reduces this number down to 8. Their
relativistic counterparts were found in [53]. Note, however, that one can also construct another 8
components with the opposite parity.

The difference between the relativistic and nonrelativistic cases is related to the fact that in
the relativistic case one can construct the pseudoscalar:


( 64 ) 





It is not zero, since the bound quarks are off-energy-shell: $k_1+k_2+k_3 = p+\omega\tau \neq p$.
In the ordinary light-front approach this corresponds to the well known conservation law:

$$\vec k_{1\perp}+\vec k_{2\perp}+\vec k_{3\perp} = \vec p_\perp\,,\qquad k_{1+}+k_{2+}+k_{3+} = p_+\,,$$

but $k_{1-}+k_{2-}+k_{3-} \neq p_-$. Therefore, we can take the 8 components with opposite parity,
multiply them by $C_{ps}$ and get another 8 components with the nucleon parity. In this way, we get
16 components of the nucleon wave function. They are given in [51]. Due to the momentum
conservation, the pseudoscalar (64) can be rewritten as:


$C_{ps} \propto e^{\mu\nu\rho\gamma} k_{1\mu} k_{2\nu} k_{3\rho}\, \omega_\gamma .$

It is proportional to $\omega$. So it is precisely the dependence of the relativistic nucleon wave function on the
light-front orientation $\omega$ that causes the appearance of the 8 extra components. In the nonrelativistic case
this dependence disappears, and we are left with 8 components. Formally, this is due to the fact
that $\omega$ enters the momentum conservation law in the combination $\omega\tau$, where $\tau = (s - M^2)/(2\omega\cdot p)$.
This term contains an extra factor $k/m$ and disappears at $k \ll m$. We get the nonrelativistic
conservation law $\vec k_1 + \vec k_2 + \vec k_3 = \vec p$ and lose the opportunity to construct any pseudoscalar and the extra
components.
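The role of the off-shell parameter $\tau$ can be checked numerically. The sketch below is illustrative only: the masses and quark momenta are made-up numbers, not values from the text, and the helper names are hypothetical. It builds three on-shell quark momenta, computes $\tau = (s - M^2)/(2\omega\cdot p)$, and verifies that $p = k_1 + k_2 + k_3 - \omega\tau$ lies on the nucleon mass shell.

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def on_shell(m, kx, ky, kz):
    """On-mass-shell four-momentum with mass m and three-momentum (kx, ky, kz)."""
    return (math.sqrt(m * m + kx * kx + ky * ky + kz * kz), kx, ky, kz)

# Illustrative values (assumptions): quark mass, nucleon mass, quark momenta.
m_quark, M_nucleon = 0.3, 0.94
omega = (1.0, 0.0, 0.0, -1.0)  # null four-vector, omega^2 = 0

quarks = [
    on_shell(m_quark, 0.2, 0.1, 0.3),
    on_shell(m_quark, -0.1, 0.25, -0.2),
    on_shell(m_quark, 0.05, -0.3, 0.1),
]
k_sum = tuple(sum(k[i] for k in quarks) for i in range(4))
s = mdot(k_sum, k_sum)

# Because omega^2 = 0, omega.p equals omega.(k1 + k2 + k3).
tau = (s - M_nucleon ** 2) / (2 * mdot(omega, k_sum))
p = tuple(k_sum[i] - omega[i] * tau for i in range(4))

print(tau, mdot(p, p), M_nucleon ** 2)
```

With these inputs the quarks are on their mass shells while the sum of their four-momenta differs from $p$ by $\omega\tau$, which is the off-energy-shell condition used above.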

As mentioned above, an advantage of the explicitly covariant LFD is the simplification of the
transformation properties of the wave functions with a given spin. In the standard LFD approach
the wave function is transformed in every spin index by special Melosh rotation matrices [54]. In
the covariant version the transformation properties are automatically taken into account and do
not require any Melosh matrices.

Consider, for example, the nucleon wave function in the c.m. system with a fully symmetric S-wave
spin-isospin structure:

$\ldots + (\vec\sigma_{12}\cdot\vec\sigma_{3N})\,(\vec\tau_{12}\cdot\vec\tau_{3N})\,]$   (65)

where $\vec\sigma_{12} = (w_1^\dagger \vec\sigma \sigma_y w_2)$, $\vec\sigma_{3N} = (w_3^\dagger \vec\sigma \sigma_y w_N)$ and similarly for the isospin matrices $\vec\tau_{12}$, $\vec\tau_{3N}$.
Using the Fierz identities, one can check that the wave function (65) is indeed symmetric relative
to permutations (provided $\psi_S$ is symmetric). In an arbitrary system it is multiplied by the Melosh
rotation matrices. For $\psi_S$ one can take, for example, the harmonic oscillator model:

2a2 1 ’ 


2^^3/231/4^^ / 

= -3 exp I 


$\vec k_i$ are the quark relative momenta, $N$ is a normalisation factor equal to 1 in the nonrelativistic
limit.


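In the explicitly covariant LFD it is represented in a covariant, four-dimensional form, in terms
of the usual Dirac spinors, avoiding any Melosh matrices. For this purpose we introduce the projection
operators:

$\Pi_+ = \frac{\mathcal{M} + \hat{\mathcal{P}}}{2\mathcal{M}}, \qquad \Pi_- = \frac{\mathcal{M} - \hat{\mathcal{P}}}{2\mathcal{M}},$

where $\mathcal{P} = k_1 + k_2 + k_3 = p + \omega\tau$, $\hat{\mathcal{P}} = \mathcal{P}_\mu\gamma^\mu$, $\mathcal{M} = \sqrt{\mathcal{P}^2}$. Here $\mathcal{M}$ is the effective mass of the
free quarks (not the nucleon mass). Then the wave function (65) is covariantly represented as [51]: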


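$\psi = \frac{\psi_S}{\sqrt{72}}\, c_1 c_2 c_3 c_N \{ 3[\bar u(k_1)\Pi_+\gamma_5 U_c \bar u(k_2)][\bar u(k_3)\Pi_+ u(p)]$

$\qquad - [\bar u(k_1)\Pi_+\gamma^\mu\Pi_- U_c \bar u(k_2)][\bar u(k_3)\Pi_+\gamma_\mu\gamma_5\Pi_+ u(p)](\vec\tau_{12}\cdot\vec\tau_{3N}) \}$,   (66)

where $c_{1,2,3} = 1/\sqrt{\varepsilon_{1,2,3} + m}$, $c_N = 1/\sqrt{\varepsilon_N + M}$ and, e.g., $\varepsilon_1 = \sqrt{\vec k_1^2 + m^2}$ is the energy of

quark 1. In the system where $\vec k_1 + \vec k_2 + \vec k_3 = 0$ this wave function exactly coincides with (65).
The wave function (66) can be decomposed in terms of the 16 structures discussed above. Other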



Light-Front Dynamics 




states are represented similarly. The calculation of the nucleon properties (magnetic moments,
electromagnetic form factors, etc.) is now a standard routine using the trace techniques for
Dirac matrices. For identical nucleon wave functions, the results of the standard light-front
approach and of the explicitly covariant approach coincide, but in the explicitly covariant
approach they are obtained much more simply.
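The operators $\Pi_\pm$ introduced above act as projectors provided the mass in them satisfies $\mathcal{M}^2 = \mathcal{P}^2$. The following Python sketch checks this numerically for one sample four-vector. It assumes the Dirac representation of the gamma matrices and an arbitrary, made-up value of $\mathcal{P}$; the helper names are hypothetical and the code is an illustration, not part of the formalism of [51].

```python
import math

# Dirac gamma matrices in the Dirac representation (an assumed, standard choice).
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def slash(v):
    """v_mu gamma^mu for contravariant components v = (v0, v1, v2, v3)."""
    terms = list(zip((1, -1, -1, -1), v, (G0, G1, G2, G3)))
    return [[sum(s * c * g[i][j] for s, c, g in terms) for j in range(4)]
            for i in range(4)]

P = (3.5, 0.3, -0.2, 0.4)  # sample time-like four-vector (made-up numbers)
M_eff = math.sqrt(P[0] ** 2 - P[1] ** 2 - P[2] ** 2 - P[3] ** 2)
P_slash = slash(P)

pi_plus = combine(0.5, IDENTITY, 0.5 / M_eff, P_slash)
pi_minus = combine(0.5, IDENTITY, -0.5 / M_eff, P_slash)

is_idempotent = close(matmul(pi_plus, pi_plus), pi_plus)
is_orthogonal = close(matmul(pi_plus, pi_minus), ZERO)
is_complete = close(combine(1, pi_plus, 1, pi_minus), IDENTITY)
print(is_idempotent, is_orthogonal, is_complete)
```

All three checks rely only on $\hat{\mathcal{P}}^2 = \mathcal{P}^2$, so they hold for any time-like $\mathcal{P}$, not just the sample value used here.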


5 Electromagnetic form factors 

The general physical electromagnetic amplitude of a spinless system is given by:

$J_\rho = \langle p'|J_\rho(0)|p\rangle = (p + p')_\rho F(Q^2)$ ,   (67)

where $F(Q^2)$ is the electromagnetic form factor. In LFD it is obtained by calculating the amplitude
corresponding to fig. 6:


Fig.6. Electromagnetic vertex of a bound system. 


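$J_\rho = \frac{1}{(2\pi)^3}\int \frac{(p + p' + \omega\tau + \omega\tau' - 2k_1)_\rho}{(1 - \omega\cdot k_1/\omega\cdot p)^2}\,\psi'\psi\,\theta(\omega\cdot(p - k_1))\,\frac{d^3k_1}{2\varepsilon_{k_1}}$   (68)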


The exact light-front amplitude on the energy shell has to coincide with the Feynman one and
should not depend on the orientation of the light-front plane. It should reproduce the form (67).
However, diagram 6 corresponds to the impulse approximation, in which the electromagnetic current
does not contain any interaction. Therefore the dependence of the amplitude (68) on the light-front
orientation survives: $J_\rho$ depends on $\omega$. It can also be represented in the general form:

$J_\rho = (p + p')_\rho F(Q^2) + \frac{\omega_\rho}{\omega\cdot p} B_1(Q^2)$ .   (69)


The factor $1/\omega\cdot p$ is separated for convenience. The invariant functions $F$ and $B_1$ depend on
$Q^2 = -q^2 = -(p' - p)^2$. They could in principle depend on $\omega\cdot p$ and $\omega\cdot p'$. However, the four-vector
$\omega$ is defined up to an arbitrary factor, and, hence, the theory is invariant under the
replacement $\omega \to a\omega$, where $a$ is a number. The form factors $F$ and $B_1$ can therefore depend only
on the ratio $\omega\cdot p'/\omega\cdot p$.

Now we take into account that $\omega$ is restricted by the condition $\omega\cdot q = 0$, implying the transversality
of $q$. In this case we have $\omega\cdot p'/\omega\cdot p = 1$, and the functions $F$ and $B_1$ depend on $Q^2$ only.

The main difference of the amplitude (69) with respect to (67) is the presence of an additional
contribution, proportional to $\omega_\rho$. To avoid any misunderstanding, we emphasize that even in the
case where the wave function $\psi$ does not depend on $\omega$, the term proportional to $\omega_\rho$ still survives in
the electromagnetic vertex.





In the spinless case, the physical form factor $F(Q^2)$ can be obtained immediately by multiplying
both sides of eq. (69) by $\omega^\rho$. We thus get:

$F(Q^2) = \frac{J\cdot\omega}{2\,\omega\cdot p}$ .   (70)
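The extraction of $F$ by contraction with $\omega$ rests on $\omega^2 = 0$ and $\omega\cdot p' = \omega\cdot p$. A small numerical sketch of this step follows. It assumes a current of the form $(p + p')_\rho F + \omega_\rho B_1/(\omega\cdot p)$, and the kinematics and the values of $F$ and $B_1$ are made-up numbers chosen only so that $\omega\cdot q = 0$.

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

omega = (1.0, 0.0, 0.0, -1.0)  # null four-vector defining the light front

# Illustrative kinematics (assumptions): equal energies and a purely
# transverse momentum transfer, so that omega.q = 0.
mass, half_delta, pz = 1.0, 0.3, 0.4
energy = math.sqrt(mass ** 2 + half_delta ** 2 + pz ** 2)
p_in = (energy, -half_delta, 0.0, pz)
p_out = (energy, half_delta, 0.0, pz)
q = tuple(p_out[i] - p_in[i] for i in range(4))

# Made-up values of the invariant functions at this Q^2.
F_true, B1 = 0.7, 2.3

# Current with the physical term and the omega-dependent term.
J = tuple((p_in[i] + p_out[i]) * F_true + omega[i] * B1 / mdot(omega, p_in)
          for i in range(4))

# Contraction with omega removes the B1 term because omega^2 = 0.
F_extracted = mdot(J, omega) / (2 * mdot(omega, p_in))
print(F_extracted)
```

The unphysical contribution $B_1$ drops out whatever value it is given, which is why the contraction isolates the physical form factor in the spinless case.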


With (68), (70) we obtain: 

f(C=)= (5^ / m 

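We have represented here, and in the following, the four-momentum transfer $q$ by $q = (q_0, \vec\Delta, \vec q_\parallel)$
with $\vec\Delta\cdot\vec\omega = 0$ and $\vec q_\parallel$ parallel to $\vec\omega$. Since $\omega\cdot q = 0$, we have $Q^2 = -q^2 = \vec\Delta^2$.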

The form factor in the Bethe-Salpeter approach is found from the formula: 

/ I 1 d^k 

(p + p' -2k)p^i-p-k,p)^(-p' -k,p’){m^ (72) 

The Bethe-Salpeter function $\Phi(k,p)$ is given by eq. (56). The form factors calculated by means
of both approaches coincide with each other with high accuracy. Both approaches give the same
asymptotic behavior of the form factors at $|t| \gg m^2$:


F{t) 


~w~ 





5 


where a = g is the coupling constant in the Wick-Cutkosky model. 

In the usual light-front formulation, with $\omega = (1, 0, 0, -1)$, eq. (70) corresponds to expressing
the form factor through the $J_+$ component. This is well known, and eq. (71) has been found in ref.
[55]. However, this procedure cannot be extended to the calculation of physical form factors of
systems with total spin 1/2 and 1. Their electromagnetic vertices also depend on the four-vector
$\omega$.

For a spin-1 particle this vertex has the form:

where J^-' = (73) 

Here is determined by the physical form factors and has the usual structure [56]: 


{XV, m 


{P, [Pi{q^)r +^2(g")|^ 
ef (p')rre^(p), 


+ Gi{q^Mq''-9X)}ei(p) 


(74) 


$e^\lambda_\nu(p)$ is the spin-1 polarization vector, $p$ and $p'$ are the initial and final momenta, $\lambda$ and $\lambda'$ are
the corresponding helicities, $P = p + p'$ and $q = p' - p$. The tensor contains the $\omega$-dependent
terms:


= 


4- 

+ 








2i<np) M2 ■ (w.p)2 ■ - 2wp 

(cnpy 2in-p wp 

u-p • 


(75) 


$B_1, \ldots, B_8$ are invariant functions. This tensor is not eliminated by contraction with $\omega_\rho$. In these
cases the electromagnetic form factors are given by contraction of the electromagnetic vertex with
more complicated tensors found in [57, 58]. The current component $J_+$ is still enough to find the
form factors $F_i$, but it is not enough to find $G_1$.

The formulas for the physical form factors for the case of the spin-1/2 light-front electromagnetic
vertex (nucleon electromagnetic form factors, for instance) are found in [59].






Fig. 7. LF time-ordered boxes for a scattering angle of $\pi/2$ as a function of the incoming momentum
$v$. We also give the ratios of boxes with at least four particles ($\mathcal{R}^{LF}_{4+}$ and $\mathcal{R}^{IF}_{4+}$) or five particles
($\mathcal{R}^{IF}_{5+}$, $\mathcal{R}^{LF}_{5+} = 0$) in one of the intermediate states.

6 Suppression of the higher Fock states 

The kernels corresponding to the exchange of a particle in the Bethe-Salpeter approach and in LFD
are not equivalent to each other. The light-front graphs are obtained from the Feynman ones by
time-ordering of the vertices. For example, the Feynman graph with two exchanges corresponds
to the following sum of the time-ordered graphs:



(76) 


The last two graphs in (76), containing two exchanged particles in the intermediate state (the
"stretched box"), are omitted in the second iteration of the light-front kernel. The number of graphs
with an increasing number of intermediate particles grows in higher orders. At small values of
the coupling constant $\alpha$ their contribution is suppressed, but at $\alpha \approx 1$ this reason for the
suppression disappears. However, these higher Fock state graphs are still suppressed.

In the papers [60, 61] the binding energy was calculated in the framework of the Bethe-Salpeter
equation and the light-front one. It was found that even at $\alpha \approx 1$ the binding energies calculated
in both approaches are very close to each other. This indicates that the contribution of the higher
Fock states is suppressed.

This contribution has been calculated directly in the papers [62, 63]. The result is shown in
fig. 7, where $v$ denotes the incoming momentum. One can see that the contribution of the
stretched box to the sum of time-ordered graphs is negligible. Its relative contribution $\mathcal{R}^{LF}_{4+}$ is of
the order of a few per cent.

Another important conclusion which follows from fig. 7 is that the suppression of the higher
Fock states is specific to LFD. In the instant form of dynamics these contributions are much
larger. For four or more intermediate particles, due to the fluctuations, they are indicated
in fig. 7 as $\mathcal{R}^{IF}_{4+}$. The corresponding graphs are shown in fig. 8. For five or more intermediate
particles, due to a few vacuum vertices, they are indicated as $\mathcal{R}^{IF}_{5+}$.

These results show that the light-front contributions of higher Fock states are significantly
smaller than in the instant form. In the limit $v \to 0$ the light-front ratio goes to zero, because the phase
space becomes empty. However, in the instant form there is a finite contribution of $\mathcal{R}^{IF}_{4+} = 4.5\%$
in this limit.






Fig. 8. Time-ordered diagrams that contribute to $\mathcal{R}^{IF}_{5+}$. The diagrams in the first column have five
particles in the first intermediate state. The diagrams in the second column have five particles in
the last intermediate state, and the diagrams on the right have five-particle intermediate states for
both the first and the third intermediate state.

7 Conclusion 

We have described the general construction of LFD, its explicitly covariant formulation and some
applications to field theory and to relativistic wave functions. These developments have
been made particularly simple and intuitive by the three-dimensional nature of the formalism, the
interpretation of amplitudes in terms of the space-time picture and the absence of vacuum fluctuations.
We have also shown the relation to other approaches, in particular to the Bethe-Salpeter one.

Though the light-front amplitudes can be derived from the Feynman ones, and the light-front 
wave function can be obtained by the projection of the Bethe-Salpeter amplitude on the light-front 
plane, this does not mean that LFD is only a method to calculate the Feynman amplitudes and to 
find an approximate eigenvalue of the Bethe-Salpeter equation. 

The light-front approach has a much more general and independent meaning. It is an alternative
and rather powerful way to solve field-theoretical problems.

8 Appendix 

8.1 Kinematical transformations 

We specify here the transformation properties of the state vector with respect to transformations 
of the coordinate system. 

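The operators associated with the four-momentum and the four-dimensional angular momentum are
expressed in terms of integrals of the energy-momentum tensor $T_{\mu\nu}$ and the angular momentum
tensor $M^\rho_{\mu\nu}$ over the light-front plane $\omega\cdot x = \sigma$, according to:

$\hat P_\mu = \int T_{\mu\nu}\,\omega^\nu\,\delta(\omega\cdot x - \sigma)\,d^4x = \hat P^0_\mu + \hat P^{int}_\mu$ ,   (77)

$\hat J_{\mu\nu} = \int M^\rho_{\mu\nu}\,\omega_\rho\,\delta(\omega\cdot x - \sigma)\,d^4x = \hat J^0_{\mu\nu} + \hat J^{int}_{\mu\nu}$ ,   (78)

where the 0 and int superscripts indicate the free and interacting parts of the operators respectively.
For generality, we consider here the light-front time $\sigma \neq 0$.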

The description of the evolution along the light-front time $\sigma$ implies a fixed value of the length
of $\vec\omega$, or, equivalently, of $\omega_0$. This is necessary in order to have a scale of $\sigma$. However, the most
important properties of the physical amplitudes following from covariance do not require fixing the
scale of $\omega$ and are invariant under its change. We work in the interaction representation,
in which the operators are expressed in terms of the free fields. Consider, for example, the scalar
field $\varphi(x)$, eq. (8). Then the free operators have the form:

^ ~ j ’ (79) 


( 80 ) 





The operators $\hat P^{int}_\mu$ and $\hat J^{int}_{\mu\nu}$ contain the interaction Hamiltonian $H^{int}(x)$:

$\hat P^{int}_\mu = \omega_\mu \int H^{int}(x)\,\delta(\omega\cdot x - \sigma)\,d^4x$ ,   (81)

$\hat J^{int}_{\mu\nu} = \int H^{int}(x)\,(x_\mu\omega_\nu - x_\nu\omega_\mu)\,\delta(\omega\cdot x - \sigma)\,d^4x$ .   (82)

The field-theoretical Hamiltonian $H^{int}(x)$ is usually singular and requires a regularization. The
regularization of amplitudes was illustrated above in sect. 3.3 by the example of a typical
self-energy contribution.

In the particular case $\omega = (1, 0, 0, -1)$, in the light-front coordinates, only the $\omega_-$ component is
non-zero. As a result, only some components of (81, 82) are non-zero, i.e., only the corresponding
generators in (77, 78) contain the interaction.

Under a translation $x \to x' = x + a$ of the coordinate system $A \to A'$, the equation $\omega\cdot x = \sigma$ takes
the form $\omega\cdot x' = \sigma'$, where $\sigma' = \sigma + \omega\cdot a$. The state vector is transformed as:

$\phi_\omega(\sigma) \to \phi'_\omega(\sigma') = U_{P^0}(a)\,\phi_\omega(\sigma)$ ,   (83)

where the operator $U_{P^0}(a)$ contains only the four-momentum operator (79) of the free field:

$U_{P^0}(a) = \exp(i\hat P^0\cdot a)$ .   (84)

The prime on $\phi'(\sigma)$ indicates that $\phi'(\sigma)$ is defined in the system $A'$ on the plane $\omega\cdot x' = \sigma$, in
contrast to $\phi(\sigma)$, defined in the system $A$ on the plane $\omega\cdot x = \sigma$ (the value of $\sigma$ being the same). The
state vector $\phi'(\sigma')$ is defined in $A'$ on the plane $\omega\cdot x' = \sigma'$, which coincides with $\omega\cdot x = \sigma$. Therefore
no dynamics enters into the transformation (83). This is rather natural, since under translation
of the coordinate system the plane $\omega\cdot x = \sigma$ occupies the same position in space while it occupies
a new position with respect to the axes of the new coordinate system, as indicated in fig. 9. The
formal proof of (83), (84) can be found in [64].



Fig. 9. Translation of the reference system along the light-front time. 

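In the case of infinitesimal four-dimensional rotations $x_\mu \to x'_\mu = g x_\mu = x_\mu + \epsilon_{\mu\nu}x^\nu$, the result
is similar [64]:

$\phi_\omega(\sigma) \to \phi'_{\omega'}(\sigma) = U_{J^0}(g)\,\phi_\omega(\sigma)$ ,   (85)

where $\omega'_\mu = \omega_\mu + \epsilon_{\mu\nu}\omega^\nu$ and

$U_{J^0}(g) = 1 + \frac{1}{2}\hat J^0_{\mu\nu}\,\epsilon^{\mu\nu}$ .   (86)

The appearance of the free operator $\hat J^0_{\mu\nu}$ shows that the transformations of the state vector with
respect to the transformations of the coordinate system are indeed kinematical.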





8.2 Dynamical transformations 

The properties of the state vector under transformations of the hypersurface are determined by
the dynamics and follow from the Tomonaga-Schwinger equation [65]:

$i\,\delta\phi/\delta\sigma(x) = H^{int}(x)\,\phi$ .   (87)

From the definition of the variational derivative in (87) we obtain:

$i\,\delta\phi = H^{int}(x)\,\phi\,\delta V(x)$ ,

where $\delta V(x)$ is the volume between the initial surface and the surface obtained from the original
one by the variation $\delta\sigma(x)$ around the point $x$.

Under the translation $\sigma \to \sigma + \delta\sigma$ of the plane, the total increment of the state vector is
obtained through the increment at each point of the surface:

$i\,\delta\phi = \int H^{int}(x)\,\delta(\omega\cdot x - \sigma)\,d^4x\;\phi\,\delta\sigma$ .   (88)

This relation gives the Schrödinger equation. In the interaction representation in the light-front
time, we have:

$i\,\partial\phi/\partial\sigma = H(\sigma)\,\phi(\sigma)$ ,   (89)

where:

$H(\sigma) = \int \tilde H^{int}(x)\,\delta(\omega\cdot x - \sigma)\,d^4x$ ,   (90)

and $\tilde H^{int}$ differs from $H^{int}$ because of singularities of the field commutators on the light
cone. This point is explained in section 3.

Similarly, in the case of a rotation of the light-front plane, $\omega_\mu \to \omega'_\mu = \omega_\mu + \delta\omega_\mu$,
we find:






(91) 


The increment of the volume over the point x is: 

JF = 5{uj'X - a) dt^x , (92) 

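and it follows from (88) that [64]:

$\hat J^{int}_{\mu\nu}\,\phi_\omega(\sigma) = \hat L_{\mu\nu}(\omega)\,\phi_\omega(\sigma)$ ,   (93)

where:

$\hat L_{\mu\nu}(\omega) = i\left(\omega_\mu\frac{\partial}{\partial\omega^\nu} - \omega_\nu\frac{\partial}{\partial\omega^\mu}\right)$ ,   (94)

and $\hat J^{int}_{\mu\nu}$ is given by (82).

Equation (93) is called the angular condition. It plays an important role in the construction of
relativistic bound states.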

The transformation of the coordinate system and the simultaneous transformation of the light-front
plane, which is rigidly related to the coordinate axes, correspond to the successive application
of the two types of transformations considered above (kinematical and dynamical). Thus, under
the infinitesimal translation $x \to x' = x + a$ of the coordinate system, $A \to A'$, and of the plane,
we have:

$\phi_\omega(\sigma) \to \phi'_\omega(\sigma) = (1 + i\hat P\cdot a)\,\phi_\omega(\sigma)$ .   (95)

Note that for the state with definite total four-momentum $p$ (i.e., for an eigenstate of the
four-momentum operator), the equations (83) and (95) give:

$\exp(i\hat P^0\cdot a)\,\phi(\sigma) = \exp(ip\cdot a)\,\phi(\sigma + \omega\cdot a)$ .   (96)

This equation determines the conservation law (27) for the four-momenta of the constituents.





References 

[1] P.A.M. Dirac, Rev. Mod. Phys. 21 (1949) 392.

[2] J. Kogut and L. Susskind, Phys. Reports, 8 (1973) 75.

[3] H. Leutwyler and J. Stern, Ann. Phys. (N.Y.) 112 (1978) 94. 

[4] F.M. Lev, Fortschr. Phys. 31 (1983) 75. 

[5] Hadronic Physics with Multi-GeV Electrons, eds. B. Desplanques and D. Goutte, Nova Science,
Commack, New York, 1990.

[6] B.D. Keister and W.N. Polyzou, In: Advances in Nuclear Physics, ed. J.W. Negele and E.W. 
Vogt, (Plenum Press, New York) 20 (1991) 225. 

[7] F. Coester, Prog, in Part, and Nucl. Phys., 29 (1992) 1. 

[8] V.R. Garsevanishvili and Z.R. Menteshashvili, "Relativistic Nuclear Physics in the Light Front
Formalism", Nova Science Publishers, New York, 1993.

[9] B.D. Keister, AIP Conf. Proc. 334, Few-Body Problems in Physics, p. 164, Williamsburg, May
1994, ed. F. Gross, AIP Press, New York.

[10] "Theory of Hadrons and LFQCD", Fourth International Workshop on Light-Front Quantization
and Non-Perturbative Dynamics, August 1994, ed. St. Głazek, World Scientific, 1995.

[11] J. Carbonell, B. Desplanques, V.A. Karmanov and J.-F. Mathiot, Phys. Reports, 300 (1998) 
215. 

[12] F.M. Lev, Nucl. Phys. A606 (1996) 459.

[13] New Non-Perturbative Methods and Quantization on the Light Cone, Les Houches School, 
Feb. 24 - March 7, 1997, v. 8, ed. by P. Grange et al., Springer-EDP Sciences, 1998. 

[14] St. Głazek, Acta Phys. Polon., B29 (1998) 3558.

[15] M.M. Brisudova, R.J. Perry and K.G. Wilson, Phys. Rev. Lett., 78 (1997) 1227. 

[16] S.J. Brodsky, R. Roskies and R. Suaya, Phys. Rev. D8 (1973) 4574. 

[17] G.P. Lepage and S.J. Brodsky, Phys. Rev. D22 (1980) 2157.

[18] V.A. Karmanov, ZhETF, 71 (1976) 399 [transl.: JETP, 44 (1976) 210]. 

[19] F. Coester, W.H. Klink and W.N. Polyzou, Few-Body Syst. Suppl. 10 (1999) 115. 

[20] V.G. Kadyshevsky, ZhETF, 46 (1964) 654, 872 [JETP, 19 (1964) 443, 597]. 

[21] V.G. Kadyshevsky, Nucl. Phys. B6 (1968) 125. 

[22] V.G. Kadyshevsky, R.M. Mir-Kasimov and N.B. Skachkov, Fiz. Elem. Chastits At. Yadra, 2 
(1972) 635 [Sov. J. Part. Nucl. 2 (1972) 69]. 

[23] V.A. Karmanov, Fiz. Elem. Chastits At. Yadra, 19 (1988) 525 [Sov. J. Part. Nucl. 19 (1988)
228].

[24] N.E. Ligterink and B.L.G. Bakker, Phys. Rev. D52 (1995) 5954. 

[25] N.E. Ligterink and B.L.G. Bakker, Phys. Rev. D52 (1995) 5917. 

[26] S. Weinberg, Phys. Rev. 150 (1966) 1313. 





[27] E.E. Salpeter and H.A. Bethe, Phys. Rev. 84 (1951) 1232. 

[28] N. Nakanishi, Prog. Theor. Phys. Suppl. 43 (1969) 1; 95 (1988) 1. 

[29] C. Dietmayer et al., Z. Phys. A334 (1989) 215.

[30] Th. Heinzl, St. Krusche and E. Werner, Z. Phys. A334 (1989) 443. 

[31] K.G. Wilson et al., Phys. Rev. D49 (1994) 6720.

[32] M.G. Fuda, Phys. Rev. D41 (1990) 534; D42 (1990) 2898; D44 (1991) 1880; Ann. Phys.
(N.Y.) 197 (1990) 265; 231 (1994) 1; Nucl. Phys. A543 (1992) 111c.

[33] J.M. Namyslowski, Phys. Rev. D18 (1978) 3676. 

[34] P. Danielewicz and J.M.Namyslowski, Phys. Lett. B81 (1979) 110. 

[35] P.M. Fishbane and J.M. Namyslowski, Phys. Rev. D21 (1980) 2406. 

[36] J.M. Namyslowski and H.J. Weber, Z. Phys. A295 (1980) 219. 

[37] M. Sawicki, Phys. Rev. D32 (1985) 2666; D33 (1986) 1103. 

[38] V.A. Karmanov, Nucl. Phys. B166 (1980) 378. 

[39] G. Feldman, T. Fulton and J. Townsend, Phys. Rev. D7 (1973) 1814. 

[40] A.N. Mitra, hep-ph/9812404; Phys.Lett.B463, 293 (1999). 

[41] A.N. Mitra and S. Bhatnagar, Int. J. Mod. Phys. A7 (1992) 121.

[42] A.N. Mitra et al., Phys. Rev. Lett., 59 (1987) 2408; Phys. Rev. D38 (1988) 1454. 

[43] S. Chakrabarty et al., Prog. Part. Nucl. Phys. 22 (1989) 143.

[44] A.N. Mitra, hep-th/9803062; Intl.J.Mod.Phys.A14, 4781 (1999). 

[45] V.R. Garsevanishvili, A.N. Kvinikhidze, V.A. Matveev, A.N. Tavkhelidze and R.M. Faustov, 
Teor. Mat. Fiz. 23 (1975) 310. 

[46] V.R. Garsevanishvili and V.A. Matveev, Teor. Mat. Fiz. 24 (1975) 3. 

[47] G.C. Wick, Phys. Rev. 96 (1954) 1124;

R.E. Cutkosky, Phys. Rev. 96 (1954) 1135. 

[48] S.J. Brodsky, C.-R. Ji and M. Sawicki, Phys. Rev. D32 (1985) 1530. 

[49] J. Carbonell and V.A. Karmanov, Nucl. Phys. A581 (1995) 625. 

[50] J. Carbonell and V.A. Karmanov, hep-ph/9812404; to appear in Euro. J. Phys. A. 

[51] V.A. Karmanov, Nucl. Phys., A644 (1998) 165. 

[52] V.M. Kolybasov, Nucl. Phys. 68 (1965) 8. 

[53] Z. Dziembowski, Phys. Rev. D37 (1988) 768.

[54] H.J. Melosh, Phys. Rev. D9 (1974) 1095.

[55] J.F. Gunion, S.J. Brodsky and R. Blankenbecler, Phys. Lett. B39 (1972) 649; Phys. Rev.
D8 (1973) 287.

[56] V. Glaser and B. Jaksic, Nuovo Cimento, 5 (1957) 1197.

[57] V.A. Karmanov and A.V. Smirnov, Nucl. Phys. A546 (1992) 691. 





[58] V.A. Karmanov and A.V. Smirnov, Nucl. Phys. A575 (1994) 520.

[59] V.A. Karmanov and J.-F. Mathiot, Nucl. Phys. A602 (1996) 388.

[60] M. Mangin-Brinet and J. Carbonell, "Solution numerique du modele de Wick-Cutkosky dans
le cadre de la Light Front Dynamics", Rapport de Stage ISN/ECP (1997).

[61] T. Frederico, private communication.

[62] N.C.J. Schoonderwoerd and B.L.G. Bakker, Few-Body Syst. Suppl. 10 (1999) 119. 

[63] N.C.J. Schoonderwoerd, B.L.G. Bakker and V.A. Karmanov, Phys. Rev. C58 (1998) 3093-3108.

[64] V.A. Karmanov, ZhETF, 83 (1982) 3 [JETP, 56 (1982) 1]. 

[65] Quantum Electrodynamics, selected papers, Dover Publ. Inc., New York, 1958, ed. J. 
Schwinger. 



31. 3D-4D Interlinkage Of B-S Amplitudes: 
Unified View Of QQ And QQQ Dynamics 

A.N.Mitra * 

244 Tagore Park, Delhi- 110009, India 


Abstract 

This article has a 3-fold objective: i) to provide a panoramic view of several types of 3D
vs 4D approaches in Field Theory (Tamm-Dancoff, Bethe-Salpeter Equation (BSE), Quasi-potentials,
Light-Front Dynamics, etc.) for strong interaction dynamics; ii) to focus on the
role of the Markov-Yukawa Transversality Principle (MYTP) as a novel paradigm for an exact
3D-4D interlinkage between the corresponding BSE amplitudes; iii) to stress a closely parallel
treatment of qq and qqq BSE's stemming from a common 4-fermion Lagrangian mediated by
gluon (vector)-like exchange. The two-way interlinkage offered by MYTP between the 3D and
4D BSE forms, via a Lorentz-covariant 3D support to the BS kernel, gives it a unique status
which distinguishes it from most other 3D approaches to strong interaction dynamics, which
give at most a one-way connection. Two specific types of MYTP which provide 3D support
to the BSE kernel are considered: a) Covariant Instantaneity Ansatz (CIA); b) Covariant
LF/NP ansatz (Cov.LF). Both lead to formally identical 3D BSE reductions (thus ensuring
common spectral predictions), but their 4D manifestations differ sharply: under CIA, the
4D loop integrals suffer from Lorentz mismatch of the vertex functions, leading to ill-defined
time-like momentum integrals, but Cov.LF is free from this disease. Some practical uses of
MYTP as a basis for evaluating various types of 4D loop integrals are outlined.

PACS: 11.10 St ; 12.38 Lg ; 13.40.Fn 

Keywords: Tamm-Dancoff, Bethe-Salpeter, Quasi-potentials, Light-front (LF), Markov-Yukawa,
3D-4D Interlinkage, CIA, Cov-LF, Spectroscopy.


1 Introduction: Effective BSE-SDE Framework 

Following the success of the Tomonaga-Schwinger-Feynman-Dyson formalism in QED [1],
corresponding field-theoretic formulations have been in the forefront of strong interaction dynamics
since the early fifties, the main strategy being to devise various ‘closed’ form approaches which
are represented as appropriate ‘integral equations’. One of the earliest efforts in this direction was
the Tamm-Dancoff formalism [2], which showed a great intuitive appeal. In this method, the state
vector of the system under consideration is Fock-expanded in terms of a complete set of eigenfunctions
of the free field Hamiltonian, so that the expansion coefficients are the successive "amplitudes
for finding present in the field a definite number of bare particles with definite spins and
momenta...." [3]. This method was first systematically applied by Dyson (+ Cornell collaborators)
in the early fifties to the meson-nucleon scattering problem, for a dynamical understanding of the
‘Delta’ and other low-energy resonances (see ref. [3]). It leads to 3D integral equations connecting
amplitudes for successively higher numbers of meson (and/or nucleon-pair) quanta, much as the
familiar 4D Schwinger-Dyson equations of QED connect (via Ward identities) vertex amplitudes
of successively higher orders [4].


*Email: (1) ganmitra@nde.vsnl.net.in; (2) anmitra@csec.ernet.in





1.1 3D Reduction of BSE: Quasipotentials, Etc 

The 3D Tamm-Dancoff equation (TDE) [3] and the 4D Schwinger-Dyson equation (SDE) [4] have
been the source of much wisdom underlying the formulation of many approaches to strong interaction
dynamics. To these one should add the Bethe-Salpeter equation (BSE) [5], which is
an approximation to SDE for the dynamics of a 4D two-particle amplitude, characterized by an
effective (gluon-exchange-like) pairwise interaction, on the lines of a "Bethe Second Theory" of
the fifties for the effective N-N interaction, but now adapted to the quark level. Although not a
fundamental (first principle) approach, such as the Chiral Schwinger Theory (CST) of (1 + 1)D
[see the Article by P.P. Srivastava in this Book], it has attracted more attention in the contemporary
literature (as the 4D counterpart of the Schroedinger equation) than any other comparable
approach. Perhaps it is a fair assessment that there is a degree of complementarity between first
principle (emphasizing theoretical foundation) approaches like CST and second principle (stressing
applicational aspects) approaches like BSE, in the sense that the lacunae of one are made up by
the other, so that both together hold the key to a resolution of the strong coupling problem. In
this Article we are concerned with the latter only.

A major bottleneck for the BSE approach has been its resistance to a probability interpretation,
due to its 4D content. This has led to many attempts at its 3D reduction [6-9]: the instantaneous
approximation [6]; quasi-potential approaches [7,8]; variants of on-shellness of the associated
propagators [9]. In [7,9], the starting BSE is 4D in all details, including its kernel, but the associated
propagators are manipulated in various ways to reduce the 4D BSE to a 3D form as a fresh starting
point of the dynamics; in [8], the old-fashioned 3D perturbation theory is reformulated covariantly
to give a 3D quasipotential equation. These methods are briefly sketched in Sect. 2.

At the 4D level, the BSE [5] is still the most widely used form of 2-particle dynamics, though
the problems of probabilistic interpretation were the chief reason for these 3D formulations.
Nevertheless the regular 4D equations of the full-fledged SDE-BSE types have been widely employed
[10] as prototypes of strong interaction dynamics, addressing issues of gauge and chiral symmetries,
as well as dynamical breaking of chiral symmetry via an NJL-type mechanism [11]. In such
full-fledged field-theoretic approaches, the NJL mechanism of contact interaction must of course
be replaced by a space-time extended interaction, and the dynamical breaking of chiral symmetry
(DBχS) corresponds to the use of SDE for the self-energy operator [12]. As a general remark,
while for conceptual issues impinging on formulational self-consistency there is little alternative to
the full-fledged 4D equations, their applications to physical systems must recognize some ground
realities. For example, the mass spectra of hadrons (which are revealed in Nature as O(3)-like [13])
suggest that the role of the time-like dimension (although an integral part of the dynamics) is not
on the same footing as that of the space-like dimensions, so that a naive expectation of O(4)-like
spectra [14] may be quite misleading. Indeed this issue is quite central to the very theme of this
article, viz., 3D-4D interlinkage of BS-amplitudes, and will claim attention throughout.

An alternative form of 2-particle dynamics (which also contributes to reducing the effective
degrees of freedom from 4D to 3D) is that of Dirac's constrained Hamiltonian formalism [15],
developed by Komar and others [16]. The logic of this approach is that the constraints $H_i$ have a twin
role, viz., they not only ‘constrain the motion’ in phase space, but also generate it in their
‘Hamiltonian’ capacity. These ‘constraints’ must be mutually compatible in the sense $[H_i, H_j] = 0$. Such
compatibility relations restrict the dependence of the interaction on the relative time $t$, and require
a ‘reciprocity relation’ between the constituent potentials, something akin to Newton's III Law.
Such descriptions are valid for both two-boson and two-fermion systems [16], in the sense of coupled
Klein-Gordon and Dirac equations respectively. As this formalism is reviewed in detail in this
Book by Lusanna [17a], it will not be discussed further here. In the same spirit, more fundamental
approaches like the Chiral Schwinger Theory (CST) in (1 + 1)D may be found in [17b].

1.2 Light Front (LF) / Null Plane (NP) Dynamics

A powerful form of 3D dynamics came into prominence after Weinberg’s discovery that the dy- 
namics of the infinite momentum frame [18] serves as a cure for many ills in the theory of current 





algebras, by greatly simplifying the rules of calculation of Feynman diagrams of old-fashioned
perturbation theory. In the present context of strong interaction dynamics, the great virtue of
Weinberg's infinite momentum method [18] lies in the simplicity and transparency of the integral
equations for multiparticle potential scattering problems [19]. Indeed, the structure of the
3-momenta $(p_\perp, p_\parallel)$ appearing in this formalism is but a paraphrase for the standard null-plane
variables first introduced by Dirac to project his theme [20] that a relativistically invariant Hamiltonian
theory can be based on 3 different classes of initial surfaces (space-like, time-like, and
null-like). The structure of such a Hamiltonian theory is strongly dependent on these respective
surface forms, whose "stability groups" (i.e., those generators of the Poincaré group that leave the
initial surface invariant) have 6, 6, 7 generators respectively, thus giving the ‘highest score’ (7) to the null-plane
dynamics ($x_0 = x_3$), whose ‘kinematical’ generators form a closed algebra and include, among others,
the quantity $P_+ = P_0 + P_z$ (which plays the role of the ‘mass’ term $\eta$ in the Weinberg notation
[18]). On the other hand, the dual generator $P_- = P_0 - P_z$ is the ‘Hamiltonian’ of the theory.

Leutwyler and Stern [21] gave a covariant 3D formulation of the Dirac theme [20] in terms
of null-plane variables. A more explicit covariant formulation in the null-plane language was
given by Karmanov [22a] using diagrammatic techniques with on-shell propagators and spurions,
on the lines of the Kadyshevsky approach [8], which has been recently reviewed by Carbonell et
al [22b] (referred to as KK). All these methods, including the Wilson group's [23], give rise to 3D
integral equations for a (strongly interacting) two-body system, bearing strong resemblance to the
other 3D BSE forms [6-9] above. Again, there is no getting back to the original 4D BSE form,
the nearest connection being a one-way reduction from 4D BSE to 3D on the covariant null-plane
[22b].


1.3 Markov-Yukawa Transversality: 3D-4D Interlinkage

Finally we come to a rather novel approach of more recent origin [24-25], based on the Markov-
Yukawa Transversality Principle (MYTP) [26]. To motivate this approach, it is necessary to go back
in time to Yukawa’s non-local field theory [26b] according to which the field variable is a function
of both x and p, unlike in local field theory in which the field variable is only a (local) function of x.
Although unacceptable for an elementary particle/field, the non-local field theory is ideally suited
to a composite particle, whose extended structure effectively provides for a momentum dependence
in the direction of the total 4-momentum $P_\mu$. Indeed the Yukawa theory [26b] was in a way
the forerunner of a later theory of bi-local fields M(x, y) [27] for the formulation of the Effective
Action for a 2-body dynamical system [27]. This approach was employed by the Pervushin group
[25] in their formulation of the relativistic Coulomb problem in the Salpeter approximation [6b]
in a covariant form, with the choice of the preferred direction governed by the 4-momentum $P_\mu$ of
the composite as the canonical conjugate to its c.m. position $X = (x + y)/2$: $P_\mu = -i\partial/\partial X_\mu$. More
specifically the MYTP [26] is expressed by the condition [25]:


$$z_\mu \frac{\partial}{\partial X_\mu} M(z, X) = 0; \qquad z = x - y \qquad (1)$$


where the direction $P_\mu$ guarantees an irreducible representation of the Poincaré group for the
bilocal field M [26c]. This condition in turn is equivalent to a covariant 3D support to the input
4-quark Lagrangian, whence follow the SDE and BSE as equations of motion with 3D support to
the effective BSE kernel under covariant instantaneity.

Alternatively, the 3D support ansatz may be directly postulated at the outset for the pairwise
BSE kernel K [24] by demanding that it be a function of only $\hat q_\mu = q_\mu - (q\cdot P)P_\mu/P^2$, which implies
that $\hat q\cdot P = 0$. In this approach, the propagators are left untouched in their full 4D forms. This
is somewhat complementary to the approaches [6-9] (propagators manipulated but kernel left
untouched), so that the resulting equations [24-25] look rather unfamiliar vis-a-vis 3D BSE’s [6-9],
but it has the advantage of allowing a simultaneous use of both 3D and 4D BSE forms via their
interlinkage. Indeed what distinguishes the Covariant Instantaneity Ansatz (CIA) [24] from the
more familiar 3D reductions of the BSE [6-9] is its capacity for a 2-way linkage: an exact 3D
BSE reduction, and an equally exact reconstruction of the original 4D BSE form without extra
charge [24]: the former to access the observed O(3)-like spectra [13], and the latter to give transition
amplitudes as 4D quark loop integrals [24]. (In the approach of the Pervushin Group [25], however,
the built-in 3D-4D interconnection which follows from MYTP [26] apparently remained unnoticed
in their final equation). In contrast the more familiar methods of 3D give at most a one-way
connection, viz., a 4D → 3D reduction [6-9], but not vice versa.
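
The transversality condition on the relative momentum can be illustrated numerically. The following is only a minimal sketch: the metric signature (+, -, -, -), the function names and the sample 4-vectors are assumptions made for illustration and are not taken from refs. [24-26].

```python
def minkowski_dot(a, b):
    """Scalar product with signature (+, -, -, -); an assumed convention."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def transverse_part(q, P):
    """Return q_hat = q - (q.P / P^2) P, the part of q transverse to P."""
    scale = minkowski_dot(q, P) / minkowski_dot(P, P)
    return [qi - scale * Pi for qi, Pi in zip(q, P)]


# Hypothetical numbers: a composite of unit mass moving along z,
# and an arbitrary relative 4-momentum q.
P = [1.25, 0.0, 0.0, 0.75]
q = [0.3, 0.1, -0.2, 0.4]
q_hat = transverse_part(q, P)

# q_hat.P vanishes (up to rounding) whatever the frame of the composite.
print(minkowski_dot(q_hat, P))

# In the rest frame of the composite, q_hat reduces to the ordinary 3-momentum.
print(transverse_part(q, [2.0, 0.0, 0.0, 0.0]))
```

In the rest frame the time-like component of $\hat q$ drops out, which is the sense in which a kernel depending only on $\hat q$ has covariant 3D support.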

At this point it is perhaps worth noting that even the Salpeter equation [6b] had (in principle)
the ingredients for a reconstruction of the 4D BS amplitude $\Psi$ in terms of 3D ingredients, provided
the ‘instant’ form (see eq.(2.1) in Sect.2) of the interaction kernel had been employed on the
RHS of the 4D BSE form, and simultaneously the 4D BS amplitude $\Psi$ on the RHS had been
eliminated in favour of the 3D BS amplitude $\psi$ exactly as was done under CIA [24]. This would
have amounted to using the Transversality Principle [26] (albeit non-covariantly), but this feature
had apparently remained unnoticed by subsequent workers who continued to employ the Salpeter
equation [6b] in its 3D form only.

1.4 QCD-motivated Effective Lagrangians 

The Transversality Principle (MYTP) [26] underlying the 3D-4D interconnection [24], termed 
3D-4D-BSE in the following, of course needs supplementing by physical ingredients to govern 
the structure of the BSE kernel, much as a Hamiltonian needs a properly defined ‘potential’. 
However its canvas is broad enough to accommodate a wide variety of kernels which must in 
turn be governed by independent physical principles. In this respect, short of a full-fledged QCD 
Lagrangian approach, the orthodox view (which we adopt) is to stick to an effective 4-fermion 
Lagrangian as a starting point of the dynamics, from which the successive equations of motion 
(SDE, BSE, etc) follow in the standard manner. [As already noted at the outset, this is in keeping 
with the Bethe Second Principle Theory for effective N-N potentials as an input for the physics
of the nuclear many-body problem]. 

In particular, a basic proximity to QCD is ensured through a vector-type interaction [12],
which while maintaining the correct one-gluon-exchange structure in the perturbative region, may
be fine-tuned to give any desired structure to the intermediate gluon propagator in the infrared
domain as well. Although empirical, it captures a good deal of physics in the non-perturbative
domain while retaining a broad QCD orientation, albeit short of a full-fledged QCD formulation.
More importantly, the non-trivial solution of the SDE corresponding to this generalized gluon
propagator [12] gives rise to a dynamical mass function m(p) [12] as a result of DBχS w.r.t. an
input Lagrangian whose chiral invariance stems from a vector-type 4-fermion interaction between
almost massless u-d quarks. These considerations form the standard basis for a Lagrangian-
based BSE-SDE framework [10] for Dynamical Breaking of Chiral Symmetry (DBχS) [11], for
a space-time extended 4-quark Lagrangian mediated by vector exchange [12]. This generates a
mass function m(p) via the Schwinger-Dyson equation (SDE), which accounts for the bulk of the
constituent mass of ud quarks. The same BSE-SDE formalism [12,10] can be simply adapted [28]
to the MYTP [26]-based 3D-4D-BSE formalism [24] which reproduces 3D spectra of both hadron
types [29] under a common parametrization [28] for the gluon propagator, with a self-consistent
SDE determination [28] of the constituent mass; see Sect.3.

A BSE-SDE formulation [10] on QCD lines represents a 4D field-theoretic generalization of
‘potential models’ [30], wherein the generalized 4-fermion kernel [12] represents the non-perturbative
gluon propagator, which can be easily adapted [28] to MYTP [26]. The 4D feature of BSE-SDE
gives this framework a ready access to high energy amplitudes, while its ‘off-shell’ features give it
a natural access to hadronic spectra [13]. It has thus an interpolating role between (low energy)
quarkonia models [30], and (high energy) QCD-SR [31] techniques whose domains are largely
complementary; details may be found in a recent review [32].

1.5 MYTP: Cov Instantaneity vs Cov LF/NP 

While MYTP [26] ensures 3D-4D interconnection [24] under covariant instantaneity ansatz (CIA) 
in the composite’s rest frame [24], its main disadvantage lies in the ill-defined nature of 4D loop
integrals which acquire time-like momentum components in the exponential/gaussian factors as-
sociated with the different vertex functions, due to a ‘Lorentz-mismatch’ among the rest-frames
of the participating hadronic composites. This problem is especially serious for triangle loops and
above, such as the pion form factor, while 2-quark loops [33] just escape this pathology. This
problem was not explicitly encountered in the light-front (LF/NP) ansatz [34] in an earlier study 
of 4D triangle loop integrals, but this approach was criticized [35] on grounds of non-covariance. 
The CIA approach [24] which made use of MYTP [26], was an attempt to rectify the Lorentz 
covariance defect, but the presence of time-like components in the gaussian factors inside triangle 
loop integrals, e.g., in the pion form factor [36], impeded further progress. 

In an attempt to remedy this situation, a generalization of MYTP [26] was proposed recently 
[37] to ensure formal covariance without having to encounter time-like components in the gaussian
wave functions appearing inside the 4D loop integrals. The desired generalization was achieved by 
extending the Transversality Principle [26] from the covariant rest frame of the (hadron) composite 
[24], to a covariantly defined light-front [37] (Cov LF). It was found that while preserving the 3D-4D 
BSE interconnection, the resulting 3D equation under Cov LF [37] turns out to be formally identical 
to the old-fashioned null-plane formalism [34,38], so that the latter enjoys ipso facto covariance 
(despite its ‘looks’). This ‘covariant’ LF/NP method [37] stands fairly direct comparison with
other covariant LF approaches [22-23]. 

1.6 Scope of the Article 

This article has a 3-fold objective: A) a bird’s eye view of some principal 3D vs 4D dynamical
methods for the strong interaction problem that have been proposed over the last half century; B)
putting in perspective a novel property of the Markov-Yukawa Transversality Principle (MYTP),
viz., a 2-way 3D-4D interconnection in the BS dynamics of 2- and 3-quark hadrons; C) stressing
a close parallelism between $q\bar q$ and qqq BSE’s which stem from a common 4-fermion Lagrangian
mediated by gluon (vector)-like exchange. Especially noteworthy is the capacity of MYTP [26] to
achieve a 3D-4D interlinkage, a property which has remained obscured from view in the contem-
porary literature, vis-a-vis more familiar approaches to BSE and allied forms of dynamics [6-10,
18-23] which are either 3D or 4D in content, but have no provision for any interlinkage between 
these two dimensions. While the details of individual MYTP applications (several of which are 
dealt with in [32]), are not a part of this Article, the practical uses of MYTP in the strong interac- 
tion dynamics regime will nevertheless be a focus of attention by virtue of its distinct advantage in 
the evaluation of 4D loop integrals with arbitrary vertex functions, while providing easy access to 
the spectroscopy sector. To that end, an outline of the dynamical structure of some principal 3D 
methods, [7-9], [18-23], is provided in Sect.2 as a background for comparison on 4D loop integral 
techniques, while on the issue of hadron spectra, which are basically O(3)-like [13], a comparison
between non-MYTP [7-9] and MYTP [24-26] forms of dynamics does not bring out new physics. 

For a better understanding of the working of MYTP, it will be necessary to present two types
in parallel for comparison, viz., Covariant Instantaneity or CIA [24-25], and Covariant Light-front
(Cov LF) [37], which demand that the BSE kernel K for pairwise interaction be a function of
relative momentum q transverse to the composite 4-momentum $P_\mu$ in the first case [24], and to the
Covariant Null-plane in the second [37]. It will be shown that both types lead to identical 3D BSE
reductions (so that their spectral predictions are formally the same), but their reconstructed 4D
vertex functions reveal profound differences in structure: the Lorentz mismatch of individual wave
functions that characterizes the CIA form [24], leading to complex amplitudes [34], disappears in
the alternative Cov LF approach [37], but in general such integrals are dependent on the light-
front orientation as in other covariant approaches [22-23]. To eliminate such terms, a simple
prescription of ‘Lorentz completion’ seems to suffice to produce an explicitly Lorentz invariant
quantity such as was shown for the pion form factor [37]; (alternative prescriptions exist in other
LF/NP formulations [22b]). For a historical perspective, it is useful to recall that in the old-
fashioned NPA approach too [38], a very similar result had been found for various types of triangle
loop amplitudes [36], despite a lack of manifest covariance [35] in that approach, but now MYTP
[26] on the covariant null-plane [37] fills this formal gap.





1.7 Outline of Contents 

Sect.2 briefly outlines some historical approaches to an effective 3D form of strong interaction
dynamics: Levy-Salpeter [6]; Logunov-Tavkhelidze [7a]; Blankenbecler-Sugar [9]; Todorov [7d];
Weinberg [18]; Feynman et al [39]. Sect.3 provides the theoretical framework with a short derivation
of the BSE-SDE from an input chirally invariant Lagrangian, incorporating the original CIA form
[25,24] of MYTP [26], on the lines of ref.[25] in terms of bilocal fields [27]. It also includes a
derivation [28] of the dynamical mass function m(p) for an understanding of the constituent mass
via Politzer additivity [40]. Sect.4 collects some basic results on the null-plane formalism due to
Leutwyler-Stern [21], especially the role of the ‘Angular Condition’ in ensuring a formal O(3)-like
invariance. From Sect.5 onwards, the focus is on MYTP [26]-orientation for bringing out its unique
property which distinguishes it from most other approaches, viz., the 3D-4D interlinkage of BS
amplitudes.

Sect.5 gives a comparative view of the working of MYTP on the BSE forms in CIA [24] versus
Cov LF [37], and outlines the derivation of the 3D BSE, as well as an explicit reconstruction of
the 4D BS wave function in terms of 3D ingredients, with 3-momentum $\hat q = (q_\perp, q_3)$ where the
third component emerges as a P-dependent one, suitably adapted to the CIA [24] or Cov LF [37]
respectively. Sect.6 gives a corresponding derivation for the 3D-4D interconnection for a qqq BSE
structure under CIA conditions. Sect.7 illustrates, through the calculation of triangle-loop integrals,
the relative advantage of Cov LF [37] over the CIA [24] version of MYTP [26], in producing a well-
defined structure for the pion e.m. form factor in a fully gauge invariant manner, and illustrating in
the process the method of ‘Lorentz-completion’ for explicit Lorentz invariance, with the expected
behaviour at high momentum transfer. MYTP also gives a more general structure of triangle loop integrals for
three-hadron form factors. Sect.8 summarises our conclusions.


2 Quasipotentials And Other 3D Dynamical Equations 

The reduction of the 4D BSE for an N-N pair to the 3D level in the Instantaneous Approximation
was first investigated in the non-adiabatic domain of pseudoscalar meson theory (effect of pair- 
creations included) by Levy [6a], who showed that this 3D BSE form is entirely equivalent to the 
corresponding Tamm-Dancoff equations [2] in the same (non-adiabatic) limit. 

On the other hand, Salpeter [6b] employed the adiabatic approximation (no pair creation effects)
to give a systematic 3D reduction of the fermionic BSE, using projection operators for large and
small components. The adiabatic approximation amounts to replacing the propagator $\Delta_F(x - x')$
for the exchanged meson by

$$\Delta_F(x - x') \Rightarrow \delta(x_0 - x_0')\int_{-\infty}^{\infty}\Delta_F(\mathbf x - \mathbf x', x_0 - x_0')\,d(x_0 - x_0') \qquad (2.1)$$

and simply gives the Yukawa potential between two particles. Similarly, in the Instantaneous
(adiabatic) Approximation, IA for short, the 4D wave function $\Psi(x) = \Psi(\mathbf x, t)$ for relative motion
of two particles becomes simply $\Psi(\mathbf x, 0)$. In the momentum representation, these statements read
respectively as

$$\Delta_F(k) \Rightarrow \Delta_F(\mathbf k); \qquad \psi(\mathbf q) = \int dq_0\,\Psi(\mathbf q, q_0) \qquad (2.2)$$

The Salpeter 3D BSE in the IA for a relativistic hydrogen-atom is [6b]:

$$(E - H_1(\mathbf q) - H_2(\mathbf q))\,\chi(\mathbf q) = \int d^3k\,V(\mathbf k - \mathbf q)\,[\Lambda_{1+}\Lambda_{2+} - \Lambda_{1-}\Lambda_{2-}]\,\chi(\mathbf k) \qquad (2.3)$$

where the 3D wave function $\chi(\mathbf q)$ is related to the corresponding 4D quantity by an equation of the
form (2.2), and the symbols $\Lambda_\pm$ are energy projection operators for the large/small components,
etc.
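
The statement that the instantaneous (static) meson propagator corresponds to a Yukawa potential can be checked numerically. The sketch below assumes a static propagator of the form $1/(\mathbf k^2 + \mu^2)$ for an exchanged meson of mass $\mu$ and the coordinate-space potential $e^{-\mu r}/(4\pi r)$; the normalization, the function name and the numerical parameters are illustrative assumptions, not taken from ref. [6b].

```python
import math


def yukawa_fourier_transform(k, mu, r_max=50.0, n=100_000):
    """3D Fourier transform of exp(-mu*r) / (4*pi*r).

    The angular integration reduces it to the radial integral
    (1/k) * int_0^inf sin(k*r) * exp(-mu*r) dr, evaluated here with the
    trapezoidal rule on [0, r_max] (the tail beyond r_max is negligible).
    """
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.sin(k * r) * math.exp(-mu * r)
    return total * h / k


k, mu = 0.7, 1.0
numeric = yukawa_fourier_transform(k, mu)
static_propagator = 1.0 / (k * k + mu * mu)
print(numeric, static_propagator)
```

The two printed numbers agree to within the accuracy of the quadrature, which is the content of the replacement $\Delta_F(k) \Rightarrow \Delta_F(\mathbf k)$ read in coordinate space.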





2.1 Logunov-Tavkhelidze Quasipotentials

A different form of 3D reduction of the 4D BSE was proposed by Logunov-Tavkhelidze [7a] in the
language of Green’s functions (G-fns) for 2-particle scattering whose momentum representation
may be written as $G(p_1p_2; p_1'p_2')$ (indicating 4-momenta before and after), which satisfies a 4D
BSE [7a]:

$$(2\pi)^8\Delta(p_1)\Delta(p_2)G(p_1p_2;p_1'p_2') = \delta(p_1 - p_1')\delta(p_2 - p_2') + \int dp_1''dp_2''\,K(p_1p_2;p_1''p_2'')\,G(p_1''p_2'';p_1'p_2') \qquad (2.4)$$

where $\Delta(p_1) = p_1^2 + m_1^2$, etc. Expressing this equation in c.m. (P) and relative (q) 4-momenta,
and taking out the $\delta$-fns due to the c.m. motion, this equation simplifies to

$$(2\pi)^4\Delta(p_1)\Delta(p_2)G(q,q';P) = \delta(q-q') + \int dq''\,K(q,q'')\,G(q'',q';P) \qquad (2.5)$$

Next, they defined the 3D G-fn for the relative motion as a double integral w.r.t. the two time-like
momenta:

$$\tilde G(\mathbf q, \mathbf q';P) = \int dq_0\int dq_0'\,G(q, q'; P) \qquad (2.6)$$

Now in operator notation, the 4D BSE (2.5) may be written as $G = G_0 + G_0KG$, from which the
kernel K has the formal representation $G_0^{-1} - G^{-1}$. The L-T trick [7a] now consists in using the
double integrals on the time-like momenta as in eq.(2.6) to formally define the 3D quasipotential
$\tilde K$ as

$$\tilde K = \tilde G_0^{-1} - \tilde G^{-1} \qquad (2.7)$$

which can be expanded perturbatively in the symbolic form [7a]

$$\tilde K = \tilde G_0^{-1}\,\widetilde{G_0KG_0}\,\tilde G_0^{-1} + \tilde G_0^{-1}\,\widetilde{G_0KG_0KG_0}\,\tilde G_0^{-1} - \ldots \qquad (2.8)$$

to any desired order of accuracy, a tilde denoting the double integral (2.6) over the time-like
momenta; [note that the inverse G-fns are just the self-energy operators]. If
$V(\mathbf q, \mathbf q'; E)$ is the quasi-potential to a given order of accuracy, then the BSE satisfied by the 3D
BS wave function $\psi(\mathbf q)$ is of the form [7a]:

$$(E^2 - \mathbf q^2 - m^2)\,\psi(\mathbf q) = \int d^3q'\,V(\mathbf q, \mathbf q'; E)\,\psi(\mathbf q') \qquad (2.9)$$

where the ‘denominator function’ on the LHS arises from integrating $G_0 = \Delta(p_1)^{-1}\Delta(p_2)^{-1}$ w.r.t.
$q_0$ and rearranging.
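
The logic of the quasipotential construction can be mimicked in a finite toy model. In the sketch below, which is purely illustrative, the 4D Green's functions are replaced by 2x2 matrices whose two indices stand for two values of the time-like momentum, and the double integral over time-like momenta is replaced by the sum of all matrix entries. The matrices, the coupling values and all function names are hypothetical choices. The toy model shows that the quasipotential defined through the reduced Green's functions agrees with its lowest-order term up to corrections of second order in the coupling.

```python
def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def reduce_3d(m):
    """Toy stand-in for the double integral over time-like momenta."""
    return sum(sum(row) for row in m)


def exact_quasipotential(g0, k):
    """Inverse reduced G0 minus inverse reduced G, with G = (G0^-1 - K)^-1."""
    g0_inv = inverse(g0)
    g = inverse([[g0_inv[i][j] - k[i][j] for j in range(2)] for i in range(2)])
    return 1.0 / reduce_3d(g0) - 1.0 / reduce_3d(g)


def first_order_quasipotential(g0, k):
    """Lowest-order term: reduced (G0 K G0), sandwiched by inverse reduced G0."""
    return reduce_3d(matmul(matmul(g0, k), g0)) / reduce_3d(g0) ** 2


G0 = [[1.0, 0.0], [0.0, 0.5]]      # hypothetical 'free' Green's function
K0 = [[0.3, 0.1], [0.1, -0.4]]     # hypothetical kernel shape


def gap(coupling):
    k = [[coupling * x for x in row] for row in K0]
    return abs(exact_quasipotential(G0, k) - first_order_quasipotential(G0, k))


# The mismatch shrinks roughly like coupling**2.
print(gap(1e-2), gap(1e-3))
```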


2.1.1 Narrow resonances in charged particle systems 

Within the last decade, the L-T theory [7a] has witnessed some interesting applications [41] to the
understanding of ‘new’ narrow $e^+e^-$ resonances observed in heavy ion collisions [42]. To that end,
the authors [41] have employed an equation of the form (2.9) which reads for this system as [41]:

$$(2w - M)\,\phi(\mathbf p) = \frac{e^2}{(2\pi)^3}\int\frac{d^3p'\,\phi(\mathbf p')}{2w'q\,(M - w - w' - q + i0)} \qquad (2.10)$$

where $w = \sqrt{m^2 + \mathbf p^2}$, and $q = |\mathbf p - \mathbf p'|$. The results indicate a possible interpretation of the
observed peaks [42] as new quasi-stationary levels arising from the solution of the quasi-potential
equation. More interestingly, they also suggest a close relationship of the observed states [38] with
the von Neumann-Wigner [43] levels embedded in the continuum.





2.2 Blankenbecler-Sugar Equation

Another type of quasipotential was proposed by Blankenbecler-Sugar [9], as follows. The 2-particle
scattering amplitude $T(q,q')$ due to a 4D potential $V(q,q')$ in the ladder approximation satisfies
the BSE [9]:

$$T(q,q') = V(q,q') - i(2\pi)^{-4}\int d^4q''\,V(q,q'')\,[m^2 + (P/2 + q'')^2]^{-1}[m^2 + (P/2 - q'')^2]^{-1}\,T(q'',q') \qquad (2.11)$$


where the 2-particle ‘free’ G-fn is exhibited as the product of the two propagators inside the integral
on the RHS. To express this equation in 3D form, the B-S [9] trick consists first in putting q on
the energy shell, which means that $q_0 = 0$ and $\mathbf q^2 = s/4 - m^2$, where $s = -P^2$ is the square of the
c.m. energy. Next, the on-shell part $E_2$ of the free 2-particle G-fn is obtained by taking only the
$\delta$-fn parts of the two propagators which gives rise only to two-particle cuts in the physical region

$$E_2(q'') = 2\pi\int ds'\,(s' - s)^{-1}\,\delta[m^2 + (P'/2 + q'')^2]\,\delta[m^2 + (P'/2 - q'')^2] \qquad (2.12)$$

where $s' = -P'^2$, and $P'$ has only a fourth component. This works out as

$$E_2(q'') = \frac{\pi}{2}\,\delta(q_0'')\left[\sqrt{m^2 + \mathbf q''^2}\,(\mathbf q''^2 - \mathbf q^2)\right]^{-1} \qquad (2.13)$$

The balance $R_2$ of the free G-fn is not singular along the positive cut of the s-variable. If it is
neglected in the first approximation, and only $E_2$ from (2.13) is substituted in (2.11), then after a
trivial integration over $q_0''$, the resultant 3D equation has the form


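$$T(\mathbf q, \mathbf q') = V(\mathbf q, \mathbf q') + \frac{1}{4}\int\frac{d^3q''}{(2\pi)^3}\,\frac{V(\mathbf q, \mathbf q'')\,T(\mathbf q'', \mathbf q')}{\sqrt{m^2 + \mathbf q''^2}\,(\mathbf q''^2 - \mathbf q^2)} \qquad (2.14)$$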


A comparison of (2.9) and (2.14) shows that although both equations are formally 3D in looks,
there is a vast difference in their contents: the L-T [7a] form (2.9) involves only 3-momenta $\mathbf q$,
since the Hilbert space has been ‘truncated’ by integrating out over their fourth components.
The B-S [9] form (2.14) on the other hand has 4-momenta formally throughout (no truncation of
Hilbert space), except that they are on their mass shells. Thus formal covariance is violated in
both equations, although in different ways.


2.3 Kadychevsky-Todorov Equation 


Still another form of 3D (Lippmann-Schwinger) equation was given by Todorov [7d], following the
Covariant method of Kadychevsky [8]. In the Todorov approach [7d], the potential $V_w$ is defined
as an infinite power series in the coupling constant which fits the perturbative expansion of the
scattering amplitude for two particles of masses $m_1, m_2$ and 4-momenta $p_1, p_2$ and $q_1, q_2$ before
and after respectively. The quantity $T_w$ in the off-shell regime satisfies the L-S equation [7d]

$$T_w(\mathbf p, \mathbf q) + V_w(\mathbf p, \mathbf q) + \frac{1}{\pi^2}\int d^3k\,\frac{V_w(\mathbf p, \mathbf k)\,T_w(\mathbf k, \mathbf q)}{\mathbf k^2 - b^2(w) - i\epsilon} = 0 \qquad (2.15)$$

where the 3D quantities in the c.m. frame are defined as

$$\mathbf p_1 = -\mathbf p_2 = \mathbf p; \qquad \mathbf q_1 = -\mathbf q_2 = \mathbf q \qquad (2.16)$$


and on the energy shell, the corresponding time-like quantities are

$$p_{10} + p_{20} = w = q_{10} + q_{20}; \qquad \mathbf p^2 = \mathbf q^2 = b^2(w); \qquad 4w^2b^2(w) = \lambda(w^2, m_1^2, m_2^2) \qquad (2.17)$$

This equation too has strong resemblance to the L-T equation [7a]. The corresponding equation
for the bound state wave function $\phi(\mathbf p)$ is

$$(\mathbf p^2 - b^2(w))\,\phi(\mathbf p) = -\frac{1}{\pi^2}\int d^3k\,V(\mathbf p, \mathbf k)\,\phi(\mathbf k) \qquad (2.18)$$

Both B-S [9] and Todorov [7d] equations have been extensively employed in the literature.
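
The on-shell relation between the c.m. energy w and the squared c.m. momentum $b^2(w)$ can be verified directly. The sketch below assumes that $\lambda$ in (2.17) is the standard triangle (Källén) function $\lambda(x,y,z) = x^2 + y^2 + z^2 - 2xy - 2yz - 2zx$; the function names and the sample masses are illustrative.

```python
import math


def kallen(x, y, z):
    """Triangle function lambda(x, y, z) = x^2 + y^2 + z^2 - 2xy - 2yz - 2zx."""
    return x * x + y * y + z * z - 2.0 * (x * y + y * z + z * x)


def b_squared(w, m1, m2):
    """Squared c.m. momentum from 4 w^2 b^2(w) = lambda(w^2, m1^2, m2^2)."""
    return kallen(w * w, m1 * m1, m2 * m2) / (4.0 * w * w)


# Hypothetical values: total c.m. energy 3, masses 1 and 0.5 (arbitrary units).
w, m1, m2 = 3.0, 1.0, 0.5
b2 = b_squared(w, m1, m2)

# With |p|^2 = b^2(w), the two on-shell energies add up to w.
total_energy = math.sqrt(m1 * m1 + b2) + math.sqrt(m2 * m2 + b2)
print(b2, total_energy)
```

At threshold, $w = m_1 + m_2$, the same function gives $b^2 = 0$, as it should for two particles at rest in the c.m. frame.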





2.4 Infinite-Momentum Frame: Weinberg Equation 

Weinberg [18] observed some remarkable simplifications that occur when the results of old-fashioned
perturbation theory are expressed in a reference frame in which the total 3-momentum $\mathbf P$ is very
large. In this limit, the 3-momentum $\mathbf p_n$ of the n-th particle may be projected parallel and
perpendicular to $\mathbf P$, and the results collected as follows:

$$\mathbf p_n = \eta_n\mathbf P + \mathbf q_n; \qquad \mathbf q_n\cdot\mathbf P = 0; \qquad \sum_n\eta_n = 1; \qquad \sum_n\mathbf q_n = 0 \qquad (2.19)$$

The quantity $\eta_n > 0$ in this theory plays the role of ‘mass’ of the n-th particle (in a 3D
Schroedinger-type equation), and in the $P \to \infty$ limit, the rules of calculation become very simple:
all old-fashioned perturbative diagrams passing through negative energy intermediate states van-
ish, while for the contributing diagrams, the propagator for an intermediate state c in a transition
from a to b has the form $2[s_a - s_c + i\epsilon]^{-1}$, where s for any state is the usual total c.m. energy
squared:

$$s = \sum_n[\mathbf q_n^2 + m_n^2]/\eta_n; \qquad s = s_a, s_b, s_c, \ {\rm etc.} \qquad (2.20)$$

Momentum conservation at each vertex is 3D in content:

$$(2\pi)^3\,\delta\Big(\Delta\sum\eta\Big)\,\delta^2\Big(\Delta\sum\mathbf q\Big) \qquad (2.21)$$

in accordance with the conservation of $\eta$ and $\mathbf q$, eq.(2.19). The Weinberg counterpart of the L-T
[7a], B-S [9] and Todorov [7d] equations (2.9), (2.14) and (2.18) respectively, is the integral equation
[18]


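$$\langle\mathbf q'\eta'|T|\mathbf q\eta\rangle = \langle\mathbf q'\eta'|V|\mathbf q\eta\rangle + \int d^2q''\,d\eta''\,\frac{\langle\mathbf q'\eta'|V|\mathbf q''\eta''\rangle\,\langle\mathbf q''\eta''|T|\mathbf q\eta\rangle}{2(2\pi)^3[s\eta''(1 - \eta'') - \mathbf q''^2 - m^2 + i\epsilon]} \qquad (2.22)$$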


Although this equation is effectively 3D, and has considerable similarity to the corresponding
equations of [7a,7d,9] above, there is a big difference, viz., the angular momentum is no longer a
well-defined concept in this 3D description. This gap was bridged later by Leutwyler-Stern [21] by
invoking the ‘angular condition’ [21], after it became clear that the Weinberg approach is equivalent
to Dirac’s [20] null-plane dynamics; see Sect.4 below.
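
The role of the variables $(\eta_n, \mathbf q_n)$ can be illustrated by checking numerically that the invariant $(\sum_n E_n)^2 - \mathbf P^2$ of a set of free particles tends to the expression (2.20) as $P \to \infty$. The sketch below is only an illustration under stated assumptions: the particles are taken on-shell with momenta of the form (2.19), and the function names and sample values are hypothetical.

```python
import math


def s_infinite_momentum(etas, q_perps, masses):
    """s = sum_n (q_n^2 + m_n^2) / eta_n, as in eq. (2.20)."""
    return sum((qx * qx + qy * qy + m * m) / eta
               for eta, (qx, qy), m in zip(etas, q_perps, masses))


def s_finite_frame(etas, q_perps, masses, big_p):
    """(sum of energies)^2 - P^2 for momenta p_n = eta_n * P + q_n.

    Assumes sum(eta_n) = 1 and sum(q_n) = 0, so the total 3-momentum is P.
    """
    total_energy = sum(
        math.sqrt((eta * big_p) ** 2 + qx * qx + qy * qy + m * m)
        for eta, (qx, qy), m in zip(etas, q_perps, masses))
    return total_energy ** 2 - big_p ** 2


# Hypothetical two-particle state.
etas = [0.3, 0.7]
q_perps = [(0.4, 0.0), (-0.4, 0.0)]
masses = [1.0, 1.0]

limit = s_infinite_momentum(etas, q_perps, masses)
for big_p in (10.0, 100.0, 1000.0):
    print(big_p, s_finite_frame(etas, q_perps, masses, big_p) - limit)
```

The printed differences fall off like $1/P^2$, so the combination $\sum_n(\mathbf q_n^2 + m_n^2)/\eta_n$ is what survives in the infinite-momentum limit.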


2.5 The FKR Model For 2- And 3-Quark Dynamics 

Before ending this Section, we draw attention for historical reasons, to a unique paper by Feynman
and collaborators [39], FKR for short, which gave an integrated view of 2- and 3-quark hadron
dynamics, and played a big role in shaping the direction of strong interaction physics to come.
The importance of the FKR approach stems, among other things, from the fact that these authors
were the first to show the way to a unified treatment of both 2- and 3-quark hadrons within a
common dynamical framework, which was to serve as a model for the future. This paper effectively
incorporated all the relevant aspects of quark dynamics that had been generated piecemeal in the
Sixties, and had by and large come to be accepted, viz., the group structure SU(6) × O(3), the
symmetrical quark model, and harmonic oscillator classification of hadron states (based on their
linear $M^2$ plots) on the one hand [44], and the mechanism of single-quark transitions, quark
recoil effects, etc., on the other [45].

The FKR model, which made essential use of harmonic confinement, sought to give a relativistic
meaning to the internal motion of light quarks through the following definitions of 2- and 3-quark
Hamiltonians [39, 38]:

$$-K_M = 2(p_1^2 + p_2^2) + \frac{1}{16}\Omega^2(x_1 - x_2)^2 + {\rm Const} = (P^2 + M_M^2) \qquad (2.23)$$

$$-K_B = 3(p_1^2 + p_2^2 + p_3^2) + \frac{1}{36}\Omega^2\sum_{123}(x_1 - x_2)^2 + {\rm Const} = (P^2 + M_B^2) \qquad (2.24)$$

where $x_{1\mu} = i\partial_{1\mu}$; $p_1^2 = p_{1\mu}^2$, etc. The quantity $\Omega$, which is postulated to be the same for both
systems, has the significance of the universal Regge slope ($\approx 1\,{\rm GeV}^2$) as observed [44] in their
respective spectra; [note the geometrical factors as coefficients in front of the respective kinetic
and potential terms above]. The operators $K_{M,B}^{-1}$ are the ‘free’ propagators (albeit with h.o.
confinement) for the mesons and baryons, whose ‘poles’ correspond to the eigenvalues (spectra) of
their squared masses. The presence of a perturbation $\delta K$ can be simulated in a standard gauge-
invariant manner, by the substitutions $p_\mu \to p_\mu - eV_\mu$ or $p_\mu \to p_\mu - g\gamma_5A_\mu$ for vector and axial
vector couplings respectively, after rewriting $p^2$ as $(\gamma\cdot p)^2$, while the $i \to j$ transition amplitudes
are just $\langle h_j|\delta K|h_i\rangle$, by standard rules of quantum mechanics.

A major achievement of the FKR model was its success in giving two distinct types of unifi-
cation, viz., i) a common framework for spectroscopy and transition amplitudes; and ii) a unified
dynamical treatment of $q\bar q$ and qqq hadrons. Both these features represented landmarks in a
dynamical understanding of the quark model, yet the FKR model failed on another count: the
‘wrong’ sign of the time-like momenta in the gaussian wave functions for the hadrons was a disease
which pointed to an asymmetric role of time-like (1D) momenta vis-a-vis the space-like (3D) ones.
Attempts to cure this disease by a Euclidean treatment (via Wick rotation) [46a] failed on the
spectroscopy front [13] which reveals only O(3)-like spectra, while other non-covariant treatments
[46b] were not very successful either. Nevertheless the lessons from the FKR model were significant
pointers to the need to treat the 1D time-like and 3D space-like d.o.f.’s on different footings in a
future quest for a covariant theory [24-26].

3 Self-Energy And Vertex Fns Under MYTP 

As a first step towards introducing the MYTP [26] theme, we collect in this Section some essential
machinery for the interconnection between self-energy and vertex functions via Schwinger-Dyson
(SDE) and Bethe-Salpeter (BSE) equations, starting from a chirally invariant Lagrangian char-
acterized by a vector-type interaction [12] as a prototype for a gluon-exchange propagator in the
non-perturbative QCD regime [28]. To that end, we shall first outline the method of bilocal fields
[27] to derive the equations of motion (SDE and BSE) from such a Lagrangian, following the Per-
vushin Group’s [25] bilocal field method, under MYTP [26] conditions of covariant instantaneity
[24]. This will be followed by a general result connecting self-energy and pion-quark vertex func-
tions in the chiral limit, i.e., when the pion mass vanishes. This result in turn paves the way to a
derivation [28], under MYTP [26] conditions of Covariant Instantaneity [24], of the mass function
m(p) whose low momentum limit is the main contributor to the constituent mass, via Politzer
additivity [40].

3.1 Method of Bilocal Fields for BSE-SDE 

The effective action for a system of two interacting massless fermions constrained by MYTP [26]
is given by [25]

$$W_{eff}[\psi, \bar\psi] = \int d^4x\,\Big[\bar\psi(x)(i\gamma\cdot\partial - m_0)\psi(x) + \frac{1}{2}\int d^4y\,(\psi(y)\bar\psi(x))\,K(z^\perp, X)\,(\psi(x)\bar\psi(y))\Big] \qquad (3.1)$$

where $z = x - y$; $X = (x + y)/2$. $z^\perp$ is the component of z transverse to the P-direction. Now
redefine the action (3.1) in terms of bilocal fields M via the Legendre transformation [27] on the
second term to give

$$-\frac{1}{2}\int d^4x\,d^4y\,M(x,y)\,K^{-1}(z^\perp, X)\,M(x,y) + \int d^4x\,d^4y\,(\psi(x)\bar\psi(y))\,M(x,y) \qquad (3.2)$$

Then in an obvious short-hand notation [25b], (3.1) may be written as

$$W_{eff}[M] = (\psi\bar\psi, (-G_0^{-1} + M)) - \frac{1}{2}(M, K^{-1}M) \qquad (3.3)$$






where $G_0$ is the inverse Dirac operator for the free fermion field. After quantization over $N_c$ fermion
fields and normal ordering, the action takes the form [25b]

$$W_{eff}[M] = -\frac{1}{2}N_c(M, K^{-1}M) + iN_c\sum_{n=1}^{\infty}\frac{1}{n}\Phi^n \qquad (3.4)$$

where $\Phi = G_0M$ is a matrix in (x,y) space, and its successive powers are defined in the standard
matrix fashion. Now for the quantization of the action (3.4), its minimum is given by

$$N_c^{-1}\frac{\delta W_Q(M)}{\delta M} = -K^{-1}M + \frac{i}{G_0^{-1} - M} = 0. \qquad (3.5)$$
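
The passage from a power series in $\Phi = G_0M$ to the resolvent $(G_0^{-1} - M)^{-1}$ in the stationarity condition can be checked in a scalar toy version. The sketch below treats $G_0$ and M as ordinary numbers (an assumption made purely for illustration; in the text they are matrices in (x,y) space) and assumes a series of the form $\sum_n\Phi^n/n$; the function names and sample values are hypothetical.

```python
def series_action(m_field, g0, n_terms=200):
    """Truncated sum_{n>=1} (g0 * m_field)**n / n for scalar stand-ins."""
    phi = g0 * m_field
    return sum(phi ** n / n for n in range(1, n_terms + 1))


def resolvent(m_field, g0):
    """Closed form 1 / (1/g0 - m_field) expected for the derivative."""
    return 1.0 / (1.0 / g0 - m_field)


def numerical_derivative(m_field, g0, step=1e-5):
    """Central finite difference of the truncated series w.r.t. m_field."""
    return (series_action(m_field + step, g0)
            - series_action(m_field - step, g0)) / (2.0 * step)


# Hypothetical values with |g0 * m_field| < 1 so that the series converges.
g0, m_field = 0.5, 0.8
print(numerical_derivative(m_field, g0), resolvent(m_field, g0))
```

Term-by-term differentiation gives $G_0\sum_n\Phi^{n-1} = (G_0^{-1} - M)^{-1}$, which is the structure of the second term in the stationarity condition.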


The corresponding ‘classical’ (lowest order) solution for the bilocal field is $\Sigma(x - y)$ which depends
only on the difference $x - y$ due to the translation invariance of the vacuum solutions. Next expand
the action (3.4) around the point of minimum $M = \Sigma + M'$, and denote the small fluctuations $M'$
as a sum over the complete set of ‘classical’ solutions $\Gamma$. Then in the next order of extremum,
we have:

$$\frac{\delta^2W_Q(\Sigma + M')}{\delta M^2}\Big|_{M'=0}\,\Gamma = 0 \qquad (3.6)$$


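Eqs. (3.5-6) give respectively the SDE for $\Sigma$ and BSE for $\Gamma$:

$$\Sigma(x - y) = m_0\delta^4(x - y) + iK(z^\perp, X)\,G_\Sigma(x - y);$$
$$\Gamma = iK(z^\perp, X)\int d^4z_1\,d^4z_2\,G_\Sigma(x - z_1)\,\Gamma(z_1, z_2)\,G_\Sigma(z_2 - y) \qquad (3.7)$$

which describe the spectrum of the fermions and composites respectively. In momentum space
these equations for the mass operator and vertex function are

$$\Sigma(p) = m_0 + i\int\frac{d^4q}{(2\pi)^4}\,V(p^\perp - q^\perp)\,\gamma_\mu G_\Sigma(q)\gamma_\mu; \qquad \eta_\mu = P_\mu/|P| \qquad (3.8)$$

$$\Gamma(q^\perp) = i\int\frac{d^4q'}{(2\pi)^4}\,V(q^\perp - q'^\perp)\,\gamma_\mu[G_\Sigma(q' + P/2)\,\Gamma(q'^\perp)\,G_\Sigma(q' - P/2)]\gamma_\mu \qquad (3.9)$$

where $G_\Sigma(q) = (\gamma\cdot q - \Sigma(q^\perp))^{-1}$; V is the scalar part of the kernel K with 3D support; $k^\perp$ is the
transverse part of k w.r.t. the direction $\eta_\mu$ of the total 4-momentum $P_\mu$.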


3.2 Self-Energy vs Vertex Fn in Chiral Limit 

The formal equivalence of the mass-gap equation (3.8) and the BSE (3.9) for a pseudoscalar meson
in the chiral limit [12] will now be demonstrated by adapting them to a non-perturbative gluon
exchange propagator [28] with an arbitrary confining form D(k) (not just the perturbative form
$k^{-2}$). The SDE, eq.(3.8), after replacing the color factor $\lambda_1\cdot\lambda_2/4$ by its Casimir value 4/3, and a
relabelling of symbols [28], now reads

$$\Sigma(p) = \frac{4}{3}i(2\pi)^{-4}\int d^4k\,D(k)\,\gamma_\mu S'_F(p - k)\gamma_\mu; \qquad (3.10)$$

$S'_F$ is the full propagator related to the mass operator $\Sigma(p)$ by

$$\Sigma(p) + i\gamma\cdot p = S'^{-1}_F(p) = A(p^2)[i\gamma\cdot p + m(p^2)] \qquad (3.11)$$

thus defining the mass function $m(p^2)$ in the chiral limit $m_c = 0$. In the same way, eq.(3.9) for
the vertex function $\Gamma_H$ of a $q\bar q$ hadron (H) of 4-momentum $P_\mu$ made up of quark 4-momenta
$p_{1,2} = P/2 \pm q$ reads as

$$\Gamma_H(q, P) = -\frac{4}{3}i(2\pi)^{-4}\int d^4q'\,D(q - q')\,\gamma_\mu S'_F(q' + P/2)\,\Gamma_H(q', P)\,S'_F(q' - P/2)\gamma_\mu \qquad (3.12)$$





The complete equivalence of (3.10) and (3.12) for the pion case in the chiral limit $P_\mu \to 0$ is easily
established. Indeed, with the self-consistent ansatz $\Gamma_H = \gamma_5\Gamma(q)$, eq.(3.12) simplifies to

$$\Gamma(q) = \frac{4}{3}i(2\pi)^{-4}\int d^4k\,\gamma_\mu S'_F(k - q)\,\Gamma(q - k)\,S'_F(q - k)\gamma_\mu\,D(k) \qquad (3.13)$$

where the replacement $q' = q - k$ has been made. Substitution for $S'_F$ from (3.11) in (3.13) gives

$$\Gamma(p) = \frac{16}{3}i(2\pi)^{-4}\int d^4k\,\frac{D(k)\,\Gamma(p - k)}{A^2(p - k)\,(m^2((p - k)^2) + (p - k)^2)} \qquad (3.14)$$

where we have relabelled $q \to p$. On the other hand substituting for $S'_F$ from (3.11) in (3.10) gives for
the mass term of $\Sigma(p)$ the result

$$A(p^2)\,m(p^2) = \frac{16}{3}i(2\pi)^{-4}\int d^4k\,\frac{D(k)\,A(q')\,m(q'^2)}{A^2(q')\,(m^2(q'^2) + q'^2)} \qquad (3.15)$$

where $q' = p - k$. A comparison of (3.14) and (3.15) shows their equivalence with the identification
$\Gamma(q) = A(q)\,m(q^2)$, i.e. the identity of the vertex and mass functions in the chiral limit, provided
A = 1 (which corresponds to the Landau gauge; see [32]). Although obtained here in the context
of MYTP [26] this result is independent of this ansatz. A more explicit gauge theoretic derivation
of the equations for the self-energy and vertex functions is given in [32].
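
The Dirac algebra behind the reduction of the vertex and mass-gap equations to scalar form can be checked with explicit matrices. The sketch below assumes a Euclidean-type convention in which $\{\gamma_\mu, \gamma_\nu\} = 2\delta_{\mu\nu}$ (so that $(\gamma\cdot q)^2 = q^2$), takes A = 1, and uses one particular chiral representation; these choices, and all names in the code, are illustrative assumptions rather than the conventions of refs. [28,32].

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Hermitian gamma matrices obeying {gamma_mu, gamma_nu} = 2 delta_mu_nu.
GAMMA = ([blocks(Z2, scale(-1j, s), scale(1j, s), Z2) for s in PAULI]
         + [blocks(Z2, I2, I2, Z2)])
I4 = blocks(I2, Z2, Z2, I2)
GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))


def inv_propagator(q, m):
    """i*gamma.q + m, the inverse propagator of the form (3.11) with A = 1."""
    out = scale(m, I4)
    for q_mu, g in zip(q, GAMMA):
        out = add(out, scale(1j * q_mu, g))
    return out


q, m = [0.3, -0.5, 0.2, 0.7], 0.4          # hypothetical momentum and mass
minus_q = [-x for x in q]
q_squared = sum(x * x for x in q)

# (i gamma.q + m)(-i gamma.q + m) = (m^2 + q^2) * 1
print(max_abs_diff(matmul(inv_propagator(q, m), inv_propagator(minus_q, m)),
                   scale(m * m + q_squared, I4)))
```

Together with $\gamma_5(i\gamma\cdot q + m) = (-i\gamma\cdot q + m)\gamma_5$ and $\gamma_\mu\gamma_\mu = 4$, this identity is what turns the product of two propagators around a $\gamma_5$ vertex into the scalar denominator $m^2 + q'^2$.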


3.3 Dynamical Mass As DBχS Solution of SDE

We end this Section with the definition of the 'dynamical' mass function of the quark in the chiral
limit ($M_\pi = 0$) of the pion-quark vertex function $\Gamma(\hat q)$, in the 3D-4D BSE framework [24,28]. The
logic of this follows from the BSE-SDE formalism outlined above, eqs.(3.10-15), for the connection
between eq.(3.15) for $m(p)$ and eq.(3.14) for $\Gamma(q)$ in the limit of zero mass of the pseudoscalar. So,
setting $M = 0$ in the (unnormalized) $Hq\bar q$ vertex function $\Gamma_H$, this quantity may be identified with
the mass function $m(\hat p)$, in the limit $P_\mu = 0$, where $p_\mu$ is the 4-momentum of either quark; (note
the appearance of the 'hatted' momentum). The result is [28,32]

$$m(\hat p) = [\omega(\hat p);\ \sqrt{2}\,p.n]\ \frac{m_q^2 + \hat p^2}{m_q^2}\ \phi(\hat p) \tag{3.16}$$

under CIA and Cov LF respectively. The normalization is such that in the low momentum limit,
the constituent ud mass $m_q$ is recovered under CIA [28], while the corresponding 'mass' under Cov
LF is $p_+$ [32].


4 Null-Plane Preliminaries 

The Weinberg infinite momentum method received a major boost through an understanding of
Bjorken scaling [47a] in deep inelastic scattering, as well as of the Feynman parton picture [47b] in
the same process. The similarity of the $P = \infty$ and the null-plane descriptions became clear with
the demonstration by Susskind [48] of the $U(2)$ structure of the LF/NP language, wherein the role
of 'mass' is played by the combination $p_+ = p_0 + p_3$, and subsequently with a more complete formulation
of null-plane dynamics by Kogut and Soper [49] within the Hamiltonian formalism in the context
of field theory.

In a different direction, efforts were made to extend the Lorentz contraction ideas to finite
momentum frames, designed to bring out the effect of Lorentz contraction on cluster form factors
as a result of motion [50]. In this respect, the role of the Breit frame is particularly interesting
since it gives the best possible overlap [50b] between the initial and final clusters. The Lorentz
contraction factors [50] in turn are the key to an understanding of 'dimensional scaling' [51],
especially in a 'symmetrized' version [50d] of the Breit frame [50b] which exactly reproduces the
correct 'power counting' [51]. And the Weinberg result [18] is duly recovered in the $P = \infty$ limit,
giving rise to the null-plane dynamics; (for more details of these results, see [38]). A more complete
formulation of LF/NP dynamics, albeit with a finite number of degrees of freedom, was given by
Leutwyler-Stern [21], which comes closest to the original Dirac [20] spirit, and is summarized below.

4.1 Leutwyler-Stern Formalism 

Leutwyler-Stern [21] employed a Hamiltonian approach for investigating the properties of a
relativistic 2-body system with a finite number of d.o.f.'s, postulating a general criterion of
'covariance' in the form of an operator relation among the mass and spin operators of the
system. Their formalism is based on the maximum 'stability group' (in the Dirac [20] sense) for the
null-plane surface $x_0 + x_3 = 0$, which gives rise to the following seven 'kinematical' generators [38]:

$$\mathcal{K} = (P_1, P_2, P_+, E_1, E_2, K_3, J_3); \quad 2P_\pm = P_0 \pm P_3; \tag{4.1}$$

$$2E_1 = K_1 + J_2; \quad 2E_2 = K_2 - J_1; \quad J_i = \tfrac{1}{2}\epsilon_{ijk}M_{jk}; \quad K_i = M_{0i}.$$

The elements of $\mathcal{K}$ form a closed algebra ($r, s = 1, 2$):

$$[K_3, E_r] = -iE_r; \quad [K_3, P_+] = -iP_+; \quad [J_3, E_r] = +i\epsilon_{rs}E_s; \tag{4.2}$$

$$[J_3, P_r] = +i\epsilon_{rs}P_s; \quad [E_r, P_s] = -i\delta_{rs}P_+$$

On the other hand, there are 3 'Hamiltonians' which can be chosen in one of several different ways.
To that end, it is necessary to introduce certain rotation operators $\mathbf{I}$ defined by $\mathbf{I}\,|n\rangle = \mathbf{J}\,|n\rangle$
on a rest system $|n\rangle$, but extended to states $|p,n\rangle$ defined by

$$|p,n\rangle = \exp[-i\beta_1E_1 - i\beta_2E_2 - i\beta_3K_3]\,|n\rangle; \quad \beta_r = p_r/p_+; \quad \beta_3 = \ln(2p_+/M) \tag{4.3}$$

such that $\mathbf{I}$ commutes with the algebra of $\mathcal{K}$. More explicitly,

$$I_i\,|p,n\rangle = \exp[-i\beta_1E_1 - i\beta_2E_2 - i\beta_3K_3]\,J_i\,|n\rangle; \tag{4.4}$$

$$[I_i, I_j] = i\epsilon_{ijk}I_k; \quad [I_i, \mathcal{K}] = 0.$$

In particular,

$$I_3 = J_3 + (E_1P_2 - E_2P_1)/P_+ = (W_0 + W_3)/P_+; \tag{4.5}$$

where $W_\mu$ is the Pauli-Lubanski operator, and $MI_r = W_r - P_rW_+/P_+$, where $r = 1, 2$. One thus arrives
at the 'dynamical group' $D = \{M, \mathbf{I}\}$, or $(M^2, M\mathbf{I})$, which has the structure of $U(2)$ [48], because
of (4.4). $D$ is really a 3-member group, since $I_3$ already belongs to $\mathcal{K}$ by virtue of (4.5). For a
particular significance of $\mathbf{I}$, note its connection with the non-relativistic Galilei-invariant algebra
generated by the momentum $\mathbf{P}$ and Galilei boosts $\mathbf{K}$, viz.,

$$\mathbf{I} = \mathbf{J} + m^{-1}\,\mathbf{K} \times \mathbf{P}; \quad (m = \text{mass}) \tag{4.6}$$

Now Galilei invariance of a system is equivalent to the condition that the corresponding dynamical
algebra constitute a unitary representation of $U(2)$. In the relativistic case, there is a superficial
similarity to the $U(2)$ structure of $D$, but unlike the NR case, only the component $I_3$ is now
'kinematical', by virtue of (4.5), while $I_{1,2}$ are 'dynamical', and do not have explicit representations
by themselves. L-S [21] sought to bridge this gap by imposing the 'covariance' requirement in the
form of an 'angular condition' for the operators $I_i$ as follows:

XiMIi + X 2 MI 3 + Xi J3; n = Piu -1- P 2 X 3 + P+x- 
which can be shown to be valid on the null-plane $x_+ = x_0 + x_3 = 0$ [38].


(4.7) 





The L-S formalism [21] provides a compact support to the longitudinal momentum $z$ of a
2-particle system with constituent 4-momenta $p_{1,2}$:

$$2zP_+ = p_{1+} - p_{2+}; \quad P_+ = p_{1+} + p_{2+}; \tag{4.8}$$

The internal wave function $\phi$ is defined by

$$\langle p_1, p_2\,|\,P, \phi\rangle = 2P_+\,\delta^3(p_1 + p_2 - P)\,\phi(q_\perp, z) \tag{4.9}$$

where $\phi$ is a matrix in helicity space $\{h\}$, with the norm:

$$\langle\phi|\phi\rangle = \int \frac{d^2q_\perp\,dz}{\frac{1}{4} - z^2} \sum_{h'h''} |\phi_{h'h''}(q_\perp, z)|^2 \tag{4.10}$$


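The L-S [21] structure formally allows the introduction of a 3-vector $\mathbf{q}$ and angular momentum $\mathbf{L}$
for the internal motion of a composite of mass $M$ with (equal) constituent masses $m_q$ as

$$\mathbf{q} = (q_\perp, Mz); \quad \mathbf{L} = -i\,\mathbf{q} \times \nabla_q \tag{4.11}$$

with the phase space

$$\frac{d^2q_\perp\,dz}{\frac{1}{4} - z^2} = \frac{4\,d^3q}{M}; \quad M^2 = 4(m_q^2 + \mathbf{q}^2). \tag{4.12}$$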
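
The change of variable from $z$ to $q_3 = Mz$ that underlies the phase space (4.12) can be illustrated numerically. The sketch below is an illustration only: it assumes $|z| < 1/2$, as implied by (4.8), and solves $M^2 = 4(m_q^2 + q_\perp^2 + M^2z^2)$ for $M$ at fixed $q_\perp$ and $z$; the function names are hypothetical. Under these assumptions the Jacobian works out to $dq_3/dz = M/(1 - 4z^2)$, which is compared with a central finite difference.

```python
import math

def composite_mass(q_perp, z, m_q):
    # Solve M^2 = 4 (m_q^2 + q_perp^2 + M^2 z^2) for M (assumes |z| < 1/2).
    return 2.0 * math.sqrt(m_q**2 + q_perp**2) / math.sqrt(1.0 - 4.0 * z**2)

def q3(q_perp, z, m_q):
    # Third component of the effective 3-vector, q_3 = M z, as in (4.11).
    return composite_mass(q_perp, z, m_q) * z

def jacobian_numeric(q_perp, z, m_q, h=1e-6):
    # Central finite difference for d q_3 / d z at fixed q_perp.
    return (q3(q_perp, z + h, m_q) - q3(q_perp, z - h, m_q)) / (2.0 * h)

def jacobian_closed_form(q_perp, z, m_q):
    # Closed form d q_3 / d z = M / (1 - 4 z^2).
    return composite_mass(q_perp, z, m_q) / (1.0 - 4.0 * z**2)

if __name__ == "__main__":
    args = (0.4, 0.2, 0.3)
    print(jacobian_numeric(*args), jacobian_closed_form(*args))
```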


With these definitions of $\mathbf{q}$ and $\mathbf{L}$, the theory formally preserves the concept of L-invariance of a $q\bar q$
system despite the apparently asymmetric treatment meted out to the transverse and longitudinal
components of 3-momenta in the NP or $P = \infty$ formalism [18, 25, 48-9]. This invariance can
be traced to the angular condition (4.7). Incidentally, the historic FKR model [39], despite its other
defects, was found by L-S [21] to satisfy the angular condition (4.7).

An alternative but more pedagogical recipe to achieve the same end was given in [38] via the
simpler condition $P.q = 0$, to be consistently imposed between the total ($P_\mu$) and relative ($q_\mu$)
4-momenta of a $q\bar q$ system, as outlined in subsection 4.2 below [38].


4.2 An Alternative "Angular Condition" P.q = 0

For unequal masses $m_{1,2}$ of the (quark) constituents with 4-momenta $p_{1,2}$, the total ($P$) and relative
($q$) 4-momenta are given by the Wightman-Gaerding definitions [52]

$$p_{1,2} = \hat m_{1,2}P \pm q; \quad 2\hat m_{1,2} = 1 \pm (m_1^2 - m_2^2)/M^2 \tag{4.13}$$

where $M = \sqrt{-P^2}$ is the composite mass. The condition $P.q = 0$ is satisfied on the mass shells
$m_{1,2}^2 + p_{1,2}^2 = 0$ of the respective constituents, by virtue of the Wightman-Gaerding definition
(4.13).

To link the condition $P.q = 0$ with the construction of an effective 3-vector $\hat q$ in the null-plane
language, so as to preserve the invariance of the angular momentum concept, note that this
condition translates to the relation $q_- = -q_+M^2/P_+^2$, which expresses the component $q_-$ in terms of
$q_+$, in a frame $P_\perp = 0$, since in this frame, $P_+P_- = M^2$ on the mass shell of the composite. [The
collinearity condition is not a restriction for a two-body system]. This relation then allows a
definition of the 3-momentum $\hat q$ with the components $(q_\perp, q_3)$, with $q_3 = Mq_+/P_+$, which preserves the
meaning of $\mathbf{L}$ in the sense of L-S [21], together with NP covariance. For any other internal 4-vector
$A_\mu$ for the composite, a similar 3-vector $\hat A$ may be defined as $(A_\perp, A_3)$, with $A_3 = MA_+/P_+$, via
the condition $A.P = 0$. Examples of $A_\mu$ are polarization vectors, Dirac matrices, etc. Using these
techniques, null-plane wave functions of the L-S type [21] have been constructed and applied to
hadronic processes via quark loops [34]. A more formal mathematical basis for this prescription
comes from MYTP [26] on the covariant null-plane [37]; see Sect. 5 below.
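
A short numerical illustration of the two statements above, namely that $P.q = 0$ holds on the constituent mass shells and that it then implies $q_- = -q_+M^2/P_+^2$ in a frame with $P_\perp = 0$. This is a sketch under stated assumptions: 4-vectors are stored as $(a_1, a_2, a_3, a_0)$ with the metric $a.b = \mathbf{a}.\mathbf{b} - a_0b_0$ (so that $P^2 = -M^2$), the light-front components are taken as $a_\pm = a_0 \pm a_3$, and all function names are hypothetical.

```python
import math

def dot(a, b):
    # a = (a1, a2, a3, a0); space-like components positive, so P.P = -M^2
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

def on_shell(p3, m):
    # Put a 3-momentum on the mass shell m^2 + p^2 = 0 (positive energy).
    return (p3[0], p3[1], p3[2], math.sqrt(m * m + sum(c * c for c in p3)))

def wightman_garding(p1, p2, m1, m2):
    # Total and relative momenta with the mass fractions of eq. (4.13).
    P = tuple(a + b for a, b in zip(p1, p2))
    M2 = -dot(P, P)
    mh1 = 0.5 * (1.0 + (m1 * m1 - m2 * m2) / M2)
    mh2 = 0.5 * (1.0 - (m1 * m1 - m2 * m2) / M2)
    # Inverting p1 = mh1 P + q, p2 = mh2 P - q gives q = mh2 p1 - mh1 p2.
    q = tuple(mh2 * a - mh1 * b for a, b in zip(p1, p2))
    return P, q, M2

def plus_minus(a):
    # Light-front components a_+ = a_0 + a_3, a_- = a_0 - a_3.
    return a[3] + a[2], a[3] - a[2]

if __name__ == "__main__":
    m1, m2 = 0.3, 0.5
    # Opposite transverse momenta, so that P_perp = 0 (collinear frame).
    p1 = on_shell((0.3, -0.2, 1.1), m1)
    p2 = on_shell((-0.3, 0.2, 0.4), m2)
    P, q, M2 = wightman_garding(p1, p2, m1, m2)
    q_plus, q_minus = plus_minus(q)
    P_plus, _ = plus_minus(P)
    print(dot(P, q), q_minus + q_plus * M2 / P_plus**2)
```

Both printed numbers should vanish up to rounding errors.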





5 3D-4D BSE Under MYTP: Scalar/Fermion Quarks 

We now come to our objectives (B) and (C), viz., 3D-4D interlinkage of BS amplitudes brought
about by MYTP [26], and a unified treatment of $q\bar q$ [24,37] and $qqq$ [53] systems under MYTP
conditions. The full calculational details with 4-fermion couplings via gluonic propagators have
been collected in a recent review [32]. However, to bring out the basic mathematical structures,
we shall derive the 3D-4D interconnection with spinless quarks for 2- and 3-quark systems in this
and the next sections respectively. Further, we shall consider two MYTP methods in parallel for
comparison: i) Covariant Instantaneity Ansatz (CIA) [24-25]; ii) Covariant LF/NP Ansatz (Cov
LF) [37], to bring out the structural identity of the resulting BSE's for a $q\bar q$ system. This will be
followed by a reconstruction of the 4D BS vertex functions for both types [24, 37] as basic building
blocks for 4D quark loop calculations.

5.1 3D-4D BSE Under CIA: Spinless Quarks 

For a self-contained presentation, with unequal mass kinematics [24], let the quark constituents
of masses $m_{1,2}$ and 4-momenta $p_{1,2}$ interact to produce a composite hadron of mass $M$ and
4-momentum $P_\mu$. The relative 4-momentum $q_\mu$ is related to these by

$$p_{1,2} = \hat m_{1,2}P \pm q; \quad 2\hat m_{1,2} = 1 \pm (m_1^2 - m_2^2)/M^2 \tag{5.1}$$

These Wightman-Garding definitions [52] of the fractional momenta $\hat m_{1,2}$ ensure that $q.P = 0$ on
the mass shells $m_i^2 + p_i^2 = 0$ of the constituents, though not off-shell. Now define $\hat q_\mu = q_\mu - q.P\,P_\mu/P^2$
as the relative momentum transverse to the hadron 4-momentum $P_\mu$, which automatically gives
$\hat q.P = 0$ for all values of $q_\mu$. If the BSE kernel $K$ for the 2 quarks is a function of only these
transverse relative momenta, viz. $K = K(\hat q, \hat q')$, this is called the "Cov. Inst. Ansatz (CIA)" [24],
which accords with MYTP [26]. For two scalar quarks with inverse propagators $\Delta_{1,2}$, this ansatz
gives rise to the following BSE for the wave fn $\Phi(q,P)$ [24]:

$$i(2\pi)^4\,\Delta_1\Delta_2\,\Phi(q,P) = \int d^4q'\,K(\hat q, \hat q')\,\Phi(q',P); \quad \Delta_{1,2} = m_{1,2}^2 + p_{1,2}^2 \tag{5.2}$$

The quantities $m_{1,2}$ are the 'constituent' masses, which are strictly momentum dependent since
they contain the mass function $m(p)$ [12,28], but may be regarded as constant for low energy
phenomena: $m(p) = m(0)$. Further, under CIA, $m(p) = m(\hat p)$, a momentum-dependence which is
governed by the DBχS condition [28] (see below).

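To make a 3D reduction of eq.(5.2), define the 3D wave function $\phi(\hat q)$ in terms of the longitudinal
momentum $M\sigma$ as

$$\phi(\hat q) = \int M d\sigma\,\Phi(q,P); \quad M\sigma = Mq.P/P^2 \tag{5.3}$$

using which, eq.(5.2) may be recast as

$$i(2\pi)^4\,\Delta_1\Delta_2\,\Phi(q,P) = \int d^3\hat q'\,K(\hat q, \hat q')\,\phi(\hat q'); \quad d^4q' = d^3\hat q'\,M d\sigma' \tag{5.4}$$

Next, divide out by $\Delta_1\Delta_2$ in (5.4) and use once again (5.3) to reduce the 4D BSE form (5.4) to
the 3D form

$$(2\pi)^3\,D(\hat q)\,\phi(\hat q) = \int d^3\hat q'\,K(\hat q, \hat q')\,\phi(\hat q') \tag{5.5}$$

Here $D(\hat q)$ is the 3D denominator function associated with the like wave function $\phi(\hat q)$. The
integration over $d\sigma$ is carried out by noting pole positions of $\Delta_{1,2}$ in the $\sigma$-plane, where

$$\Delta_{1,2} = \omega_{1,2}^2 - M^2(\hat m_{1,2} \pm \sigma)^2; \quad \omega_{1,2}^2 = m_{1,2}^2 + \hat q^2 \tag{5.6}$$

The pole positions are given for $\Delta_{1,2} = 0$ respectively by

$$M(\sigma + \hat m_1) = \pm\omega_1 \mp i\epsilon; \quad M(\sigma - \hat m_2) = \pm\omega_2 \mp i\epsilon \tag{5.7}$$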





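where the (±) indices refer to the lower/upper halves of the $\sigma$-plane. The final result for $D(\hat q)$ is
expressible symmetrically [24]:

$$D(\hat q) = M_\omega D_0(\hat q); \quad \frac{2}{M_\omega} = \frac{\hat m_1}{\omega_1} + \frac{\hat m_2}{\omega_2} \tag{5.8}$$

$$\tfrac{1}{2}D_0(\hat q) = \hat q^2 - \frac{\lambda(M^2, m_1^2, m_2^2)}{4M^2}; \quad \lambda = M^4 - 2M^2(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2 \tag{5.9}$$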


The crucial thing for MYTP [26] is now to observe the equality of the RHS of eqs (5.4) and (5.5),
thus leading to an exact interconnection between the 3D and 4D BS wave functions [24]:

$$\Gamma(\hat q) = \Delta_1\Delta_2\,\Phi(q,P) = \frac{D(\hat q)\,\phi(\hat q)}{2\pi i} \tag{5.10}$$

Eq.(5.10) determines the hadron-quark vertex function $\Gamma(\hat q)$ as a product $D\phi$ of the 3D denominator
and wave functions, satisfying a relativistic 3D Schroedinger-like equation (5.5). The simultaneous
appearance of the 3D form (5.5) and the 4D form (5.4), leading to their interconnection (5.10),
reveals a two-tier character: the 3D form (5.5) gives the basis for making contact with the 3D
spectra [13], while the reconstructed 4D wave (vertex) function (5.10) in terms of 3D ingredients
$D$ and $\phi$ can be used for 4D quark-loop integrals in the standard Feynman fashion. Note that
the vertex function $\Gamma = D\phi/(2i\pi)$ has a general structure, independent of the details of the input
kernel $K$. Further, the $D$-function, eq.(5.8), is universal and well-defined off the mass shell of
either quark. The 3D wave function $\phi$ is admittedly model-dependent, but together with $D(\hat q)$, it
controls the 3D spectra via (5.5), thus offering a direct experimental check on its structure. Both
functions depend on the single 3D Lorentz-covariant quantity $\hat q^2$, whose most important property
is its positive definiteness for time-like hadron momenta ($M^2 > 0$).
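
The pole integration that defines the $D$-function can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not the method of [24]: it takes the closed form $D = M_\omega D_0$ with $2/M_\omega = \hat m_1/\omega_1 + \hat m_2/\omega_2$ and $\tfrac12 D_0 = \hat q^2 - \lambda/(4M^2)$, and compares it with $\int M d\sigma/(\Delta_1\Delta_2) = 2\pi i/D$. To avoid the $i\epsilon$ prescription, the contour is rotated to $\sigma = i\tau$, which is assumed valid when $M\hat m_{1,2} < \omega_{1,2}$ (no poles are crossed), so that $\int d\tau\,M/(\Delta_1\Delta_2)$ along the imaginary axis should equal $2\pi/D$. The function names are hypothetical.

```python
import math

def mass_fractions(m1, m2, M):
    # Wightman-Garding fractions, 2 mh_{1,2} = 1 +- (m1^2 - m2^2) / M^2.
    d = (m1 * m1 - m2 * m2) / (M * M)
    return 0.5 * (1.0 + d), 0.5 * (1.0 - d)

def d_function(q2, m1, m2, M):
    # Closed form D = M_omega * D_0 assumed for the 3D denominator function.
    mh1, mh2 = mass_fractions(m1, m2, M)
    w1 = math.sqrt(m1 * m1 + q2)
    w2 = math.sqrt(m2 * m2 + q2)
    m_omega = 2.0 / (mh1 / w1 + mh2 / w2)
    lam = M**4 - 2.0 * M * M * (m1 * m1 + m2 * m2) + (m1 * m1 - m2 * m2) ** 2
    d0 = 2.0 * (q2 - lam / (4.0 * M * M))
    return m_omega * d0

def rotated_sigma_integral(q2, m1, m2, M, n=20000):
    # Midpoint rule for the integral of M / (Delta_1 Delta_2) over tau,
    # with sigma = i tau and the substitution tau = tan(theta).
    mh1, mh2 = mass_fractions(m1, m2, M)
    w1sq = m1 * m1 + q2
    w2sq = m2 * m2 + q2
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        sigma = 1j * math.tan(theta)
        delta1 = w1sq - M * M * (mh1 + sigma) ** 2
        delta2 = w2sq - M * M * (mh2 - sigma) ** 2
        total += M / (delta1 * delta2) * h / math.cos(theta) ** 2
    return total

if __name__ == "__main__":
    q2, m1, m2, M = 1.0, 1.0, 2.0, 2.0
    print(rotated_sigma_integral(q2, m1, m2, M).real,
          2.0 * math.pi / d_function(q2, m1, m2, M))
```

For equal masses the same closed form reduces to $D = 4\omega(\omega^2 - M^2/4)$, which can be checked by calling `d_function` with `m1 == m2`.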


5.2 Cov LF/NP for 3D-4D BSE: Fermion Quarks 

As a preliminary to defining a 3D support to the BS kernel on the light-front (LF/NP), on the
lines of CIA [24], a covariant LF/NP orientation [37] may be represented by the 4-vector $n_\mu$,
as well as its dual $\tilde n_\mu$, obeying the normalizations $n^2 = \tilde n^2 = 0$ and $n.\tilde n = 1$. In the standard
NP scheme (in euclidean notation), these quantities are $n = (001;-i)/\sqrt{2}$ and $\tilde n = (001;i)/\sqrt{2}$,
while the two other perpendicular directions are collectively denoted by the subscript $\perp$ on the
concerned momenta. We shall try to maintain the $n$-dependence of various momenta to ensure
explicit covariance; and to keep track of the usual NP notation $p_\pm = p_0 \pm p_3$, our covariant notation
is normalized to the latter as $p_+ = n.p\sqrt{2}$; $p_- = -\tilde n.p\sqrt{2}$, while the perpendicular components
continue to be denoted by $p_\perp$ in both notations.

In the same notation as for CIA [24], the 4th component of the relative momentum $q =
\hat m_2p_1 - \hat m_1p_2$, that should be eliminated for obtaining a 3D equation, is now proportional to
$q_n \equiv \tilde n.q$, as the NP analogue [37] of $P.qP/P^2$ in CIA [24], where $P = p_1 + p_2$ is the total
4-momentum of the hadron. However the quantity $q - q_nn$ is still only $q_\perp$, since its square is
$q^2 - 2n.q\,\tilde n.q$, as befits $q_\perp^2$ (readily checked against the 'special' NP frame). We still need a third
component $q_3$, for which the correct definition turns out to be [37] $q_{3\mu} = xP_nn_\mu$, where $P_n = P.\tilde n$ and
$x = q.n/P.n$, which checks with $\hat q^2 = q_\perp^2 + x^2M^2$. We now collect the following definitions/results:

$$q_\perp = q - q_nn; \quad \hat q = q_\perp + xP_nn; \quad x = q.n/P.n; \quad P^2 = -M^2; \tag{5.11}$$

$$q_n, P_n = \tilde n.(q, P); \quad \hat q.n = q.n; \quad \hat q.\tilde n = 0; \quad P_\perp.q_\perp = 0;$$

$$P.q = P_nq.n + P.nq_n; \quad P.\hat q = P_nq.n;$$
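
The basic properties of the null-plane vectors used above can be verified in the 'special' NP frame. The following sketch is an illustration only. It assumes the euclidean notation $a = (a_1, a_2, a_3; ia_0)$ with a plain (unconjugated) dot product, and it takes $q_n$ to be the projection of $q$ on the dual vector, an assumption chosen to agree with $2q_n = -\sqrt{2}q_-$ in (5.15); the names `N`, `N_DUAL` and the helper functions are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def euclid(v3, v0):
    # Euclidean-notation 4-vector (v1, v2, v3; i v0).
    return (complex(v3[0]), complex(v3[1]), complex(v3[2]), 1j * v0)

def dot(a, b):
    # Plain bilinear product (no complex conjugation).
    return sum(x * y for x, y in zip(a, b))

# n = (0, 0, 1; -i)/sqrt(2) and its dual (0, 0, 1; i)/sqrt(2)
N = (0j, 0j, 1 / SQRT2, -1j / SQRT2)
N_DUAL = (0j, 0j, 1 / SQRT2, 1j / SQRT2)

def light_front_components(p):
    # p_+ = sqrt(2) n.p and p_- = -sqrt(2) n_dual.p
    return SQRT2 * dot(N, p), -SQRT2 * dot(N_DUAL, p)

def transverse_square(q):
    # Square of q - q_n n, with q_n taken as the projection on the dual vector.
    q_n = dot(N_DUAL, q)
    q_perp = tuple(x - q_n * n for x, n in zip(q, N))
    return dot(q_perp, q_perp)

if __name__ == "__main__":
    p = euclid((0.3, -0.4, 1.2), 2.0)
    print(dot(N, N), dot(N_DUAL, N_DUAL), dot(N, N_DUAL))
    print(light_front_components(p))   # compare with p_0 + p_3 and p_0 - p_3
    print(transverse_square(p))        # compare with p_1^2 + p_2^2
```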

Now in analogy to CIA, the reduced 3D BSE (wave-fn $\phi$) may be derived from the 4D BSE (5.2)
for spinless quarks (wave-fn $\Phi$) when its kernel $K$ is decreed to be independent of the component
$q_n$, i.e., $K = K(\hat q, \hat q')$, with $\hat q = (q_\perp, xP_nn)$, in accordance with the MYTP [26] condition imposed on
the null-plane (NP), so that $d^4q = d^2q_\perp\,dq_3\,dq_n$. Now define a 3D wave-fn $\phi(\hat q) = \int dq_n\,\Phi(q)$, as the
CNPA counterpart of the CIA definition (5.3), and use this result on the RHS of (5.2) to give

$$i(2\pi)^4\,\Phi(q) = \Delta_1^{-1}\Delta_2^{-1} \int d^3\hat q'\,K(\hat q, \hat q')\,\phi(\hat q') \tag{5.12}$$

which is formally the same as eq.(5.4) for CIA above. Now integrate both sides of eq.(5.12) w.r.t.
$dq_n$ to give a 3D BSE in the variable $\hat q$:

$$(2\pi)^3\,D_n(\hat q)\,\phi(\hat q) = \int d^2q'_\perp\,dq'_3\,K(\hat q, \hat q')\,\phi(\hat q') \tag{5.13}$$

which again corresponds to the CIA eq.(5.5), except that the function $D_n(\hat q)$ is now defined by

$$\int dq_n\,\Delta_1^{-1}\Delta_2^{-1} = 2\pi i\,D_n^{-1}(\hat q) \tag{5.14}$$

and may be obtained by standard NP techniques [38] (Chaps 5-7) as follows. In the $q_n$ plane, the
poles of $\Delta_{1,2}$ lie on opposite sides of the real axis, so that only one pole will contribute at a time.
Taking the $\Delta_2$-pole, which gives

$$2q_n = -\sqrt{2}\,q_- = \frac{m_2^2 + (q_\perp - \hat m_2P)^2}{\hat m_2P.n - q.n} \tag{5.15}$$

the residue of $\Delta_1$ works out, after a routine simplification, to just $2P.q = 2P.nq_n + 2P_nq.n$, after
using the collinearity condition $P_\perp.q_\perp = 0$ from (5.11). And when the value (5.15) of $q_n$ is
substituted in (5.14), one obtains (with $P_nP.n = -M^2/2$):

$$D_n(\hat q) = 2P.n\left(\hat q^2 - \frac{\lambda(M^2, m_1^2, m_2^2)}{4M^2}\right); \quad \hat q^2 = q_\perp^2 + M^2x^2; \quad x = q.n/P.n \tag{5.16}$$

Now a comparison of (5.12) with (5.13) relates the 4D and 3D wave-fns:

$$2\pi i\,\Phi(q) = D_n(\hat q)\,\Delta_1^{-1}\Delta_2^{-1}\,\phi(\hat q) \tag{5.17}$$

as the Cov LF counterpart of (5.10), which is valid near the bound state pole. The BS vertex
function now becomes $\Gamma = D_n \times \phi/(2\pi i)$. This result, though dependent on the LF/NP orientation,
is nevertheless formally covariant, and closely corresponds to the pedagogical result of the old
LF/NP formulation [38], with $D_n \Leftrightarrow D_+$.

A 3D equation similar to the covariant eq.(5.13) above also obtains in alternative LF
formulations such as in Kadyshevsky-Karmanov [22b] (see their eq.(3.48)). However the independent
4-vector $\tilde n_\mu$ (which has no counterpart in [22b]) makes this a manifestly covariant 4D
formulation without need for explicit Lorentz transformations [22b]. The 'angular condition' [21] is also
trivially satisfied by the effective 3-vector $\hat q_\mu$ appearing in the 3D BSE (5.13). A more important
contrast from other null-plane approaches is that the inverse process of reconstruction of the 4D
hadron-quark vertex, eq.(5.17), has no counterpart in them [22-23], as these are basically 3D
oriented, not vice versa.

For fermion quarks with gluonic propagators, the MYTP formulation needs no new principles,
except for certain technical details involving slight modifications [54] of the BSE structure for easier
handling; see [32] for detailed steps. The full 4D wave function $\Psi(P, q)$ may be expressed as a 4x4
matrix [38,32]:

$$\Psi(P, q) = S_F(p_1)\,\Gamma(\hat q)\,\gamma_D\,S_F(-p_2); \quad \Gamma(\hat q) = N_H[1;\ P_n/M]\,D(\hat q)\,\phi(\hat q)/2i\pi \tag{5.18}$$

where $\gamma_D$ is a Dirac matrix which equals $\gamma_5$ for a P-meson, $i\gamma_\mu$ for a V-meson, $i\gamma_\mu\gamma_5$ for an
A-meson, etc. The factors in square brackets stand for CIA and Cov LF values respectively. $N_H$
represents the hadron normalization.





6 The qqq BSE: 3D-4D Interlinkage 

We now come to the aspect of MYTP [26] that governs the inter-relation of 3D and 4D
Bethe-Salpeter amplitudes for 3-body ($qqq$) systems, in keeping with a perceived 'duality' between meson
($q\bar q$) and baryon ($qqq$) systems which necessitates a parallel treatment between them. In this
respect a fairly comprehensive review of baryon dynamics as a 3-body relativistic system with full
permutation symmetries in all relevant degrees of freedom [55] has been given recently [32]. These
include: a detailed correspondence [56] between $qqq$ and quark-diquark wave functions; complex
HO techniques for the $qqq$ problem [57]; fermionic $qqq$ BSE with the same gluon propagator for pair
$qq$ interactions [29] as employed for $q\bar q$ systems [28], except for reduction by half due to the color
factor; and Green's function methods for 3D reduction of the 4D BSE form, plus reconstruction
of the 4D $qqq$ wave function [53], on the lines of the $q\bar q$ problem [24]. Within the formalistic scope
of this Article however, we shall merely dwell on the last item, viz., Green's fn techniques [53] for
a 3D reduction of the 4D BSE, plus reconstruction of the 4D wave function, for a $qqq$ system of
three identical spinless quarks, keeping in forefront the issue of connectedness [58] in a 3-particle
amplitude, whose signal is the absence of any $\delta$-function in its structure; (for a detailed perspective,
see [32]).

6.1 Two-Quark Green's Function Under CIA

As a warm up to the method of Green's functions (G-fns), we first derive the 3D-4D
interconnection for the corresponding G-fns for 2-particle scattering of two identical spinless particles, before
moving on to the 3-body problem in the next 2 subsections. For simplicity we shall consider the
G-fns near the bound state poles, so that the inhomogeneous terms may be dropped. In the
notation and phase convention of Section 5, the 4D $qq$ Green's fn $G(p_1p_2; p_1'p_2')$ near a bound state
satisfies a 4D BSE (no inhomogeneous term):

$$i(2\pi)^4\,G(p_1p_2; p_1'p_2') = \Delta_1^{-1}\Delta_2^{-1} \int dp_1''\,dp_2''\,K(p_1p_2; p_1''p_2'')\,G(p_1''p_2''; p_1'p_2'); \tag{6.1.1}$$

where

$$\Delta_i = p_i^2 + m_q^2, \tag{6.1.2}$$

and $m_q$ is the mass of each quark. Now using the relative 4-momentum $q = (p_1 - p_2)/2$ and
total 4-momentum $P = p_1 + p_2$ (similarly for the other sets), and removing a $\delta$-function for overall
4-momentum conservation from each of the $G$- and $K$-functions, eq.(6.1.1) reduces to the simpler
form

$$i(2\pi)^4\,G(q, q') = \Delta_1^{-1}\Delta_2^{-1} \int d\hat q''\,M d\sigma''\,K(\hat q, \hat q'')\,G(q'', q') \tag{6.1.3}$$

where $\hat q_\mu = q_\mu - \sigma P_\mu$, with $\sigma = (q.P)/P^2$, is effectively 3D in content (being orthogonal to $P_\mu$).
Here we have incorporated the ansatz of a 3D support for the kernel $K$ (independent of $\sigma$ and $\sigma'$),
and broken up the 4D measure $dq''$ arising from (6.1.1) into the product $d\hat q''\,M d\sigma''$ of a 3D and a
1D measure respectively. We have also suppressed the 4-momentum $P_\mu$ label, with ($P^2 = -M^2$),
in the notation for $G(q, q')$.

Now define the fully 3D Green's function $\hat G(\hat q, \hat q')$ as [53]

$$\hat G(\hat q, \hat q') = \int\!\!\int M^2\,d\sigma\,d\sigma'\,G(q, q') \tag{6.1.4}$$

and two (hybrid) 3D-4D Green's functions $\tilde G(\hat q, q')$, $\tilde G(q, \hat q')$ as

$$\tilde G(\hat q, q') = \int M d\sigma\,G(q, q'); \quad \tilde G(q, \hat q') = \int M d\sigma'\,G(q, q'); \tag{6.1.5}$$

Next, use (6.1.5) in (6.1.3) to give

$$i(2\pi)^4\,G(q, q') = \Delta_1^{-1}\Delta_2^{-1} \int d\hat q''\,K(\hat q, \hat q'')\,\tilde G(\hat q'', q') \tag{6.1.6}$$





Now integrate both sides of (6.1.3) w.r.t. $M d\sigma$ and use the result

$$\int M d\sigma\,\Delta_1^{-1}\Delta_2^{-1} = 2\pi i\,D^{-1}(\hat q); \quad D(\hat q) = 4\hat\omega(\hat\omega^2 - M^2/4); \quad \hat\omega^2 = m_q^2 + \hat q^2 \tag{6.1.7}$$

to give a 3D BSE w.r.t. the variable $\hat q$, while keeping the other variable $q'$ in a 4D form:

$$(2\pi)^3\,\tilde G(\hat q, q') = D^{-1}(\hat q) \int d\hat q''\,K(\hat q, \hat q'')\,\tilde G(\hat q'', q') \tag{6.1.8}$$

A comparison of (6.1.3) with (6.1.8) gives the desired connection between the full 4D G-function
and the hybrid $\tilde G(\hat q, q')$-function:

$$2\pi i\,G(q, q') = D(\hat q)\,\Delta_1^{-1}\Delta_2^{-1}\,\tilde G(\hat q, q') \tag{6.1.9}$$

Again, the symmetry of the left hand side of (6.1.9) w.r.t. $q$ and $q'$ allows rewriting the right hand
side with the roles of $q$ and $q'$ interchanged. This gives the dual form

$$2\pi i\,G(q, q') = D(\hat q')\,\Delta_1'^{-1}\Delta_2'^{-1}\,\tilde G(q, \hat q') \tag{6.1.10}$$

which on integrating both sides w.r.t. $M d\sigma$ gives

$$2\pi i\,\tilde G(\hat q, q') = D(\hat q')\,\Delta_1'^{-1}\Delta_2'^{-1}\,\hat G(\hat q, \hat q'). \tag{6.1.11}$$

Substitution of (6.1.11) in (6.1.9) then gives the symmetrical form

$$(2\pi i)^2\,G(q, q') = D(\hat q)\,\Delta_1^{-1}\Delta_2^{-1}\,\hat G(\hat q, \hat q')\,D(\hat q')\,\Delta_1'^{-1}\Delta_2'^{-1} \tag{6.1.12}$$

Finally, integrating both sides of (6.1.8) w.r.t. $M d\sigma'$, we obtain a fully reduced 3D BSE for the
3D Green's function:

$$(2\pi)^3\,\hat G(\hat q, \hat q') = D^{-1}(\hat q) \int d\hat q''\,K(\hat q, \hat q'')\,\hat G(\hat q'', \hat q') \tag{6.1.13}$$

Eq.(6.1.12), which is valid near the bound state pole, expresses the desired connection between the
3D and 4D forms of the Green's functions; and eq.(6.1.13) is the determining equation for the 3D
form. A spectral analysis can now be made for either of the 3D or 4D Green's functions in the
standard manner, viz.,

$$G(q, q') = \sum_n \Phi_n(q; P)\,\Phi_n^*(q'; P)/(P^2 + M^2) \tag{6.1.14}$$

where $\Phi$ is the 4D BS wave function. A similar expansion holds for the 3D G-function $\hat G$ in terms
of $\phi_n(\hat q)$. Substituting these expansions in (6.1.12), one immediately sees the connection between
the 3D and 4D wave functions in the form:

$$2\pi i\,\Phi(q, P) = \Delta_1^{-1}\Delta_2^{-1}\,D(\hat q)\,\phi(\hat q) \tag{6.1.15}$$

whence the BS vertex function becomes $\Gamma = D \times \phi/(2\pi i)$ as found in [24]. We shall make free use
of these results, taken as $qq$ subsystems, for our study of the $qqq$ G-functions in subsects.6.2-3.

6.2 3D BSE Reduction for qqq G-fn 

As in the two-body case, and in an obvious notation for various 4-momenta (without the Greek
suffixes), we consider the most general Green's function $G(p_1p_2p_3; p_1'p_2'p_3')$ for 3-quark scattering
near the bound state pole (for simplicity), which allows us to drop the various inhomogeneous terms
from the beginning. Again we take out an overall delta function $\delta(p_1 + p_2 + p_3 - P)$ from the
G-function and work with two internal 4-momenta for each of the initial and final states defined
as follows [54b]:

$$\sqrt{3}\,\xi_3 = p_1 - p_2; \quad 3\eta_3 = -2p_3 + p_1 + p_2 \tag{6.2.1}$$

$$P = p_1 + p_2 + p_3 = p_1' + p_2' + p_3' \tag{6.2.2}$$

and two other sets $\xi_1, \eta_1$ and $\xi_2, \eta_2$ defined by cyclic permutations from (6.2.1). Further, as we
shall consider pairwise kernels with 3D support, we define the effectively 3D momenta $\hat p_i$, as well
as the three (cyclic) sets of internal momenta $\hat\xi_i, \hat\eta_i$ ($i = 1, 2, 3$) by [54b]:

$$\hat p_i = p_i - \nu_iP; \quad \hat\xi_i = \xi_i - s_iP; \quad \hat\eta_i = \eta_i - t_iP \tag{6.2.3}$$

$$\nu_i = (P.p_i)/P^2; \quad s_i = (P.\xi_i)/P^2; \quad t_i = (P.\eta_i)/P^2 \tag{6.2.4}$$

$$\sqrt{3}\,s_3 = \nu_1 - \nu_2; \quad 3t_3 = -2\nu_3 + \nu_1 + \nu_2 \quad (+\ \text{cyclic permutations}) \tag{6.2.5}$$

The space-like momenta $\hat p_i$ and the time-like ones $\nu_i$ satisfy [54b]

$$\hat p_1 + \hat p_2 + \hat p_3 = 0; \quad \nu_1 + \nu_2 + \nu_3 = 1 \tag{6.2.6}$$

Strictly speaking, in the spirit of covariant instantaneity, we should have taken the relative 3D
momenta $\hat\xi, \hat\eta$ to be in the instantaneous frames of the concerned pairs, i.e., w.r.t. the rest frames
of $P_{ij} = p_i + p_j$; however the difference between the rest frames of $P$ and $P_{ij}$ is small and calculable
[54b], while the use of a common 3-body rest frame ($\mathbf{P} = 0$) lends considerable simplicity and
elegance to the formalism.
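
The kinematical relations (6.2.1)-(6.2.6) are linear identities and can be checked on arbitrary momenta. The sketch below is an illustration under stated assumptions: 4-vectors are stored as $(a_1, a_2, a_3, a_0)$ with $a.b = \mathbf{a}.\mathbf{b} - a_0b_0$, quark labels run over 0..2 instead of 1..3, and the helper names and sample momenta are hypothetical. It also checks two consequences used later in the text, namely $\hat\eta_3 = -\hat p_3$ and the invariance of $\hat\xi_k^2 + \hat\eta_k^2$ under cyclic relabelling.

```python
import math

def dot(a, b):
    # a = (a1, a2, a3, a0); space-like components positive, so P.P = -M^2
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

def combine(coeffs, vectors):
    # Linear combination of 4-vectors.
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(4))

def split(v, P):
    # Return (time-like fraction P.v / P.P, part of v orthogonal to P).
    frac = dot(P, v) / dot(P, P)
    return frac, tuple(a - frac * b for a, b in zip(v, P))

def jacobi_pair(p, k):
    # (xi_k, eta_k) for the cyclic triple (i, j, k), with indices 0..2.
    i, j = (k + 1) % 3, (k + 2) % 3
    r3 = math.sqrt(3.0)
    xi = combine((1 / r3, -1 / r3), (p[i], p[j]))
    eta = combine((1 / 3, 1 / 3, -2 / 3), (p[i], p[j], p[k]))
    return xi, eta

def hatted_invariant(p, k):
    # xi_hat_k^2 + eta_hat_k^2, expected to be the same for k = 0, 1, 2.
    P = combine((1, 1, 1), p)
    xi, eta = jacobi_pair(p, k)
    _, xi_hat = split(xi, P)
    _, eta_hat = split(eta, P)
    return dot(xi_hat, xi_hat) + dot(eta_hat, eta_hat)

# Arbitrary sample momenta (not required to be on shell).
MOMENTA = [
    (0.1, 0.2, 0.3, 1.0),
    (-0.4, 0.1, 0.2, 1.2),
    (0.3, -0.5, 0.1, 0.9),
]

if __name__ == "__main__":
    P = combine((1, 1, 1), MOMENTA)
    print(sum(split(v, P)[0] for v in MOMENTA))
    print([hatted_invariant(MOMENTA, k) for k in range(3)])
```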

We may now use the foregoing considerations to write down the BSE for the 6-point Green's
function in terms of relative momenta, on closely parallel lines to the 2-body case. To that end
note that the 2-body relative momenta are $q_{ij} = (p_i - p_j)/2 = \sqrt{3}\,\xi_k/2$, where $(ijk)$ are cyclic
permutations of (123). Then for the reduced $qqq$ Green's function, when the last interaction was
in the $(ij)$ pair, we may use the notation $G(\xi_k\eta_k; \xi_k'\eta_k')$, together with 'hat' notations on these
4-momenta when the corresponding time-like components are integrated out. Further, since the
pair $\xi_k, \eta_k$ is permutation invariant as a whole, we may choose to drop the index notation from
the complete G-function to emphasize this symmetry as and when needed. The G-function for the
$qqq$ system satisfies, in the neighbourhood of the bound state pole, the following (homogeneous)
4D BSE for pairwise $qq$ kernels with 3D support:

$$i(2\pi)^4\,G(\xi\eta; \xi'\eta') = \sum_{123} \Delta_1^{-1}\Delta_2^{-1} \int d\hat q_{12}''\,M d\sigma_{12}''\,K(\hat q_{12}, \hat q_{12}'')\,G(\xi_3''\eta_3''; \xi_3'\eta_3') \tag{6.2.7}$$

where we have employed a mixed notation ($q_{12}$ versus $\xi_3$) to stress the two-body nature of the
interaction with one spectator at a time, in a normalization directly comparable with eq.(6.1.3)
for the corresponding two-body problem. Note also the connections

$$\sigma_{12} = \sqrt{3}\,s_3/2; \quad \hat q_{12} = \sqrt{3}\,\hat\xi_3/2; \quad \hat\eta_3 = -\hat p_3, \tag{6.2.8}$$


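The next task is to reduce the 4D BSE (6.2.7) to a fully 3D form through a sequence of integrations
w.r.t. the time-like momenta $s_i, t_i$ applied to the different terms on the right hand side, provided
both variables are simultaneously permuted. We now define the following fully 3D as well as mixed
(hybrid) 3D-4D G-functions according as one or more of the time-like $\xi, \eta$ variables are integrated
out:

$$\hat G(\hat\xi\hat\eta; \hat\xi'\hat\eta') = \int\!\!\int\!\!\int\!\!\int ds\,dt\,ds'\,dt'\,G(\xi\eta; \xi'\eta') \tag{6.2.9}$$

which is $S_3$-symmetric;

$$\tilde G_{3\eta}(\xi_3\hat\eta_3; \xi_3'\hat\eta_3') = \int\!\!\int dt_3\,dt_3'\,G(\xi_3\eta_3; \xi_3'\eta_3'); \tag{6.2.10}$$

$$\tilde G_{3\xi}(\hat\xi_3\eta_3; \hat\xi_3'\eta_3') = \int\!\!\int ds_3\,ds_3'\,G(\xi_3\eta_3; \xi_3'\eta_3'); \tag{6.2.11}$$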


The last two equations are however not symmetric w.r.t. the permutation group $S_3$, since both the
variables $\xi, \eta$ are not simultaneously transformed; this fact has been indicated in eqs.(6.2.10-11)
by the suffix "3" on the corresponding (hybrid) G-functions, to emphasize that the 'asymmetry'
is w.r.t. the index "3". We shall term such quantities "$S_3$-indexed", to distinguish them from
$S_3$-symmetric quantities as in eq.(6.2.9). The full 3D BSE for the $\hat G$-function is obtained by integrating
out both sides of (6.2.7) w.r.t. the $st$-pair variables $ds_i\,ds_j'\,dt_i\,dt_j'$ (giving rise to an $S_3$-symmetric
quantity), and using (6.2.9) together with (6.2.8) as follows:

$$(2\pi)^3\,\hat G(\hat\xi\hat\eta; \hat\xi'\hat\eta') = \sum_{123} D^{-1}(\hat q_{12}) \int d\hat q_{12}''\,K(\hat q_{12}, \hat q_{12}'')\,\hat G(\hat\xi''\hat\eta''; \hat\xi'\hat\eta') \tag{6.2.12}$$

This integral equation for $\hat G$, which is the 3-body counterpart of (6.1.13) for a $qq$ system in the
neighbourhood of the bound state pole, is the desired 3D BSE for the $qqq$ system in a fully connected
form, i.e., free from delta functions. Now using a spectral decomposition for $\hat G$

$$\hat G(\hat\xi\hat\eta; \hat\xi'\hat\eta') = \sum_n \phi_n(\hat\xi\hat\eta; P)\,\phi_n^*(\hat\xi'\hat\eta'; P)/(P^2 + M^2) \tag{6.2.13}$$

on both sides of (6.2.12) and equating the residues near a given pole gives the desired
equation for the 3D wave function $\phi$ for the bound state in the connected form:

$$(2\pi)^3\,\phi(\hat\xi\hat\eta; P) = \sum_{123} D^{-1}(\hat q_{12}) \int d\hat q_{12}''\,K(\hat q_{12}, \hat q_{12}'')\,\phi(\hat\xi''\hat\eta''; P) \tag{6.2.14}$$

Now the $S_3$-symmetry of $\phi$ in the $(\hat\xi_i, \hat\eta_i)$ pair is a very useful result for both the solution of (6.2.14)
and for the reconstruction of the 4D BS wave function in terms of the 3D wave function (6.2.14),
as is done in subsect. 6.3 below.


6.3 Reconstruction of 4D qqq Wave Function 

We now attempt to re-express the 4D G-function given by (6.2.7) in terms of the 3D $\hat G$-function
given by (6.2.12), as the $qqq$ counterpart of the $q\bar q$ results (6.1.12-13). To that end we adapt the
result (6.1.12) to the hybrid Green's function of the (12) subsystem given by $\tilde G_{3\eta}$, eq.(6.2.10), in
which the 3-momenta $\hat\eta_3, \hat\eta_3'$ play a parametric role reflecting the spectator status of quark #3,
while the active roles are played by $q_{12}, q_{12}' = \sqrt{3}(\xi_3, \xi_3')/2$, for which the analysis of subsect. 6.1
applies directly. This gives

$$(2\pi i)^2\,\tilde G_{3\eta}(\xi_3\hat\eta_3; \xi_3'\hat\eta_3') = D(\hat q_{12})\,\Delta_1^{-1}\Delta_2^{-1}\,\hat G(\hat\xi_3\hat\eta_3; \hat\xi_3'\hat\eta_3')\,D(\hat q_{12}')\,\Delta_1'^{-1}\Delta_2'^{-1} \tag{6.3.1}$$

where on the right hand side, the 'hatted' G-function has full $S_3$-symmetry, although (for purposes
of book-keeping) we have not shown this fact explicitly by deleting the suffix '3' from its arguments.
A second relation of this kind may be obtained from (6.2.7) by noting that the 3 terms on its right
hand side may be expressed in terms of the hybrid $\tilde G_{3\xi}$ functions vide their definitions (6.2.11),
together with the 2-body interconnection between $(\xi_3, \xi_3')$ and $(\hat\xi_3, \hat\xi_3')$ expressed once again via
(6.3.1), but without the 'hats' on $\eta_3$ and $\eta_3'$. This gives

$$(\sqrt{3}\pi i)^2\,G(\xi_3\eta_3; \xi_3'\eta_3') = (\sqrt{3}\pi i)^2\,G(\xi\eta; \xi'\eta')$$

$$= \sum_{123} \Delta_1^{-1}\Delta_2^{-1}(\pi i\sqrt{3}) \int d\hat q_{12}''\,M d\sigma_{12}''\,K(\hat q_{12}, \hat q_{12}'')\,G(\xi_3''\eta_3''; \xi_3'\eta_3')$$

$$= \sum_{123} D(\hat q_{12})\,\Delta_1^{-1}\Delta_2^{-1}\,\tilde G_{3\xi}(\hat\xi_3\eta_3; \hat\xi_3'\eta_3')\,\Delta_1'^{-1}\Delta_2'^{-1}\,D(\hat q_{12}') \tag{6.3.2}$$

where the second form exploits the symmetry between $\xi, \eta$ and $\xi', \eta'$.

At this stage, unlike the 2-body case, the reconstruction of the 4D Green's function is not yet
complete for the 3-body case, as eq.(6.3.2) clearly shows. This is due to the truncation of Hilbert
space implied in the ansatz of 3D support to the pairwise BSE kernel $K$ which, while facilitating a
4D to 3D BSE reduction without extra charge, does not have the complete information to permit
the reverse transition (3D to 4D) without additional assumptions. To fill up this gap in this
"inverse" mathematical problem, we look for a suitable ansatz for $\tilde G_{3\xi}$ on the RHS of (6.3.2)
in terms of known quantities, so that the reconstructed 4D G-function satisfies the 3D equation
(6.2.12) exactly, as a check-point. We therefore seek a structure of the form

$$\tilde G_{3\xi}(\hat\xi_3\eta_3; \hat\xi_3'\eta_3') = \hat G(\hat\xi_3\hat\eta_3; \hat\xi_3'\hat\eta_3') \times F(p_3, p_3') \tag{6.3.3}$$

where the unknown function $F$ must involve only the momentum of the spectator quark #3. A
part of the $\eta_3, \eta_3'$ dependence has been absorbed in the $\hat G$ function on the right, so as to satisfy
the requirements of $S_3$-symmetry for this 3D quantity [53].

As to the remaining factor $F$, it is necessary to choose its form in a careful manner so as to
conform to the conservation of 4-momentum for the free propagation of the spectator between
two neighbouring vertices, consistently with the symmetry between $p_3$ and $p_3'$. A possible choice
consistent with these conditions is:

$$F(p_3, p_3') = C_3\,\Delta_3^{-1}\,\delta(\nu_3 - \nu_3') \tag{6.3.4}$$

Here $\Delta_3^{-1}$ represents the "free" propagation of quark #3 between successive vertices, while $C_3$
represents some residual effects which may at most depend on the 3-momentum $\hat p_3$, but must
satisfy the main constraint that the 3D BSE, (6.2.12), be explicitly satisfied.

To check the self-consistency of the ansatz (6.3.4), integrate both sides of (6.3.2) w.r.t. $ds_3\,ds_3'\,dt_3\,dt_3'$
to recover the 3D $S_3$-invariant G-function on the left hand side. Next, in the first form on the
right hand side, integrate w.r.t. $ds_3\,ds_3'$ on the G-function which alone involves these variables.
This yields the quantity $G_{3\eta}$. At this stage, employ the ansatz (6.3.4) to integrate over $dt_3\,dt_3'$.
Consistency with the 3D BSE, eq.(6.2.12), now demands

$C_3 \int\!\!\int d\nu_3\, d\nu_3'\, \Delta_3^{-1}\, \delta(\nu_3 - \nu_3') = 1$; (since $dt = d\nu$) (6.3.5)

The 1D integration w.r.t. $d\nu_3$ may be evaluated as a contour integral over the propagator $\Delta_3^{-1}$,
which gives the pole at $\nu_3 = \omega_3/M$ (see below for its definition). Evaluating the residue then gives

$C_3 = i\pi/(M\omega_3)$; $\omega_3^2 = m_q^2 + \hat p_3^2$ (6.3.6)

which will reproduce the 3D BSE, eq.(6.2.12), exactly! Substitution of (6.3.4) in the second form
of (6.3.2) finally gives the desired 3-body generalization of (6.1.12) in the form

= Y^D{qi2)AipA2FD{q[2)^iF^^2FG{i3T]z]iWz)[^ZF/{Miru)z)] (6.3.7) 

123 

where for each index, $\Delta_F = -i\Delta^{-1}$ is the Feynman propagator. To find the effect of the ansatz
(6.3.4) on the 4D BS wave function $\Phi(\xi\eta;P)$, we do a spectral reduction like (6.2.13) for the 4D
Green’s function G on the LHS of (6.3.2). Equating the residues on both sides gives the desired
4D-3D connection between $\Phi$ and $\phi$:

$\Phi(\xi\eta;P) = \sum_{123} D(\hat q_{12})\, \Delta_1^{-1} \Delta_2^{-1}\, \phi(\hat\xi\hat\eta;P) \times \sqrt{\frac{\delta(\nu_3 - \omega_3/M)}{M\omega_3\Delta_3}}$ (6.3.8)

This defines the 4D wave function in terms of piecewise vertex functions $V_i$, as

$\Phi(p_1p_2p_3) = \frac{V_1 + V_2 + V_3}{\Delta_1\Delta_2\Delta_3}$ (6.3.9)

From (6.3.8-9), we infer the baryon-qqq vertex function $V_3$ corresponding to the ‘last’ interaction
in the 12-pair as

$V_3 = D(\hat q_{12})\, \phi(\hat\xi,\hat\eta) \times \sqrt{2\Delta_3\, \delta(\nu_3^2 M^2 - \omega_3^2)}$ (6.3.10)

and so on cyclically. (The argument of the $\delta$-function inside the radical for $V_3$ simplifies to $p_3^2 + m_q^2$.)
This expression had been obtained earlier from intuitive considerations [54b].
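
The pole integration invoked for (6.3.5)-(6.3.6) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the inverse propagator in the generic form $\omega^2 - x^2 - i\epsilon$ with $x = M\nu$ (this explicit form and the $i\epsilon$ prescription are assumptions, not taken from the text), and compares a direct numerical integral over the real line with the residue value $i\pi/\sqrt{\omega^2 - i\epsilon}$, which tends to $i\pi/\omega$ as $\epsilon \to 0$. The function names are hypothetical, and no attempt is made to fix the overall normalization convention of $C_3$.

```python
import numpy as np


def propagator_integral(omega, eps, n_points=20000):
    """Numerically integrate 1/(omega^2 - x^2 - i*eps) over the whole real x-axis.

    The substitution x = tan(theta) maps the real line onto (-pi/2, pi/2) and
    turns the integrand into 1/(a^2 cos^2(theta) - sin^2(theta)), which is
    smooth and periodic, so a plain midpoint rule converges very quickly.
    """
    a2 = omega**2 - 1j * eps  # assumed Feynman-type prescription
    theta = (np.arange(n_points) + 0.5) * np.pi / n_points - np.pi / 2
    integrand = 1.0 / (a2 * np.cos(theta) ** 2 - np.sin(theta) ** 2)
    return integrand.sum() * np.pi / n_points


def residue_value(omega, eps):
    """Closed-contour result: 2*pi*i times the residue at the pole x = -sqrt(a^2)."""
    return 1j * np.pi / np.sqrt(omega**2 - 1j * eps)


if __name__ == "__main__":
    omega, eps = 1.0, 0.05  # illustrative values in arbitrary units
    print(propagator_integral(omega, eps), residue_value(omega, eps))
```

Dividing the x-integral by M converts it into an integral over $\nu$, which is how factors of the type $1/(M\omega_3)$ arise in this kind of reduction.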





To account for the appearance of the 1D $\delta$-fn under radical in (6.3.10), it is explained elsewhere
[53] that it has nothing to do with connectedness [58] as such, but merely reflects a ‘dimensional
mismatch’ due to the 3D nature of the pairwise kernel K [24] imbedded in a 4D Hilbert space. (For
a physical explanation, see [53]). A further self-consistency check on (6.3.10) is found by taking the
limit of a point interaction, which amounts to setting K = Constant, when the radical (expectedly)
disappears, and gives a Lorentz-invariant result [53], in agreement with the so-called NJL-Faddeev
(contact) model [59] for 3-particle scattering. For the fermion qqq case with pairwise gluonic
interactions, the details may be found in [60], wherein the strength of the ‘color’ qq interaction
[29] is half of that of $q\bar q$ [28]. For brevity, we skip the MYTP [26] derivation of the 4D qqq vertex
function under Cov LF [37] conditions, which parallels that for the 2-body case [Sect.5], except for
the remark that the old-fashioned LF/NP treatment [38] gives the same results as the more formal
Cov LF treatment in Sect.5, so that a similar Cov LF form for qqq dynamics should be expected
[38], with $D(\hat q) \to D_n(\hat q)$, etc in (6.3.10).


7 Triangle Loops Under MYTP On Cov LF/NP 

In this Section, we shall illustrate the MYTP techniques on the covariant light-front to bring out 
the main feature, viz., structure of the triangle loop integrals free from the anomalies of time-like 
momenta in the product of gaussian vertex functions, such as complexities in the pion form factor 
[36] (see Sects. 1 and 5). To that end, we shall mainly consider the mathematical structure of the 
P-meson form factor, followed by a brief sketch of the structure of 3-hadron form factors, in the
next few subsections, leaving routine calculational details to [37,32]. 

7.1 Pion Form Factor by Cov LF/NP Method 



Figure 1: Triangle loop for e.m. vertex 


Using fig.1 above, and an identical one with 1 ⇒ 2 (c.f., figs. 1a, 1b of [34b]), the
amplitude for the $h \to h' + \gamma$ transition is given by [34b]

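$2\bar P_\mu F(k^2) = 4(2\pi)^4 N_n(P) N_n(P')\, \hat m_1 \int d^4q\, T_\mu^{(1)}\, \frac{D_n(\hat q)\phi(\hat q)\, D_n(\hat q')\phi(\hat q')}{\Delta_1 \Delta_1' \Delta_2} + [1 \Rightarrow 2]$; (7.1)

$4T_\mu^{(1)} = \mathrm{Tr}[\gamma_5 (m_1 - i\gamma.p_1)\, i\gamma_\mu\, (m_1 - i\gamma.p_1')\, \gamma_5 (m_2 + i\gamma.p_2)]$; $\Delta_i = m_i^2 + p_i^2$; (7.2)

$p_{1,2} = \hat m_{1,2} P \pm q$; $p'_{1,2} = \hat m_{1,2} P' \pm q'$; $p_2 = p_2'$; $P - P' = p_1 - p_1' = k$; $2\bar P = P + P'$. (7.3)

After evaluating the traces and simplifying, $T_\mu^{(1)}$ becomes

$T_\mu^{(1)} = (p_{2\mu} - \bar P_\mu)[\delta m^2 - M^2 - \Delta_2] - k^2 p_{2\mu}/2 + (\Delta_1 - \Delta_1') k_\mu/4$ (7.4)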

The last term in (7.4) is non-gauge invariant, but it does not survive the integration in (7.1), since
the coefficient of $k_\mu$, viz., $\Delta_1 - \Delta_1'$, is antisymmetric in $p_1$ and $p_1'$, while the rest of the integrand in
(7.1) is symmetric in these two variables. Next, to bring out the proportionality of the integral (7.1)
to $\bar P_\mu$, it is necessary to resolve $p_2$ into the mutually perpendicular components $p_{2\perp}$, $(p_2.k/k^2)k$
and $(p_2.\bar P/\bar P^2)\bar P$, of which the first two will again not survive the integration, the first due to the
angular integration, and the second due to the antisymmetry of $k = p_1 - p_1'$ in $p_1$ and $p_1'$, just as
in the last term of (7.4). The third term is explicitly proportional to $\bar P_\mu$, and is of course gauge
invariant since $\bar P.k = 0$. (This fact had been anticipated while writing the LHS of (7.4)). Now
with the help of the results

$p_2.\bar P = -\hat m_2 M^2 - \Delta_1/4 - \Delta_1'/4$; $2\hat m_2 = 1 - (m_1^2 - m_2^2)/M^2$ (7.5)

it is a simple matter to integrate (7.1), on the lines of Sec.5, noting that terms proportional to
$\Delta_1\Delta_2$ and $\Delta_1'\Delta_2$ will give zero, while the non-vanishing terms will get contributions only from the
residues of the $\Delta_2$-pole, eq.(5.15). Before collecting the various pieces, note that the 3D gaussian
wave functions $\phi, \phi'$, as well as the 3D denominator functions $D_n, D_n'$, do not depend on the time-like
component $p_{2n}$, so that no further pole contributions accrue from these sources. (It is this
problem of time-like components of the internal 4-momenta inside the gaussian $\phi$-functions under
the CIA approach [24], that had plagued an earlier CIA study of triangle diagrams [36]). To proceed
further, it is now convenient to define the quantity $q.n = p_2.n - \hat m_2 P.n$ to simplify the $\phi$ and $D_n$
functions. To that end define the symbols:

(^5 ^0 = ? i Z 2 = q.n/P.n; k = k.nfP.n; (9k, Vk) = 1 d: ^^/4 (7.6) 

and note the following results of pole integration w.r.t. p2n [38]: 

I rfp2n;^[l/Ai;l/Ai;l/(AiAi)] = [l/i}„;l/D;;2p2.n/(I?„D;)] (7.7) 

Details of further calculation of the form factor are given in [37]. An essential result is the normalizer
$N_n(P)$ of the hadron, obtained by setting $k_\mu = 0$, and demanding that F(0) = 1. The reduced
(Lorentz-invariant) normalizer $N_H = N_n(P)\,P.n/M$ is given by [32,37]:

= 2M{2ivf j + 5m^ I M^){f - X/4M^) + 2mifh2{M^ - 6m^)] (7.8) 

where the internal momentum $\hat q = (q_\perp, M z_2)$ is formally a 3-vector, in conformity with the ‘angular
condition’ [21]. The corresponding expression for the form factor is [32, 37]:

F(P) = 2MiV|(27r)3exp[-(Mm2 Vi3)V40fc](7ry32)3/2 J^AiGik) + [1=^2] (7.9) 

vh 


where G(k) is a function of k; see eqs.(A.12-13) of [32]. 

7.2 ‘Lorentz Completion’ for $F(k^2)$

The expression (7.9) for $F(k^2)$ still depends on the null-plane orientation via the dimensionless
quantity $\hat k = k.n/P.n$ which, while having simple Lorentz transformation properties, is nevertheless
not Lorentz invariant by itself. To make it explicitly Lorentz invariant, we shall employ a simple
method of ‘Lorentz completion’ which is merely an extension of the ‘collinearity trick’ employed
at the quark level, viz., $P_\perp.q_\perp = 0$; see eq.(5.11). Note that this collinearity ansatz has already
become redundant at the level of the Normalizer $N_H$, eq.(7.8), which owes its Lorentz invariance
to the integrating out of the null-plane dependent quantity $z_2$ in (7.8). This is of course because
$N_H$ depends only on one 4-momentum (that of a single hadron), so that the collinearity assumption
is exactly valid. However the form factor $F(k^2)$ depends on two independent 4-momenta
P, P', for which the collinearity assumption is non-trivial, since the existence of the perpendicular
components cannot be wished away! Actually the quark-level assumption $P_\perp.q_\perp = 0$ has, so to
say, got transferred, via the $\hat q$-integration in eq.(7.9), to the hadron level, as evidenced from the
$\hat k$-dependence of $F(k^2)$; therefore an obvious logical inference is to suppose this $\hat k$-dependence to
be the result of the collinearity ansatz $P_\perp.P'_\perp = 0$ at the hadron level. Now, under the collinearity
condition, one has


$P.P' = P_\perp.P'_\perp + P.n\,P'.\tilde n + P'.n\,P.\tilde n = P.n\,P'_n + P'.n\,P_n$; $P.\tilde n = P_n$. (7.10)


Therefore ‘Lorentz completion’ (the opposite of the collinearity ansatz) merely amounts to reversing
the direction of the above equation by supplying the (zero term) $P_\perp.P'_\perp$ to a 3-scalar product to
render it a 4-scalar! Indeed the process is quite unique for 3-point functions such as the form factor
under study, although for more involved cases (e.g., 4-point functions), further assumptions may
be needed.

In the present case, the prescription of Lorentz completion is relatively simple, being already
contained in eq.(7.10). Thus since $P, P' = \bar P \pm k/2$, a simple application of (7.10) gives



P-.nP„ = -M^ - fcV4; 
4^.-4 = 4-4,. 


(7.11) 


This simple prescription for $\hat k$ automatically ensures the 4D (Lorentz) invariance of $F(k^2)$ at the
hadron level. (For comparison with alternative methods [22b], see [37]). 
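
The kinematics behind the null-plane decomposition (7.10) and the relation $P, P' = \bar P \pm k/2$ can be verified with a few lines of code. The following is a minimal sketch under stated assumptions: the metric signature (+,+,+,-) is chosen so that on-shell momenta obey $P^2 = -M^2$ (in line with $\Delta = m^2 + p^2$), and the null vectors $n, \tilde n$ are normalized to $n.\tilde n = 1$, which is our choice for illustration. All function names are hypothetical. The check shows that the full 4-scalar product equals the longitudinal (null-plane) part plus the transverse part $P_\perp.P'_\perp$, i.e. the term that the collinearity ansatz drops and ‘Lorentz completion’ restores, and that $\bar P.k = 0$ and $\bar P^2 = -M^2 - k^2/4$ hold for on-shell P, P'.

```python
import numpy as np

# Assumed signature (+,+,+,-): an on-shell momentum satisfies P.P = -M^2.
METRIC = np.diag([1.0, 1.0, 1.0, -1.0])

# Assumed null vectors with n.n = nt.nt = 0 and n.nt = 1 (normalization is our choice).
N = np.array([0.0, 0.0, 1.0, 1.0]) / np.sqrt(2.0)
N_TILDE = np.array([0.0, 0.0, 1.0, -1.0]) / np.sqrt(2.0)


def dot4(a, b):
    """4-scalar product a.b in the assumed metric."""
    return float(a @ METRIC @ b)


def on_shell(p3, mass):
    """Build the 4-vector (p_x, p_y, p_z, E) with E = sqrt(M^2 + |p|^2)."""
    p3 = np.asarray(p3, dtype=float)
    return np.append(p3, np.sqrt(mass**2 + p3 @ p3))


def null_plane_split(a, b):
    """Return (transverse part, longitudinal part) of a.b, as in eq.(7.10)."""
    perp = float(a[0] * b[0] + a[1] * b[1])
    longitudinal = dot4(a, N) * dot4(b, N_TILDE) + dot4(a, N_TILDE) * dot4(b, N)
    return perp, longitudinal


def form_factor_kinematics(p3, p3_prime, mass):
    """Return P, P', P_bar = (P + P')/2 and k = P - P' for two on-shell hadron momenta."""
    p, p_prime = on_shell(p3, mass), on_shell(p3_prime, mass)
    return p, p_prime, 0.5 * (p + p_prime), p - p_prime


if __name__ == "__main__":
    p, pp, p_bar, k = form_factor_kinematics([0.3, -0.2, 1.1], [-0.4, 0.1, 0.7], 0.5)
    perp, lon = null_plane_split(p, pp)
    print("P.P' =", dot4(p, pp), " transverse + longitudinal =", perp + lon)
    print("P_bar.k =", dot4(p_bar, k))
```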


7.3 QED Gauge Corrections to $F(k^2)$

While the ‘kinematic’ gauge invariance of $F(k^2)$ has already been ensured in Sec. 7.1 above, there
are additional contributions to the triangle loops (figs. 1a and 1b of [34b]) obtained by inserting the
photon lines at each of the two vertex blobs instead of on the quark lines themselves. These terms
arise from the demands of QED gauge invariance, as pointed out by Kisslinger and Li [61] in the
context of two-point functions, and are simulated by inserting exponential phase integrals with the
e.m. currents. However, this method (which works ideally for point interactions) is not amenable
to extended (momentum-dependent) vertex functions, and an alternative strategy is needed, as
described below.

The way to an effective QED gauge invariance lies in the simple-minded substitution $p_i - e_i A(x_i)$
for each 4-momentum $p_i$ (in a mixed p,x representation) occurring in the structure of the vertex
function. This amounts to replacing each $\hat q_\mu$ occurring in $\Gamma(\hat q) = D(\hat q)\phi(\hat q)$ by $\hat q_\mu - e_q \hat A_\mu$, where
$e_q = \hat m_2 e_1 - \hat m_1 e_2$, and keeping only first order terms in $A_\mu$ after due expansion. Now the first
order correction to $\hat q^2$ is $-e_q \hat q.\hat A - e_q \hat A.\hat q$, which simplifies on substitution from eq.(7.11) to

-2eqq.A = -2eqAf^[qfj, ~ g.nn^ -h P.nq.nn^( P,n] ( 7 . 12 ) 

The net result is a first order correction to $\Gamma(\hat q)$ of amount $e_q\, j(\hat q).A$ where

i(4)^ = ~4M>g^^(g)(l ~ (f - A/4M^)/2/32) (7.13) 

The contribution to the P-meson form factor from this hadron-quark-photon vertex (4-point) now
gives the QED gauge correction to the triangle loops, in the form of a similar function $F_1(k^2)$
which works out as [37]:

Pi{P) = i{2sfN%e,m-iMl j d*q{M^ - q- + {l 2} (7.14) 



where $M_> = \sup\{M; m_1 + m_2\}$ [37], and the common factor $2\bar P_\mu$ has been extracted as for $F(k^2)$
in (7.1). Note that $e_q$ is antisymmetric in ‘1’ and ‘2’, signifying a change of sign when $\{1 \Rightarrow 2\}$ is
added on the RHS. The pole integration of $F_1(k^2)$ now yields a result like (7.9) for $F(k^2)$; see [37]
for details.

The large and small $k^2$ limits of $F(k^2)$ and $F_1(k^2)$ are on expected lines, and we summarise
only the final results for completeness [37]. For large $k^2$, the functions $F(k^2)$ and $F_1(k^2)$ both yield
the correct asymptotic form $C/k^2$, where $C = 0.35\,\mathrm{GeV}^2$, to be compared with the experimental
value [62a] 0.50 ± 0.10, and the (perturbative) QCD value [63] $8\pi\alpha_s f_\pi^2 = 0.296$.

For low $k^2$, on the other hand, an expansion of $F, F_1$ in powers of $k^2$ yields a value of the charge
radius R according to $\langle R^2 \rangle = -\nabla_k^2 (F(k^2) + F_1(k^2))$ in the $k^2 = 0$ limit. Of the two functions,
only $F(k^2)$ contributes in this limit [37]. The numerical values for the kaon and pion radii, vis-a-vis
experiment [62b], are

$R_K = 0.63\,\mathrm{fm}$ vs (0.53 fm); $R_\pi = 0.661\,\mathrm{fm}$ vs (0.656 fm). (7.15)
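
The low-$k^2$ prescription for the charge radius can be illustrated with a short numerical sketch. It is not the form factor of this Section: the monopole shape $F(k^2) = 1/(1 + k^2/\Lambda^2)$ and the scale $\Lambda$ are purely hypothetical stand-ins, and we assume that $\nabla_k^2$ is the 3D Laplacian in k, so that for a function of $k^2$ alone one has $\langle R^2 \rangle = -6\, dF/dk^2$ at $k^2 = 0$. The function names are likewise our own.

```python
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm, converts GeV^-1 to fm


def monopole_form_factor(k2, lam):
    """Hypothetical monopole form factor with scale lam (GeV); k2 in GeV^2."""
    return 1.0 / (1.0 + k2 / lam**2)


def charge_radius_fm(form_factor, step=1e-4):
    """Charge radius in fm from <R^2> = -6 dF/dk^2 at k^2 = 0.

    The slope is estimated by a central finite difference in k^2, which is
    adequate for a form factor that is smooth around k^2 = 0.
    """
    slope = (form_factor(step) - form_factor(-step)) / (2.0 * step)
    mean_r2 = -6.0 * slope  # in GeV^-2
    return math.sqrt(mean_r2) * HBARC_GEV_FM


if __name__ == "__main__":
    lam = 0.77  # illustrative scale in GeV, an assumption
    radius = charge_radius_fm(lambda k2: monopole_form_factor(k2, lam))
    print(f"R = {radius:.3f} fm")
```

For the monopole shape the result is $R = \sqrt{6}\,\hbar c/\Lambda$ in closed form, which gives a quick consistency check on the finite-difference slope.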

7.4 Three-Hadron Couplings Via Triangle-Loops



Figure 2 : 3-hadron coupling 


For a large class of hadronic processes like H and H the quark triangle loop 

[64] represents the lowest order “tree” diagram for their evaluation. Criss-cross gluonic exchanges 
inside the triangle-loop are not important for this description in which the hadron-quark vertices, 
as well as the quark propagators are both non-perturbative, and thus take up a lion’s share of
non-perturbative effects. This is somewhat similar to the ‘dynamical perturbation theory’ of
Pagels-Stokar [65], in which criss-cross diagrams are neglected. 

We now indicate in the barest outline, the structure of the 3-hadron loop integral for the most
general case of unequal mass kinematics $m_1 \neq m_2 \neq m_3$, while referring for notational details to
ref.[32]. The full structure of the 3-hadron amplitude may be written down from fig.2 above (c.f.,
fig.1 of [64]), just as for the e.m. form factor (7.1):

(7.16)


exhibiting cyclic symmetry, where the normalized vertex function Ti in CNPA [41] is given in an 
obvious notation by eq.(5-lS) ^ 




Tiiqi) = ^ ^ 


(7.17) 


where the ‘reduced’ denominator function $D_i = D_{in} M_i/P_{i+}$, and the (invariant) normalizer $N_{iH}$ is
the overall BS normalizer [38]. The color factor and the effect of reversing the loop direction are
given by $2/\sqrt{3}$, etc [38,64].

To evaluate (7.16), first write the cyclically invariant measure: 


$d^4p_i = d^2p_{i\perp}\, d(x_i)\, M_i^2\, dy_i$; $x_i = p_{i+}/P_{i+}$; $y_i = p_{i-}/P_{i-}$ (7.18)

The cyclic invariance of (7.18) ensures that it is enough to take any index, say 2, and first do the
pole integration w.r.t. the $y_2$ variable, which has a pole at $y_2 = \xi_2 = \omega_{2\perp}^2/(M_2^2 x_2)$. The process
can be repeated, by turn, over all the indices and the results added. Note that the $\phi$-functions do
not include the time-like $y_i$ variables under CNPA [37], so that the residues from the poles arise
from only the propagators. The crucial thing to note is that the denominator functions $D_1$ and
$D_3$ sitting at the opposite ends of the $p_2$-line (c.f. Fig.1 of [64]) will cancel out the residues from
the complementary (inverse) propagators $\Delta_3$ and $\Delta_1$ respectively. Indeed by substituting the pole
value $y_2 = \xi_2$ in $\Delta_{1,3}$, the corresponding residues in an obvious notation work out as [32]:

Ai;2 ~ 312^23-^73 2/i21-^73j A3;2 = — ~ X2n2lAI^ — 2^23^7]^ (7.19) 


It is then found, with a short calculation [32], that 

= 2^3X2^23; = 2MiX2n2i (7.20) 

^ 1;2 ^ 3;2 

which shows the precise cancellation mechanism between the $D_i$-functions and the residues of the
propagators $\Delta_i$ at the $\Delta_2$ pole. This mechanism thus eliminates [24, 64] the (overlapping) Landau-
Cutkosky poles that would otherwise have caused free propagation of quarks in the loops. The
same procedure is then repeated cyclically for the other two terms arising from the $\Delta_{3,1}$ poles.
Collecting the factors, the result of all the 3 contributions is compactly expressible as [64, 32]:


^(3J?) = 8y^Si23 J I 


M2n23 7121 ^^dX2dC2 ^2 [^■^]27?2 ($2 )ni23 Afi 


(7.21) 


where the limits of integration for both variables are $-\infty < (\xi_2, x_2) < +\infty$, since these are
governed, not by the on-shell dynamics of standard LF methods [22-23], but by off-shell 3D-4D
BSE. The difference from [64] (under CIA [24]) arises from using CNPA [37] which has ensured
that the (gaussian) functions $\phi_i$ on the RHS of (7.21) are now free from time-like momenta (unlike
in CIA [24,64]).

Eq.(7.21) is the central result of this exercise. Its general nature stems from the use of unequal
mass kinematics at both the quark and hadron levels, which greatly enhances its applicability
to a wide class of problems which involve 3-hadron couplings, either as complete processes by
themselves (such as in decay processes) or as parts of bigger diagrams in which 3-hadron couplings
serve as basic building blocks. What makes the formula particularly useful for general applications
is its explicit Lorentz invariance which has been achieved through the simple method of ‘Lorentz
Completion’ on the lines of Sect.7.2 for the e.m. form factor of P-mesons; for more details, see [32].

As regards two-quark loops, such as for SU(2) mass splittings of P-mesons [33b], and the
mixing of $\rho$ and $\omega$ off-shell propagators [33a], the distinction between CIA [24] and CNPA [37]
is sharp, (no time-like momentum problems in the overlap integrals). The same holds for
one-quark loops, e.g., in the problem of vacuum condensates. For a review of these processes, as
well as for other references, see [32].





8 Retrospect And Conclusions 

In keeping with our objectives {A)-(C) defined at the outset, Sects. 1-2 have attempted a panoramic 
view of several standard approaches to 3D BSE reductions [6-9] under the general Bethe Second 
Principle Philosophy of effective quark-pair interaction. In particular, the relative unfamiliarity 
with MYTP [26] in the literature, especially its novel feature of effecting an exact 3D reduction
of the $q\bar q$ and qqq BSE’s, as well as exact reconstructions of the 4D amplitudes in closely parallel
fashions, has necessitated the introduction of some background techniques under one roof. To
that end Sect.3 collects a general derivation of the equations of motion in interlinked BSE-SDE 
form from an input 4-fermion Lagrangian for ‘current’ quarks, under MYTP conditions [25], much 
like the derivation of similar equations [10-12] without this constraint. And in preparation for the 
derivation of MYTP-governed equations in a Covariant LF/NP framework, Sect.4 collects some 
essential background material, especially the ‘angular condition’ [21], under Cov LF/NP conditions. 
With this background, Sect.5 outlines a comparative derivation of the MYTP-controlled 3D-4D 
interlinkage of $q\bar q$ Bethe-Salpeter amplitudes under both CIA [24] and Cov LF [37] conditions.
And in keeping with the parallelism between the 2- and 3-quark treatments, Sect.6 gives a similar
derivation for the qqq system. Now this twin facility, which does not seem to exist in the other
3D approaches [6-9,22-23], gives rise to a natural two-tier description [38], the 3D BSE form being 
appropriate for making contact with the hadron spectra [13], while the reconstructed 4D BSE yields 
a vertex function which allows the use of standard Feynman diagrams for 4D loop integrals. To 
appreciate why two distinct forms of MYTP [26] have been developed on parallel lines: Covariant 
Instantaneity Ansatz [24] (CIA), and Covariant Light-front [37] (Cov LF/NP), the advantage of 
the latter over the former in producing well-defined triangle loop integrals has been demonstrated 
in Sect. 7 through the examples of pion form factor and more general 3-hadron couplings, except 
for a (less serious) problem of dependence on the ‘null-plane orientation’ which can be handled 
through a simple device of ‘Lorentz completion’ and yields an explicitly Lorentz-invariant structure. 
Similarly the baryon-quark vertex function (6.3.10) is a key ingredient of MYTP, for the calculation 
of baryonic loop integrals, of which the baryon self-energy [60] is the simplest example, but the 
calculational details [60] have been omitted for brevity.

In keeping with the mathematical (formalistic) emphasis of this Article, we have refrained from
discussing the phenomenological applications, but it has been shown that the canvas of MYTP 
[26] is broad enough to accommodate additional physical principles. In particular, the physical 
basis chosen for detailed presentation, has been a QCD motivated 4-fermion Lagrangian (with 
an effective gluonic propagator) which generates the BSE-SDE structure by breaking its chiral 
symmetry dynamically (DB$\chi$S) [11-12], formulated within an MYTP [26] framework.

Clearly, the MYTP is a very powerful Principle which helps organize a whole spectrum of 
phenomena under a single umbrella. For its applications, only a few examples have been indicated,
but its potential warrants many more. More importantly, the interlinked 3D-4D structure of BS 
dynamics under MYTP [26] premises, gives it access to a whole range of physical phenomena, 
from spectroscopy to diverse types of loop integrals. The emphasis on the spectroscopy sector as 
an integral part of quark physics was first given by Feynman et al [39]. 


References 

[1] S.Tomonaga, Prog.Theo.Phys.1, 27 (1946); J.Schwinger, Phys.Rev.73, 416 (1948);
R.P.Feynman, Phys.Rev.76, 749, 769 (1949); F.J.Dyson, Phys.Rev.75, 486, 1736 (1949).

[2] I.Tamm, J.Phys.(U.S.S.R.), 9, 449 (1945); S.M.Dancoff, Phys.Rev.78, 382 (1950).

[3] H.A.Bethe and F.de Hoffmann, Mesons and Fields, II, Row, Peterson And Co, New York,
1955; p 199.

[4] C.Itzykson and J.-B.Zuber, Quantum Field Theory, McGraw-Hill Inc, New York, 1980;
Chapter 10.





[5] E.E.Salpeter and H.A.Bethe, Phys.Rev.84, 1232 (1951); M.Gell-Mann and F.E.Low, loc.cit, 
350. 

[6] (a) M. Levy, Phys.Rev.88, 72 (1952); (b) E.E.Salpeter, ibid 87, 328 (1952). 

[7] (a) A. Logunov and A.N.Tavkhelidze, Nuovo Cimento 29, 380 (1963); (b) V.R.Garsevanishvili,
et al, Phys.Lett.29B, 191 (1968); (c) R.N.Faustov, Ann. Phys.(N.Y.)78, 176 (1973). (d)
I.Todorov, Phys.Rev.D3, 2351 (1971)

[8] V. Kadychevsky, Nucl.Phys.B6, 125 (1968); 

[9] R. Blankenbecler and R. Sugar, Phys.Rev.142, 105 (1966). 

[10] Review: C.D.Roberts et al, Prog.Part.Nucl.Phys.33, 471 (1994). 

[11] Y. Nambu and G. Jona-Lasinio, Phys.Rev.122, 345 (1961).

[12] (a) S.L. Adler and A.C. Davies, Nucl.Phys.B244, 469 (1984); (b) A.Le Yaouanc et al, 
Phys.Rev.D29, 1233 (1984); D31, 317 (1985); (c) R.Delbourgo and M.D.Scadron, J.Phys.G 
5, 1621 (1979). 

[13] Particle Data Group, Phys.Rev.D54, July 1-Part I (1996).

[14] E.g., R.F.Meyer, Nucl.Phys.B71, 226 (1974). 

[15] P.A.M.Dirac, Can.J.Math.2, 129 (1950). 

[16] A.Komar, Phys.Rev.D18, 1887 (1978); L.P.Horowitz and F.Rohrlich, Phys.Rev.D24, 1528 
(1981); H.Crater and Van Alstine, Phys.Rev.D30, 2585 (1984); H.Sazdjan, Phys.Lett.156B, 
381 (1985). 

[17] (a) L.Lusanna; This Book; (b) P.P.Srivastava: This Book. 

[18] S. Weinberg, Phys.Rev.150, 1313 (1966) 

[19] S. Weinberg, Phys.Rev.133B, 232 (1964).

[20] P.A.M.Dirac, Rev.Mod.Phys.21, 392 (1949). 

[21] H.Leutwyler and J.Stern, Ann.Phys.(N.Y.)112, 94 (1978).

[22] (a) V.A. Karmanov, Nucl.Phys.B166,378 (1980). (b) Review: J.Carbonell et al, Phys.Rep. 
(1998); to appear. 

[23] R.J.Perry,A.Harindranath and K.Wilson, Phys.Rev.Lett,65, 2959 (1990). 

[24] A.N. Mitra and S. Bhatnagar, Int.J.Mod.Phys.A7, 121 (1992). 

[25] (a) Yu. L. Kalinowski et al, Phys.Lett.B231, 288 (1989); (b) Yu.L. Kalinovsky et al, Few-Body 
Syst.8, (1991); (c) V.N.Pervushin et al, Fortschritte der Physik38, N4 (1990). 

[26] (a) M.A. Markov, Sov.J.Phys.3, 452 (1940); (b) H. Yukawa, Phys.Rev.77, 219 (1950); (c) J. 
Lukierski and M. Oziewicz, Phys.Lett.B69, 339 (1977).

[27] (a) J.M. Cornwall et al, Phys.Rev.D10, 2428 (1974); (b) H. Kleinert, Phys.Lett.B26, 429
(1976); (c) D.W. McKay et al, Phys.Rev.D37, 195 (1988). 

[28] A.N. Mitra and B.M. Sodermark, Int.J.Mod.Phys.A9, 915 (1994). 

[29] a) A.Mittal and A.N.Mitra, Phys.Rev.Lett.57, 290 (1986); b) K.K. Gupta et al, 
Phys.Rev.D42, 1604 (1990); c) A. Sharma et al, Phys.Rev.D50, 454 (1994). 

[30] Reprint Coll: W.Buchmueller (ed), Quarkonia, North-Holland, 1992. 





[31] (a) M.A.Shifman et al, Nucl.Phys.B147, 385 (1979); (b) V.L.Chernyak and A.R.Zitnitsky, 
Phys.Rep.112C, 173 (1984); (c) B.L.Ioffe and A.V.Smigla, Nucl.Phys.B232, 109 (1984). 

[32] Review: A.N.Mitra, Proc.Ind.Natl.Sci.Acad.A; May-June 1999-in press. 

[33] A.N.Mitra and K.-C.Yang, Phys.Rev.C51, 3404 (1995); A.N.Mitra, Int.J.Mod.Phys.A11,
5245 (1996).

[34] A.N.Mitra, A.Pagnamenta and N.N.Singh, Phys.Rev.Lett.59, 2408 (1987); 

N.N.Singh and A.N.Mitra, Phys.Rev.38, 1454 (1988). 

[35] C.R.Ji and S.Cotanch, Phys.Rev.Lett.64, 1484 (1990). 

[36] S.R.Chaudhury et al, Delhi Univ. Preprint (1991)-Unpublished; I.Santhanam et al, 
Intl.J.Mod.Phys.E2, 219 (1993). 

[37] A.N.Mitra, LANL hep-ph/9812404; Phys.Lett.B463, 293 (1999) 

[38] Review: S.Chakrabarty et al, Prog.Part.Nucl.Phys.22, 43-180 (1989).

[39] R.P.Feynman, M.Kislinger and F.Ravndal, Phys.Rev.D3, 2706 (1971). 

[40] H.D. Politzer, Nucl.Phys.B117, 397 (1976). 

[41] B. A. Arbuzov et al, Mod.Phys.Lett.A5, 1441 (1990). 

[42] (a) GSI-ORANGE:T.Cowan et al, Phys.Rev.Lett.56, 444 (1986); H.Bokemeyer et al, report 
GSI-89-49, 1989; 

(b) GSI-EPOS:H.Tsertos et al, Z.Phys.A326, 2235 (1987); W.Koenig et al Phys.Lett.B218, 
12 (1989). 

[43] J.Von Neumann and E.P.Wigner, Z.Phys.30, 365 (1929). 

[44] (a) R.H.Dalitz, Proc. XIII Intl Conf. on HEP, Berkeley 1966; (b) O.W.Greenberg,
Phys.Rev.Lett.13, 564 (1964); (c) A.N.Mitra and R.Majumdar, Phys.Rev.150, 1194 (1966).

[45] (a) C.Becchi and G.Morpurgo, Phys.Rev.149, 1284 (1966); (b) A.N.Mitra and M.H.Ross, 
Phys.Rev.158, 1630 (1967). 

[46] (a) M.Bohm, H.Joos and M.Krammer, Nucl.Phys.B50, 397 (1973). (b) Y.S.Kim and M.Noz, 
Phys.Rev.D8, 3521 (1973) 

[47] (a) J.D.Bjorken, Phys.Rev.179, 1547 (1969); (b) R.P.Feynman, Phys.Rev.Lett.23, 1415 
(1969). 

[48] L.Susskind, Phys.Rev.165, 2537 (1968). 

[49] J.Kogut and D.Soper, Phys.Rev.D1, 2901 (1970); J.D.Bjorken et al, Phys.Rev.D3, 1382
(1971). 

[50] (a) K.Fujimura, T.Kobayasi and M.Namiki, Prog.Theo.Phys.44, 193 (1970); (b) A.Licht and
A.Pagnamenta, Phys.Rev.D2, 1150 (1970); (c) A.Le Yaouanc et al, Phys.Rev.D12, 2137
(1975); (d) A.N.Mitra and I.Kumari, Phys.Rev.D15, 261 (1977).

[51] S.Brodsky and G.Farrar, Phys.Rev.D11, 1309 (1975); V.A.Matveev et al, Lett.Nuovo Cim.7,
712 (1973). 

[52] A.J.McFarlane, Rev.Mod.Phys.34, 41 (1981). 

[53] A.N.Mitra, LANL hep-th/9803062; Intl.J.Mod.Phys.A14, 4781 (1999). 





[54] (a) A.N.Mitra, Zeits.f.Phys.C8, 25 (1981); (b) A.N.Mitra and I.Santhanam, Few-Body
Syst.12, 41 (1992); (c) R. Barbieri and E.Remiddi, Nucl.Phys.B141, 413 (1978); (d)
G.P.Lepage, SLAC-Preprint no.212 (1978).

[55] (a) M.Verde, Handbuch der Physik 39, 170 (1957); (b) A.N.Mitra and M.H.Ross,
Phys.Rev.158, 1670 (1967); (c) Yu.A.Simonov, Sov.J.Nucl.Phys.3, 461 (1966); (d) G.Karl
and E.Obryk, Nucl.Phys.B8, 609 (1968).

[56] D.B.Lichtenberg, Phys.Rev.178, 2197 (1969); A.N.Mitra and Anju Sharma,

Fortschr.Phys.45, 411-4^1 (1997). 

[57] (a) H.Kramer and M.Moshinsky, Nucl.Phys.82, 241 (1966); (b) A.N.Mitra et al, Few-Body
Syst.19, 1 (1995); (c) J.Bijtebier, Nuovo Cimento 81A, 423 (1985).

[58] (a) L.D.Faddeev, Sov.Phys.JETP 12, 1014 (1961); (b) A.N.Mitra, Nucl.Phys.32, 529 (1962); 
(c) C.Lovelace, Phys.Rev.B135, 1225 (1964); (d) S. Weinberg, Phys.Rev.B133, 232 (1964). 

[59] S.Huang and J.Tjon, Phys.Rev.C49, 1702 (1994); N.Ishii et al, Austr J Phys 50, 123 (1997); 
W.Bentz, J. Korean Phys. Soc.29 Suppl, 5352 (1996). 

[60] A.Sharma and A.N.Mitra, LANL hep-ph/9906288; Intl.J.Mod.Phys.A14, 4589 (1999). 

[61] L.S.Kisslinger and Z.Li, Phys.Rev.Lett.74, 2168 (1995). 

[62] (a) E.B.Dally et al, Phys.Rev.Lett.45, 232 (1980); (b) C.Bebek et al, Phys.Rev.D17, 1793
(1978). 

[63] G.Farrar and D. Jackson, Phys.Rev.Lett.43, 246 (1979). 

[64] W.Y.P.Hwang and A.N.Mitra, Few-Body Syst.15, 1 (1993)

[65] H.Pagels and S.Stokar, Phys.Rev.D20, 2947 (1979). 



32. The harmonic oscillator in quantum theory: 
A powerful bridge in physics 

Marcos Moshinsky 
Institute de Fisica-UNAM. 

Apartado Postal 20-364, 01000 Mexico, D. F. Mexico 


Abstract 

An overview is given of the extensive applicational potential of the Harmonic Oscillator
framework in tackling diverse problems in quantum theory, taking as basis the Author’s
recent book (with Yu F. Smirnov) “The harmonic oscillator in modern physics” (Harwood
Academic Publishers), which provides the necessary background for this Article.


1. Introduction 

The pivotal role of the Harmonic Oscillator (HO) as a basic tool for the development of Physics, 
especially since the birth of quantum mechanics, hardly needs any elaboration. As the very first 
example of application of quantization rules, successively through spectra, wave functions, symme- 
tries, and so on, it has had numerous applications not only in direct calculations, but also as a model 
for increasing our understanding of more complex problems. Indeed it pervades all dimensions of 
Field Theory in terms of both formulation and methodology. Its basic logic stems from the linearity 
of the field equations, or equivalently the quadratic structure of the Lagrangian/Hamiltonian, in 
the concerned field variable (^, 0) which, at the level of first quantization, corresponds precisely to 
the quantum mechanical wave function, satisfying a (linear) Schroedinger-like equation. Indeed the 
language of creation/annihilation operators in QFT is based on the very structure of the harmonic
oscillator which contains the co-ordinate/momentum (equivalently the field/conjugate variables in 
QFT) in equal proportions. For non-linear field equations too, the HO serves as a useful zero order 
basis, just as any quantum mechanical potential can be expanded in an HO basis. Finally, the 
HO forms the basis for Path Integral formulations, at the quantum mechanical as well as the field 
theoretic levels. 
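
The statement that the oscillator contains the co-ordinate and momentum in equal proportions can be made concrete in a truncated number basis. The following is a minimal sketch, assuming units with $\hbar = m = \omega = 1$ and a finite cutoff on the number of levels; the function names are hypothetical. It builds the annihilation and creation operators as matrices, forms $x = (a + a^\dagger)/\sqrt{2}$ and $p = (a - a^\dagger)/(i\sqrt{2})$, and checks that $H = (p^2 + x^2)/2$ is diagonal with entries n + 1/2. The last level is distorted by the truncation and is excluded from the check.

```python
import numpy as np


def ladder_operators(n_levels):
    """Annihilation and creation operators in a number basis truncated to n_levels states."""
    # a|n> = sqrt(n)|n-1>, so sqrt(1), sqrt(2), ... sit on the first superdiagonal.
    a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)
    return a, a.conj().T


def oscillator_hamiltonian(n_levels):
    """H = (p^2 + x^2)/2 built from x and p, in units hbar = m = omega = 1 (assumed)."""
    a, a_dag = ladder_operators(n_levels)
    x = (a + a_dag) / np.sqrt(2.0)
    p = (a - a_dag) / (1j * np.sqrt(2.0))
    return ((p @ p + x @ x) / 2.0).real


if __name__ == "__main__":
    h = oscillator_hamiltonian(8)
    # All but the last diagonal entry reproduce n + 1/2; the last one is a cutoff artifact.
    print(np.diag(h))
```

The same matrices for x and p can be used to expand any polynomial potential in the oscillator basis, with the caveat that entries near the cutoff are unreliable.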

The present Article aims to draw attention to the vast applicational potential of this unique
armour in the arsenal of physics through an illustrative list of topics selected from a recent book
entitled Harmonic Oscillator In Modern Physics (with Yu F Smirnov), which deals comprehensively
with the subject. To that end, Sect. 2 is a table of contents of the various topics addressed
in the book (which is listed sequentially as ref. [24] in this Article), while Sect. 3 is a brief sketch
of the subject-matter actually covered therein, without however going into their details. In Sect.4
some recent work of the author and collaborators on relativistic particles of arbitrary spin in an 
harmonic oscillator potential, is summarized as a further illustration of the applicational potential 
of harmonic oscillator techniques. 


Member of El Colegio Nacional; Email: moshi@fenix.ifisicacu.unam.mx 





2. Table of Contents of : 

The Harmonic Oscillator in Modern Physics 

I. THE ONE-BODY PROBLEM 

1. The radial wave function of the harmonic oscillator. 

2. The matrix elements of with respect to harmonic-oscillator states. The
coefficients

3. The one-electron atomic problem. Variational analysis of the ground state of the hydrogen
atom in terms of harmonic-oscillator states.

4. The one-electron molecular problem. The ground state of $H_2^+$ in terms of harmonic-oscillator
states.

5. Scattering of electrons by hydrogen atoms and the form factor of the electron charge distri- 
bution in the ground state. 

6. Theoretical form factor of the hydrogen atom using harmonic-oscillator states.

7. Direct determination of the ground state through a least-squares approach to the form factor.
The Pseudo-Hartree-Fock (PHF) approximation. 

8. The one-body harmonic-oscillator states expressed in terms of creation operators. 

9. Normalization coefficients of the harmonic-oscillator states. 

II. THE TWO-BODY PROBLEM 


10. Transformation brackets for two-particle harmonic-oscillator states. 


11. Applications of the transformation brackets to atomic problems. The helium atom. 


12. Applications of the transformation brackets to molecular problems. The H 2 and H^ molecules.


13. Matrix elements in j - j coupling. 


14. Application to two-particle problems in the 2s-1d nuclear shell. The $^{18}$O and $^{18}$F nuclei.


15. Transformation brackets for arbitrary angles in terms of standard transformation brackets.





III. THE THREE-BODY PROBLEM 

16. Matrix elements of the Hamiltonian with respect to translationally invariant states. 

17. Translationally invariant three-particle states of definite permutational symmetry. 

18. The general three-body problem. Applications to the lithium atom. 

19. Form factors of nuclei. 

20. Structure and form factor of $^3$H and $^3$He. a) Binding energy of $^3$H. b) Form factors of $^3$H
and $^3$He.

IV. THE FOUR-BODY PROBLEM 

21. Harmonic-oscillator states in the symmetrical system of relative coordinates. 

22. Transformation brackets between the states in the symmetrical and Jacobi coordinate system. 

23. Form factor for a linear combination of harmonic-oscillator states. Application to the a 
particle. 

V. THE n-BODY PROBLEM IN THE HARTREE-FOCK APPROXIMATION 

24. How good is the Hartree-Fock approximation? A simple model. 

25. The set of algebraic equations and their self-consistent solution. 

26. Hartree-Fock approximation with harmonic-oscillator states. The case of closed shells. 

27. Applications to the beryllium atom and the $^{16}$O nucleus.

VI. THE HARMONIC OSCILLATOR IN SCATTERING AND REACTION THEORY 

28. Scattering of a particle in a central potential. 

29. Solution of the equation of free motion in the harmonic oscillator basis. 

30. Calculation of phase shifts. 


31. Exactly soluble example. 





32. Phase shifts and resonances in the harmonic oscillator representation 

a) s-scattering on a Gaussian potential. 

b) s-scattering on a δ-shell potential.

c) Evidence of resonances in variational calculations in harmonic oscillator basis. 

VII. GROUP THEORY OF HARMONIC OSCILLATORS

33. Group theory of the one dimensional harmonic oscillator.

34. The Lie algebra of linear canonical transformations. 

35. The representation in quantum mechanics of the group of linear canonical transformations. 

36. Group theory of the n-dimensional harmonic oscillator. 

37. Lie algebra for systems of m oscillators of n dimensions. 

38. The group U(3) for systems of 1 and 2 particles, as well as for 3 but only in relative motion. 

a) States for the chain (38.1). 

b) States for the chain (38.3). 

39. Application of the U(3) states in the 2s-1d nuclear shell.

VIII. FOUR DIMENSIONAL HARMONIC OSCILLATOR AND THE COULOMB PROBLEM

40. Eigenvalues and eigenfunctions of the four dimensional harmonic oscillator. 

41. Kustaanheimo-Stiefel transformation from the four dimensional oscillator to the Coulomb problem.

42. Dynamic and invariance algebras of the two problems and their relations. 

IX. THE FIVE DIMENSIONAL OSCILLATOR AND NUCLEAR COLLECTIVE MOTIONS

43. The quadrupole liquid drop model and its classical Hamiltonian for small deformations.

44. The classical and quantum Hamiltonians for quadrupole deformations in the frame of reference fixed in the body.

45. States of the five dimensional oscillator characterized by irreducible representations of the chain of groups U(5) ⊃ O(3).

46. States associated with the chain U(5) ⊃ O(5) ⊃ O(3) in terms of traceless boson operators.





47. The expression of the states \klL]^ when L is even, in terms of the polynomials in the epd’s 
of creation operators 7}^ and of the a^- 

48. The polynomials Pkiiio^m) as function of the variables and the determination 

49. Operators for Hamiltonians and transition probabilities for the collective model of the nucleus 
and their matrix elements. 

50. Application to quadrupole collective motions in nuclei. 

a) Potential energy surfaces associated with F(am)- 

b) Reduction of the matrix elements of the kinetic energy type to those discussed in (45.15). 

c) Application to energy levels and transition probabilities in 

51. Extension of the analysis to odd angular momenta. 

X. THE SIX DIMENSIONAL OSCILLATOR AND THE INTERACTING BOSON MODEL 

52. The interacting boson model 

53. Chains of groups associated with the six dimensional oscillator. 

a) The chain U(6) ⊃ U(5) ⊃ O(5) ⊃ O(3).

b) The chain U(6) ⊃ O(6) ⊃ O(5) ⊃ O(3).

c) The chain U(6) ⊃ U(3) ⊃ O(3).

54. Transformation brackets between the states in the O(6) and U(5) chain of groups.

55. Matrix elements of Qm and with respect to the BIR of a U(5) chain of groups. 

56. The general Hamiltonian of the interacting boson model. 

57. Applications of the interacting boson model to problems of nuclear structure. 

XI. THE ONE BODY RELATIVISTIC OSCILLATOR

58. The Klein-Gordon equation with an oscillator interaction. 

59. The Dirac oscillator 

a) The symmetry Lie algebra of the Dirac oscillator.

60. The spinorial relativistic particle in an electromagnetic field. 

XII. THE TWO-BODY RELATIVISTIC OSCILLATOR 

61. The system of two non-interacting spinorial particles and the relativistic cockroach nest. 





62. The two-body system with a Dirac oscillator interaction.

a) Solution of Eq. (62.4) when $B = \beta_1\beta_2$.

b) Solution of Eq. (62.4) when $B = \beta_1\beta_2\gamma_{51}\gamma_{52}$.

c) Poincaré invariant form of the two particle system with a Dirac oscillator interaction.

d) Symmetry algebras and superalgebras of the two-body system with a Dirac oscillator 
interaction. 

63. Mass spectra of the particle-antiparticle system with a Dirac oscillator interaction. 

a) The Dirac oscillator equation for an antiparticle 

b) Equation for the particle-antiparticle system with a Dirac oscillator interaction and its 
perturbative solution 

c) Square of the mass spectra of the particle-antiparticle system 

d) Comparison with meson spectra 

64. Radial equation for the particle-antiparticle system and a qualitative application to mesons.

a) The radial wave equation. 

65. A relativistic two body problem with an interaction modulated by a constant tensor. 

66. Formulation of the oscillator problem through two independent, but constrained, relativistic equations.

a) The case of two scalar particles. 

b) The case of a scalar and spinorial particle. 

c) The case of two spinorial particles. 

XIII. THE n-BODY RELATIVISTIC OSCILLATOR


67. The non-relativistic problem for n-free scalar particles and its extension to a system with 
harmonic oscillator interactions. 


68. The system of three relativistic scalar particles with oscillator interactions. 

a) The spectra of the problem.

b) The wave function of the problem.

c) Poincaré invariance of the scalar three-body relativistic equation with oscillator interactions.

69. The system of three relativistic spinorial particles with a Dirac oscillator interaction.

a) Symmetries of the system of three quarks.

70. Spectra of the three quark system with Dirac oscillator interactions and application to the masses of non-strange baryons.

a) The Dirac oscillator case. b) An alternative approach to the spinorial relativistic three body problem.





3. Summary of The Harmonic Oscillator in Modern Physics

Our objective in this section is to give a brief outline of the main parts of this book, to indicate the previous knowledge required to understand each of them, and to suggest how to use the book more effectively.

This book has 13 Chapters with 70 sections, and is divided essentially into five parts. 

The first one (Chapters I to V), deals with applications of many body states with oscillator 
interactions, starting with one and ending with four particles, where the analysis is complete, and 
continuing with n-particles but only in the Hartree-Fock approximation. 

To understand these first five Chapters what is required is a standard course in quantum mechanics combined with ample knowledge of angular momentum theory. For the latter, the usual Clebsch-Gordan and Racah coefficients of the rotation group have been put in the 3j and 6j form, at present the more familiar notation, though occasionally this adds a few phase factors to the formulas. As for the permutation group, the analysis is self contained except for Eq. (17.15), which is derived in standard books on group theory, though it could also be accepted as the basis for Eqs. (17.16-18), and the reasoning continues to be self contained from there.

The second part of the book (Chapter VI) deals with the application of harmonic oscillator states to scattering problems where, at first sight, one would think that they play no role as they vanish at infinity. Approximate phase shifts for potential scattering are obtained explicitly, as well as variational procedures for using harmonic oscillator states in determining resonant levels. A standard knowledge of scattering theory, as given in a quantum mechanics course, is required.

In the third part (Chapter VII) a serious attempt is made to understand the group theory underlying the harmonic oscillator, starting with the simple case of the oscillator for one particle in one dimension and ending with m particles in n dimensions. The particular case of interest when n = 3 is discussed with reference to some applications to nuclear structure in the 2s-1d shell. We tried to make the analysis completely self-contained.

The fourth part (Chapters VIII, IX, X) deals respectively with four, five, and six dimensional oscillators and their application to the Coulomb problem, the Bohr-Mottelson collective nuclear model and the Interacting Boson Model (IBM). All that is required for their understanding is given in the previous sections, and in particular in Chapter VII, with the exception of section 42, where some more advanced group theoretical notions are required.

Finally the fifth part (Chapters XI, XII, XIII) deals with the relativistic many body problems with oscillator interactions, though the discussion is mainly restricted to systems of one, two and three particles, and applied to the mass spectra of mesons (quark-antiquark systems) and baryons (three quark systems). Knowledge of the elements of the special theory of relativity and, in quantum mechanics, of the Dirac and Klein-Gordon equations is assumed, though otherwise this part is again self contained.

In conclusion we stress that we have touched only on some aspects of the harmonic oscillator in modern physics related to our own work, or to that of those with whom we have come in personal contact.


4. The Relativistic Particle of Arbitrary Spin
In A Harmonic Oscillator Potential

In this section we briefly sketch an extension of the harmonic oscillator framework described in the foregoing to particles of arbitrary spin, which may be characterized by the chain of groups $SU(4) \supset [SU(2) \otimes SU(2)]$. The emphasis is on the symmetry group of the problem, which is the unitary symplectic subgroup $Sp(4)$ of $SU(4)$ (rather than $SU(4)$ itself). Since $Sp(4)$ is isomorphic to $O(5)$ we can replace it by the latter and write our equation in terms of the generators of the $O(5) \supset O(4) \supset O(3) \supset O(2)$ groups. These are more convenient as there are no multiplicity indices and the matrix elements of the generators can be given explicitly for an arbitrary irrep $(n_1 n_2)$ of $O(5)$. The analysis is applied variationally to particles in a harmonic oscillator potential corresponding to the irreps $(\tfrac12\tfrac12)$, $(11)$, $(10)$, $(\tfrac32\tfrac32)$. We start with a brief historical introduction to the problem.



4.1 Introduction 

The equation of a relativistic particle of spin 1/2 was proposed long ago by Dirac [1] and it had an enormous success in many applications. The extension of the equation to higher spins has given rise to a veritable flood of papers in the last 50 years. Dirac himself [2] and Fierz and Pauli [3] made proposals, but these were restricted by bothersome constraints. Bargmann and Wigner [4] started not with one but with a system of $n$ Dirac type equations and obtained a particle of spin $n/2$ by restricting the wave function to the symmetric solution under permutation. Kemmer [5] managed to obtain a Dirac type of equation but only for spins 0 and 1. In fact Mathews [6], under the strict restrictions with which he worked, argued that there could be no relativistic equations with spin higher than 1. Bhabha [7], on the other hand, again returned to the possibility of arbitrary spin, though connecting them later with representations of the SO(5) group as discussed by Krajcik and Nieto [8]. Weinberg [9] derived the Feynman rules for any spin, in which the propagators involve matrices that transform like symmetric traceless tensors of rank $2j$. Nikitin [10] and his collaborators deal elegantly with relativistic particles of arbitrary spin in Coulomb and magnetic monopole fields.

In view of the above references, and possibly hundreds more that seem less relevant, one could well ask if there is any reason to deal with the subject of a relativistic particle with arbitrary spin, with or without interaction. There were two main reasons for getting into this crowded field. The first one was the decision to follow the Barut approach [11], which he used [12] to get a single relativistic equation for a many body problem, and particularize it to a single particle, thus having only one position $\mathbf{r}$ and one momentum $\mathbf{p}$ vector but many $\alpha$'s and $\beta$'s in the equation. As each $\alpha$ is associated with spin 1/2, $n$ of them allow us to reach spins up to $(n/2)$. The second reason was due to noticing that the $\alpha$'s and $\beta$'s could be represented by a direct product of the ordinary spin and a new concept with the same properties as the latter, which was given the name of sign spin.

Thus the problem becomes very similar to the one in nuclear physics in which we have ordinary 
spin and isospin, and the main symmetry group goes from SU(2) to SU(4), where the latter gives 
rise to supermultiplets [13]. 

The formalism developed could be applied to any type of interaction but for simplicity we shall restrict ourselves to a harmonic oscillator potential.

Once we have the Hamiltonian operator of our equation appropriately formulated, we shall indicate a complete basis formed from the standard harmonic oscillator states in configuration space combined with the spin part, with the help of which this Hamiltonian can be transformed into a hermitian matrix with an infinite number of components. Restricting ourselves to a given maximum number of quanta in our oscillator, the matrix becomes finite, thus giving us the possibility of diagonalizing it to get the energy eigenvalues for different representations of the groups involved, which in turn contain a finite number of possibilities for spin. We shall discuss qualitatively the energy spectra when $V(r)$ is the harmonic oscillator potential.


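4.2 The Hamiltonian expressed in appropriate chains of groups

We use c.g.s. units, in which we shall indicate momenta and positions by

$$\mathbf{r}',\ \mathbf{p}' \qquad (1)$$

where we use this notation to reserve $\mathbf{r}, \mathbf{p}$ for more appropriate units. The Dirac equation for a spin $\tfrac12$ particle in an external field can be written as

$$[c\,\boldsymbol{\alpha}\cdot\mathbf{p}' + mc^2\beta + V(r')]\psi = E'\psi \qquad (2)$$

where

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma}\\ \boldsymbol{\sigma} & 0\end{pmatrix},\quad \beta = \begin{pmatrix} I & 0\\ 0 & -I\end{pmatrix} \qquad (3)$$

and $\boldsymbol{\sigma}$ is the vector of the 2×2 Pauli spin matrices.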

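Our next point is to note that the 4×4 matrices $\boldsymbol{\alpha}, \beta$ in (3) can be converted into direct products of 2×2 ones by introducing the definitions [1]

$$I = \begin{pmatrix}1&0\\0&1\end{pmatrix},\quad s_1 = t_1 = \frac12\begin{pmatrix}0&1\\1&0\end{pmatrix},\quad s_2 = t_2 = \frac12\begin{pmatrix}0&-i\\ i&0\end{pmatrix},\quad s_3 = t_3 = \frac12\begin{pmatrix}1&0\\0&-1\end{pmatrix} \qquad (4)$$

$$\boldsymbol{\alpha} = 4\,\mathbf{s}\otimes t_1,\quad \beta = 2\,I\otimes t_3 \qquad (5)$$

Using (5) we can then write (2) as

$$\Big[4c\sum_{i=1}^{3}(s_i\otimes t_1)\,p'_i + 2mc^2\,(I\otimes t_3) + V(r')\Big]\psi = E'\psi \qquad (6)$$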


where $s_i, t_i$, $i = 1,2,3$, are the standard matrices for ordinary and sign spins given in (4) and $\otimes$ stands for the direct product. $E'$ indicates the energy in c.g.s. units. When we want to go to a problem of larger spins [1] we introduce an index $u = 1,2,\dots,n$ for all the variables appearing in (6), sum the corresponding Hamiltonians and then make all $\mathbf{r}'_u, \mathbf{p}'_u$ equal to a single $\mathbf{r}', \mathbf{p}'$, thus getting an equation of the form

$$\Big\{\sum_{u=1}^{n}\Big[4c\sum_{i=1}^{3}(s_{iu}\otimes t_{1u})\,p'_i + 2mc^2\,(I\otimes t_{3u})\Big] + nV(r')\Big\}\psi = nE'\psi \qquad (7)$$

whose spin can range from $(n/2), (n/2)-1, \dots$ down to $(1/2)$ or 0.
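
The direct-product form (5) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes NumPy, and it places the sign-spin factor first in `np.kron` so that it labels the 2×2 blocks of (3); the names `alpha`, `beta` and `dirac_algebra_holds` are illustrative.

```python
import numpy as np

# Spin matrices of Eq. (4): s_i = t_i = (Pauli matrix)/2.
half_pauli = [0.5 * np.array(m, dtype=complex) for m in (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)]
s = t = half_pauli
I2 = np.eye(2, dtype=complex)

# Eq. (5): alpha_i = 4 s_i (x) t_1, beta = 2 I (x) t_3.
# Assumption: the sign-spin factor goes first in np.kron, so it labels the blocks.
alpha = [4 * np.kron(t[0], s[i]) for i in range(3)]
beta = 2 * np.kron(t[2], I2)


def anticommutator(a, b):
    return a @ b + b @ a


def dirac_algebra_holds():
    """Check beta^2 = 1, {alpha_i, beta} = 0 and {alpha_i, alpha_j} = 2 delta_ij."""
    I4 = np.eye(4)
    ok = np.allclose(beta @ beta, I4)
    for i in range(3):
        ok = ok and np.allclose(anticommutator(alpha[i], beta), 0)
        for j in range(3):
            target = 2 * (i == j) * I4
            ok = ok and np.allclose(anticommutator(alpha[i], alpha[j]), target)
    return bool(ok)
```

With this ordering `beta` comes out as diag(1, 1, -1, -1) and each `alpha[i]` has the Pauli matrix in the off-diagonal blocks, as in (3).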
Now we define

$$S_i = \sum_{u=1}^{n}(s_{iu}\otimes I),\quad R_{ij} = \sum_{u=1}^{n}(s_{iu}\otimes t_{ju}),\quad T_i = \sum_{u=1}^{n}(I\otimes t_{iu}) \qquad (8)$$

with $S_i, T_i$, $i = 1,2,3$, being respectively the components of the total ordinary and sign spins, which together with the nine $R_{ij}$ are the 15 generators of the SU(4) group, as shown by their commutation relations [14,15,16]


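$$\begin{aligned}
[S_i, S_j] &= i\epsilon_{ijk}S_k, & [T_i, T_j] &= i\epsilon_{ijk}T_k, & [S_i, T_j] &= 0,\\
[S_i, R_{jk}] &= i\epsilon_{ijl}R_{lk}, & [T_i, R_{jk}] &= i\epsilon_{ikl}R_{jl}, & &\\
[R_{ij}, R_{kl}] &= \tfrac{i}{4}\,\epsilon_{ikm}S_m\,\delta_{jl} + \tfrac{i}{4}\,\delta_{ik}\,\epsilon_{jln}T_n & & & &
\end{aligned} \qquad (9)$$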
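
As a hedged numerical cross-check of the relations (9), the sketch below builds $S_i$, $R_{ij}$, $T_i$ from the definitions (8) as explicit Kronecker products and measures the largest deviation from (9). It assumes NumPy; the helper names `su4_generators` and `max_violation` are illustrative and do not come from the original text.

```python
import numpy as np
from functools import reduce

HALF = [0.5 * np.array(m, dtype=complex) for m in (
    [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2 = np.eye(2, dtype=complex)


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2


def su4_generators(n):
    """Return (S, R, T) of Eq. (8) for n constituents, as 4**n x 4**n matrices."""
    def embed(op, u):
        factors = [np.eye(4, dtype=complex)] * n
        factors[u] = op
        return reduce(np.kron, factors)

    def total(op):
        return sum(embed(op, u) for u in range(n))

    S = [total(np.kron(HALF[i], I2)) for i in range(3)]
    T = [total(np.kron(I2, HALF[i])) for i in range(3)]
    R = [[total(np.kron(HALF[i], HALF[j])) for j in range(3)] for i in range(3)]
    return S, R, T


def max_violation(n):
    """Largest absolute deviation from the commutation relations (9)."""
    S, R, T = su4_generators(n)
    rng = range(3)

    def comm(a, b):
        return a @ b - b @ a

    errs = []
    for i in rng:
        for j in rng:
            errs.append(comm(S[i], T[j]))
            errs.append(comm(S[i], S[j]) - 1j * sum(eps(i, j, k) * S[k] for k in rng))
            errs.append(comm(T[i], T[j]) - 1j * sum(eps(i, j, k) * T[k] for k in rng))
            for k in rng:
                errs.append(comm(S[i], R[j][k]) - 1j * sum(eps(i, j, l) * R[l][k] for l in rng))
                errs.append(comm(T[i], R[j][k]) - 1j * sum(eps(i, k, l) * R[j][l] for l in rng))
                for l in rng:
                    rhs = 0.25j * (j == l) * sum(eps(i, k, m) * S[m] for m in rng)
                    rhs = rhs + 0.25j * (i == k) * sum(eps(j, l, m) * T[m] for m in rng)
                    errs.append(comm(R[i][j], R[k][l]) - rhs)
    return float(max(np.abs(e).max() for e in errs))
```

Calling `max_violation(1)` or `max_violation(2)` should return a value at the level of rounding error if (9) is satisfied.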


Using the definitions (8) we can rewrite Eq. (7) as

$$\Big[4c\sum_{i=1}^{3}R_{i1}\,p'_i + 2mc^2\,T_3 + nV(r')\Big]\psi = nE'\psi \qquad (10)$$

This last equation cannot, in general, be solved exactly and thus we need a convenient complete basis in which to express the operator in the square bracket in (10) as a numerical matrix.

The first requirement concerns a basis for the ordinary and sign spins. In [14] we characterized them by the chain of groups $U(4) \supset [SU(2)\otimes SU(2)]$, familiar in nuclear physics [13] when we combine the ordinary spin with the isospin to get supermultiplets. The states can then be expressed by the kets $|\{h\}\gamma, s\sigma, t\tau\rangle$, where $\{h\} = [h_1h_2h_3h_4]$ is the partition of $n$ corresponding to the representation of U(4), while $s(s+1)$, $t(t+1)$ are the eigenvalues of the Casimir operators associated with the SU(2) of ordinary and sign spins, and $\sigma, \tau$ characterize the corresponding orthogonal O(2) subgroups of SU(2).

We notice that the ket in the previous paragraph has an extra index $\gamma$ that serves to distinguish representations of $SU(2)\otimes SU(2)$ that appear more than once in a given representation of SU(4). This feature greatly complicates the general representation of $R_{ij}$ in the basis of the ket given above. Thus we decided to follow another chain, using the fact that SU(4) is isomorphic to the orthogonal group O(6).

The generators of O(6) can be characterized by the antisymmetric operators $\Lambda_{mm'} = -\Lambda_{m'm}$ with $m, m' = 1, 2, 3, 4, 5, 6$, and thus there are 15 of them, which satisfy the commutation rules [17]

$$[\Lambda_{mm'}, \Lambda_{nn'}] = i\,[\delta_{m'n}\Lambda_{n'm} + \delta_{mn'}\Lambda_{nm'} + \delta_{mn}\Lambda_{m'n'} + \delta_{m'n'}\Lambda_{mn}] \qquad (11)$$

Comparing them with the commutation rules Eq. (3.11) of Ref. [14] we easily see that the $\Lambda_{mm'}$ with $m < m'$ (to avoid the repetition due to the antisymmetry) are correlated with $S_i, R_{ij}, T_j$, $i, j = 1, 2, 3$, in the following way:

$$\tfrac12\epsilon_{ijk}\Lambda_{jk} = S_i,\quad \Lambda_{i4} = 2R_{i1},\quad \Lambda_{i5} = 2R_{i2},\quad \Lambda_{i6} = 2R_{i3},\quad \Lambda_{45} = T_3,\quad \Lambda_{46} = -T_2,\quad \Lambda_{56} = T_1 \qquad (12)$$

where $i, j, k$ take the values 1,2,3 and repeated indices are summed over these values.
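
The correspondence (12) can be tested against the rule (11) in the simplest case $n = 1$, where all generators are 4×4 matrices. This is only a sketch under stated assumptions: it uses NumPy, zero-based indices (so $\Lambda_{45}$ is `lam[3][4]`), and the illustrative names `o6_generators` and `rule_11_violation`.

```python
import numpy as np
from itertools import product

HALF = [0.5 * np.array(m, dtype=complex) for m in (
    [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2 = np.eye(2, dtype=complex)

# Generators of Eq. (8) for a single constituent (n = 1).
S = [np.kron(h, I2) for h in HALF]
T = [np.kron(I2, h) for h in HALF]
R = [[np.kron(a, b) for b in HALF] for a in HALF]


def o6_generators():
    """Antisymmetric array lam[m][m'] built from the correspondence (12)."""
    lam = [[np.zeros((4, 4), dtype=complex) for _ in range(6)] for _ in range(6)]

    def put(m, mp, op):
        lam[m][mp] = op
        lam[mp][m] = -op

    put(0, 1, S[2]); put(1, 2, S[0]); put(2, 0, S[1])   # (1/2) eps_ijk Lambda_jk = S_i
    for i in range(3):
        for j in range(3):
            put(i, 3 + j, 2 * R[i][j])                  # Lambda_i4, Lambda_i5, Lambda_i6
    put(3, 4, T[2]); put(3, 5, -T[1]); put(4, 5, T[0])  # Lambda_45, Lambda_46, Lambda_56
    return lam


def rule_11_violation(lam):
    """Largest absolute deviation from the commutation rule (11)."""
    def d(a, b):
        return float(a == b)

    worst = 0.0
    for m, mp, n, np_ in product(range(6), repeat=4):
        lhs = lam[m][mp] @ lam[n][np_] - lam[n][np_] @ lam[m][mp]
        rhs = 1j * (d(mp, n) * lam[np_][m] + d(m, np_) * lam[n][mp]
                    + d(m, n) * lam[mp][np_] + d(mp, np_) * lam[m][n])
        worst = max(worst, float(np.abs(lhs - rhs).max()))
    return worst
```

A result of `rule_11_violation(o6_generators())` at rounding-error level indicates that the signs in (12), including $\Lambda_{46} = -T_2$, are mutually consistent with (9) and (11).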

Now O(6) has the following chain of subgroups $O(6)\supset O(5)\supset O(4)\supset O(3)\supset O(2)$, whose generators in terms of the operators (8) are given by

$$\begin{array}{rlll}
15 & S_i, R_{ij}, T_i & i, j = 1,2,3 & O(6)\\
10 & S_i, R_{i1}, R_{i2}, T_3 & i = 1,2,3 & O(5)\\
6 & S_i, R_{i1} & i = 1,2,3 & O(4)\\
3 & S_i & i = 1,2,3 & O(3)\\
1 & S_3 & & O(2)
\end{array} \qquad (13)$$

where on the left hand side we give the number of generators. Using (9), we easily check that the generators of each subgroup close under commutation.

We note now that in Eq. (10) only $R_{i1}$ and $T_3$ appear, so we can restrict ourselves to O(5) as the symmetry group. Nevertheless we consider the representations starting from O(6), as we would like to characterize our kets also by the $\{h\}$, which is the partition of $n$ that was mentioned above and characterizes the irreducible representation of U(4).

To achieve the purpose of the last paragraph we note that

$$\Lambda_{12} = S_3,\quad \Lambda_{45} = T_3,\quad \Lambda_{36} = 2R_{33} \qquad (14)$$

commute among themselves, as seen from the relations (9) or (11). They could then be considered as the three weight generators [18] of the O(6) group, while the 12 that remain can be divided into two groups of 6 each, corresponding to the raising and lowering generators of the group mentioned. If we consider the state of highest weight [19], which is an eigenstate of the operators (14), the corresponding eigenvalues can be denoted by

$$q_1,\ q_2,\ q_3 \qquad (15)$$

The generators of U(4) can in turn be denoted by

$$C_r^{r'};\qquad r<r',\quad r=r',\quad r>r' \qquad (16)$$




where $r, r' = 1,2,3,4$ and we indicate the raising, weight and lowering generators. Again, if we have a state of highest weight it would be an eigenstate of $C_1^1, C_2^2, C_3^3, C_4^4$, and their eigenvalues

$$h_1,\ h_2,\ h_3,\ h_4 \qquad (17)$$

characterize the irreducible representations of U(4). In table II of reference [16] we give, in spherical component form, the $S_i, T_i, R_{ij}$ as linear functions of the $C_r^{r'}$ and, in particular, for the weight generators, where the index 0 is equivalent to the index 3 of cartesian components, we have that (15) are related to (17) by [16]

$$q_1 = \tfrac12(h_1+h_2-h_3-h_4),\quad q_2 = \tfrac12(h_1-h_2+h_3-h_4),\quad q_3 = \tfrac12(h_1-h_2-h_3+h_4) \qquad (18)$$

As $h_1+h_2+h_3+h_4 = n$ with $h_1\ge h_2\ge h_3\ge h_4$, we see that $q_1, q_2, q_3$ can be integer or semi-integer numbers depending on whether $n$ is even or odd, and furthermore $q_1\ge q_2\ge q_3$ with $q_1, q_2$ being non-negative while, in some cases, $q_3$ could also take negative values.
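
The map (18) from a U(4) partition to the O(6) labels is simple enough to sketch directly. The function name `o6_labels` is illustrative, and exact rational arithmetic with `fractions.Fraction` is a choice made here so that semi-integer values are not blurred by floating point.

```python
from fractions import Fraction


def o6_labels(partition):
    """Return (q1, q2, q3) of Eq. (18) for a U(4) partition {h} with at most 4 rows."""
    h = tuple(partition) + (0,) * (4 - len(partition))
    h1, h2, h3, h4 = h
    half = Fraction(1, 2)
    return (half * (h1 + h2 - h3 - h4),
            half * (h1 - h2 + h3 - h4),
            half * (h1 - h2 - h3 + h4))
```

For example, `o6_labels((1, 1))` gives `(1, 0, 0)`, while `o6_labels((1, 1, 1))` has a negative third label, illustrating the remark that $q_3$ can be negative.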

Having established the relation between the irreducible representations (irreps) of O(6) and U(4), we turn our attention to O(5) which, as we indicated before, is a smaller symmetry group for the Hamiltonian in Eq. (10) as, using (12), it can be written in the form

$$\Big[2c\sum_{i=1}^{3}\Lambda_{i4}\,p'_i + 2mc^2\,\Lambda_{45} + nV(r')\Big]\psi = nE'\psi. \qquad (19)$$

4.3 Matrix elements of the generators $\Lambda_{45}$, $\Lambda_{i4}$, $i = 1, 2, 3$ in a basis of irreps
in the chain $O(5)\supset O(4)\supset O(3)\supset O(2)$

As is well known [20], the irreps of $O(2k+1)$ and $O(2k)$ are characterized by partitions involving only $k$ numbers that can be integer or semi-integer and non-negative, except for the last one in the even case, which sometimes can be negative.

Rather than discussing the general theory analyzed in references [20], we shall restrict our analysis to the chain of orthogonal groups that appear in the title of this section, where the irreps will be denoted as follows:

$$\begin{array}{ll}
O(5): & n_1, n_2\\
O(4): & m_1, m_2\\
O(3): & s\\
O(2): & \sigma
\end{array} \qquad (20)$$

As O(5) is a subgroup of O(6), $n_1, n_2$ are restricted by the inequalities [20]

$$q_1\ge n_1\ge q_2\ge n_2\ge |q_3|. \qquad (21)$$

Turning now our attention to O(4), $m_1, m_2$ are restricted by the inequalities [20]

$$n_1\ge m_1\ge n_2\ge |m_2|. \qquad (22)$$

For O(3) we have the single number $s$ restricted by [20,21]

$$m_1\ge s\ge |m_2|. \qquad (23)$$

Finally $\sigma$ of O(2) is restricted by $|\sigma|\le s$, which implies that it is given by [21]

$$\sigma = s, s-1, \dots, -s+1, -s \qquad (24)$$

as all the values indicated can only change by one unit at a time within the limits indicated in the inequalities. We note then that the integer or semi-integer character of the representation $(q_1, q_2, q_3)$ of O(6) propagates to all of its subgroups.
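
The branching rules (22) to (24) can be turned into a direct enumeration of the basis states of an O(5) irrep. The sketch below is illustrative only: the names `unit_steps`, `o5_basis` and `o5_dimension` are not from the original text, and the closed-form dimension used for comparison is the standard Weyl formula for O(5), quoted here as an outside fact rather than taken from this section.

```python
from fractions import Fraction


def unit_steps(lo, hi):
    """Values lo, lo+1, ..., hi (empty if hi < lo)."""
    out, x = [], lo
    while x <= hi:
        out.append(x)
        x += 1
    return out


def o5_basis(n1, n2):
    """All labels (m1, m2, s, sigma) allowed by the inequalities (22)-(24)."""
    n1, n2 = Fraction(n1), Fraction(n2)
    return [(m1, m2, s, sigma)
            for m1 in unit_steps(n2, n1)
            for m2 in unit_steps(-n2, n2)
            for s in unit_steps(abs(m2), m1)
            for sigma in unit_steps(-s, s)]


def o5_dimension(n1, n2):
    """Weyl dimension formula for the O(5) irrep (n1, n2) (standard result, assumed here)."""
    n1, n2 = Fraction(n1), Fraction(n2)
    return (2 * n1 + 3) * (2 * n2 + 1) * (n1 - n2 + 1) * (n1 + n2 + 2) / 6
```

For the four irreps used in the examples of this section, the count of enumerated states can be compared with `o5_dimension`; for instance `len(o5_basis(1, 0))` is 5, the dimension of the vector representation.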





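The kets for the spin part of the $O(5)\supset O(4)\supset O(3)\supset O(2)$ chain of groups can be denoted by

$$\left|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\\ \sigma\end{array}\right\rangle \qquad (25)$$

and the matrix elements of $\Lambda_{45}, \Lambda_{34}$ with respect to them have been calculated in references [22,23]. Before giving them explicitly here, we note that $\Lambda_{i4}$ is a Racah tensor of order 1 with respect to the O(3) group and, in particular, $\Lambda_{34}$ corresponds to the component 0 of this tensor, so we have by the Wigner-Eckart theorem that [21]

$$\left\langle\begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s'\\ \sigma'\end{array}\right|\Lambda_{34}\left|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\\ \sigma\end{array}\right\rangle = \langle s\sigma, 10|s'\sigma'\rangle \left\langle\begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s'\end{array}\right\|\Lambda_{4}\left\|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\right\rangle \qquad (26)$$

where $\langle\,|\,\rangle$ is a standard O(3) Clebsch-Gordan coefficient. Thus for $\Lambda_{14}, \Lambda_{24}$ we need only the reduced matrix element on the right hand side of (26), and its explicit value, together with that of $\Lambda_{45}$, is given below [22,23]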


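$$\begin{aligned}
\left\langle\begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s\end{array}\right|\Lambda_{45}\left|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\right\rangle
&= \frac12\sqrt{\frac{(m_1-s+1)(m_1+s+2)(n_1-m_1)(n_1+m_1+3)(m_1-n_2+1)(m_1+n_2+2)}{(m_1+m_2+1)(m_1+m_2+2)(m_1-m_2+1)(m_1-m_2+2)}}\;\delta_{m_1',m_1+1}\,\delta_{m_2'm_2}\\
&+ \frac12\sqrt{\frac{(s-m_2)(s+m_2+1)(n_2-m_2)(n_2+m_2+1)(n_1-m_2+1)(n_1+m_2+2)}{(m_1+m_2+2)(m_1+m_2+1)(m_1-m_2)(m_1-m_2+1)}}\;\delta_{m_1'm_1}\,\delta_{m_2',m_2+1}\\
&+ \frac12\sqrt{\frac{(s+m_1+1)(m_1-s)(n_1-m_1+1)(n_1+m_1+2)(m_1-n_2)(m_1+n_2+1)}{(m_1+m_2)(m_1+m_2+1)(m_1-m_2)(m_1-m_2+1)}}\;\delta_{m_1',m_1-1}\,\delta_{m_2'm_2}\\
&+ \frac12\sqrt{\frac{(s-m_2+1)(s+m_2)(n_2-m_2+1)(n_2+m_2)(n_1-m_2+2)(n_1+m_2+1)}{(m_1+m_2)(m_1+m_2+1)(m_1-m_2+2)(m_1-m_2+1)}}\;\delta_{m_1'm_1}\,\delta_{m_2',m_2-1}
\end{aligned} \qquad (27)$$

$$\begin{aligned}
\left\langle\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s'\end{array}\right\|\Lambda_{4}\left\|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\right\rangle
&= -\sqrt{\frac{(m_1-s)(m_1+s+2)(s-m_2+1)(s+m_2+1)}{(s+1)(2s+3)}}\;\delta_{s',s+1}
+ \frac{(m_1+1)\,m_2}{\sqrt{s(s+1)}}\;\delta_{s's}\\
&+ \sqrt{\frac{(m_1-s+1)(m_1+s+1)(s-m_2)(s+m_2)}{(2s-1)\,s}}\;\delta_{s',s-1}
\end{aligned} \qquad (28)$$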

Note now that in Eq. (19) only $\Lambda_{i4}, \Lambda_{45}$ appear, which are generators of O(5), and thus $(n_1, n_2)$, which give the irrep of O(5), are integrals of motion for the Hamiltonian operator in the square bracket of (19).
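
A partial numerical check of the $\Lambda_{45}$ matrix elements is possible for small irreps: since $\Lambda_{45}$ is one of three commuting weight generators, its eigenvalues in a given irrep must reproduce the weights of that irrep. The sketch below is an illustration under stated assumptions, not the method of references [22,23]: it uses only the moduli of the two raising terms (changing $m_1$ or $m_2$ by $+1$) and fills the lowering terms by symmetry, it assumes all phases are positive (which does not affect the eigenvalues for the small irreps tested), and the names `o5_states` and `lambda45_matrix` are hypothetical.

```python
import numpy as np
from fractions import Fraction
from math import sqrt


def unit_steps(lo, hi):
    """Values lo, lo+1, ..., hi (empty if hi < lo)."""
    out, x = [], lo
    while x <= hi:
        out.append(x)
        x += 1
    return out


def o5_states(n1, n2):
    """Labels (m1, m2, s) allowed by (22)-(23); sigma is omitted (Lambda_45 is diagonal in it)."""
    return [(m1, m2, s)
            for m1 in unit_steps(n2, n1)
            for m2 in unit_steps(-n2, n2)
            for s in unit_steps(abs(m2), m1)]


def lambda45_matrix(n1, n2):
    """Real symmetric matrix of Lambda_45 in the irrep (n1, n2), phases assumed positive."""
    n1, n2 = Fraction(n1), Fraction(n2)
    states = o5_states(n1, n2)
    index = {st: k for k, st in enumerate(states)}
    mat = np.zeros((len(states), len(states)))
    for (m1, m2, s), k in index.items():
        up1 = index.get((m1 + 1, m2, s))      # term with m1 -> m1 + 1
        if up1 is not None:
            num = ((m1 - s + 1) * (m1 + s + 2) * (n1 - m1) * (n1 + m1 + 3)
                   * (m1 - n2 + 1) * (m1 + n2 + 2))
            den = ((m1 + m2 + 1) * (m1 + m2 + 2) * (m1 - m2 + 1) * (m1 - m2 + 2))
            mat[up1, k] = mat[k, up1] = 0.5 * sqrt(num / den)
        up2 = index.get((m1, m2 + 1, s))      # term with m2 -> m2 + 1
        if up2 is not None:
            num = ((s - m2) * (s + m2 + 1) * (n2 - m2) * (n2 + m2 + 1)
                   * (n1 - m2 + 1) * (n1 + m2 + 2))
            den = ((m1 + m2 + 2) * (m1 + m2 + 1) * (m1 - m2) * (m1 - m2 + 1))
            mat[up2, k] = mat[k, up2] = 0.5 * sqrt(num / den)
    return mat
```

For the vector irrep, `np.linalg.eigvalsh(lambda45_matrix(1, 0))` gives eigenvalues -1, 0, 1 on the three $(m_1, m_2, s)$ labels, which with the $\sigma$ degeneracy restored is the expected weight content of a five-dimensional vector.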





4.4 The complete set of variational states and the matrix elements of our 
Hamiltonian with respect to them. 

So far we have not mentioned that part of our state that is a function of $\mathbf{r}'$ in our configuration space. As, in general, Eq. (19) does not admit an exact solution, we choose the simplest set of states, that of the harmonic oscillator, to carry out our analysis variationally. As the frequency $\omega$ of the oscillator is our only parameter, we can introduce it in the Hamiltonian [24] and thus consider only states of frequency 1 given by the ket

$$|Nl\mu\rangle = R_{Nl}(r)\,Y_{l\mu}(\theta,\varphi) \qquad (29)$$

with $Y$ being a spherical harmonic where $l$ is the orbital angular momentum, while $R$ is the radial part of the ket characterized by the number of quanta $N$.

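As the total angular momentum

$$\mathbf{J} = \mathbf{L} + \mathbf{S},\quad \mathbf{L} = \mathbf{r}\times\mathbf{p} \qquad (30)$$

is obviously an integral of motion, we can write our full ket as

$$\left|N\Big(l, \begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\Big) jm\right\rangle = \sum_{\mu,\sigma}\langle l\mu, s\sigma|jm\rangle\,|Nl\mu\rangle\left|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\\ \sigma\end{array}\right\rangle \qquad (31)$$

where $\langle\,|\,\rangle$ is a Clebsch-Gordan coefficient.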

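We now have to apply standard Racah algebra [21] to the state (31) and we get the following result

$$\begin{aligned}
&\left\langle N'\Big(l', \begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s'\end{array}\Big) jm\right| 2c\sum_{i=1}^{3}\Lambda_{i4}\,p'_i + 2mc^2\Lambda_{45} + nV(r') \left|N\Big(l, \begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\Big) jm\right\rangle\\
&\quad = 2c\,(-1)^{l'+s-j}\,W(ll'ss';1j)\,[(2l'+1)(2s'+1)]^{1/2}\,\langle N'l'\|p'\|Nl\rangle \left\langle\begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s'\end{array}\right\|\Lambda_{4}\left\|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\right\rangle\\
&\quad + 2mc^2\,\delta_{N'N}\,\delta_{l'l}\,\delta_{s's}\left\langle\begin{array}{c} n_1 n_2\\ m_1' m_2'\\ s\end{array}\right|\Lambda_{45}\left|\begin{array}{c} n_1 n_2\\ m_1 m_2\\ s\end{array}\right\rangle
+ n\,\delta_{l'l}\,\delta_{m_1'm_1}\,\delta_{m_2'm_2}\,\delta_{s's}\,\langle N'l'\|V(r')\|Nl\rangle
\end{aligned} \qquad (32)$$

where we have assumed that $V(r)$ is only a function of the magnitude of $\mathbf{r}$.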

In (32) the $W$ is a Racah coefficient and the reduced matrix elements of $\Lambda_4, \Lambda_{45}$ are given in (27), (28). To get the full matrix representation of our Hamiltonian we then need only determine the reduced matrix elements of $p'$ and $V(r')$, which will be discussed in the next section, where we shall introduce more appropriate units.


4.5 Example: The energy spectra of a relativistic harmonic oscillator 
corresponding to a definite irrep of 0(5) 

The equation is now (19), where

$$V(r') = \tfrac12\,m\Omega^2 r'^2 \qquad (33)$$

with $m$ the mass and $\Omega$ the frequency of the oscillator, all in c.g.s. units.

We shall introduce dimensionless units by dividing (19) by $\hbar\Omega$, and defining

$$E = (\hbar\Omega)^{-1}(E' - mc^2) \qquad (34)$$

We now note that $\mathbf{r}', \mathbf{p}'$ are in c.g.s. units associated with a frequency $\omega$ of the variational states. Instead of having $\omega$ in the state (29) we introduce it in the Hamiltonian by replacing $\mathbf{r}', \mathbf{p}'$ by dimensionless expressions through the relation

$$\mathbf{r}' = (\hbar/m\omega)^{1/2}\,\mathbf{r},\quad \mathbf{p}' = (m\omega\hbar)^{1/2}\,\mathbf{p} \qquad (35)$$




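The equation (19) can then be written as

$$\Big[2a\varepsilon\sum_{i=1}^{3}(\Lambda_{i4}\,p_i) + a^2(2\Lambda_{45} - n) + \frac{n}{2\varepsilon^2}\,r^2\Big]\psi = nE\psi \qquad (36)$$

where we subtracted the total rest energy and

$$a = \Big(\frac{mc^2}{\hbar\Omega}\Big)^{1/2},\quad \varepsilon = \Big(\frac{\omega}{\Omega}\Big)^{1/2} \qquad (37)$$

and $\varepsilon$ will now be the variational parameter against which we plot the energy $E$.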

The matrix representation of the left hand side of Eq. (36) can be obtained in a way similar to (32), replacing $\mathbf{r}', \mathbf{p}'$ by $\mathbf{r}, \mathbf{p}$, so we need only introduce in (32) the well-known [24,25] reduced matrix elements


(N'l'IbllM) 

= [(iV 4 / + 3)'/='5iV'/v+i + {N- 
4 ^(IV’ — 1 4 2)^^^(5jv'jv+i 4 (iV 4/4 l)^^^d^<jv-ij \ 


{ 1 4 1 ) 


(21 4 3) 


Si‘ 


/41 


( 21 - 1 ) 


5v 


i-i 


$$\langle N'l'\|r^2\|Nl\rangle = \Big\{-\tfrac12[(N-l)(N+l+1)]^{1/2}\,\delta_{N',N-2} + (N+\tfrac32)\,\delta_{N'N} - \tfrac12[(N-l+2)(N+l+3)]^{1/2}\,\delta_{N',N+2}\Big\}\,\delta_{l'l}$$


(38) 


From (27), (28), (32) and (38), (39) we can then write the matrix representation of the operator on the left hand side of (36), in which the irrep $(n_1 n_2)$ of O(5) is an integral of motion together with the total angular momentum $j$. This implies that in (32), $(n_1 n_2)$ and $j$ are fixed and the same in bra and ket.
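
The role of the truncation in $N$ and of the variational parameter $\varepsilon$ can be illustrated on a simpler stand-in problem. The sketch below is not the relativistic calculation of this section: it assumes the non-relativistic Hamiltonian $\tfrac12\varepsilon^2 p^2 + r^2/(2\varepsilon^2)$ as an illustrative analogue, uses the $r^2$ matrix elements quoted above for fixed $l$, and obtains the $p^2$ matrix from the fact that $\tfrac12(p^2+r^2)$ is diagonal with eigenvalue $N+\tfrac32$ in this basis. The function names are hypothetical.

```python
import numpy as np


def r2_matrix(l, n_max):
    """Matrix of r^2 in oscillator states N = l, l+2, ..., up to n_max (fixed l)."""
    N = np.arange(l, n_max + 1, 2)
    mat = np.diag(N + 1.5)
    off = -0.5 * np.sqrt((N[:-1] - l + 2) * (N[:-1] + l + 3))
    return mat + np.diag(off, 1) + np.diag(off, -1)


def p2_matrix(l, n_max):
    """Matrix of p^2, using p^2 + r^2 = 2N + 3 (diagonal) in the same basis."""
    r2 = r2_matrix(l, n_max)
    return 2 * np.diag(np.diag(r2)) - r2


def nonrel_levels(eps, l=0, n_max=60):
    """Eigenvalues of the assumed stand-in Hamiltonian in the truncated basis."""
    h = 0.5 * eps**2 * p2_matrix(l, n_max) + r2_matrix(l, n_max) / (2 * eps**2)
    return np.linalg.eigvalsh(h)
```

At `eps = 1` the matrix is diagonal and the levels are exactly $N + 3/2$; for other values of `eps` the truncated basis gives upper bounds that approach the same values as `n_max` grows, which is the sense in which $\varepsilon$ acts as a variational parameter.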

In the following subsections we shall discuss simple particular cases of this matrix, taking the lowest value of $j$ allowed, i.e. $j = 0$ if $n$ is even or $j = \tfrac12$ if $n$ is odd.

In the title of the examples given below we have both the partition $\{h\}$, irrep of U(4), and $(n_1 n_2)$, irrep of O(5), related by (21), as well as the values of $j$ mentioned in the previous sentence.


a) The case $\{h\} = \{1\}$, $(n_1 n_2) = (\tfrac12\tfrac12)$, $j = 1/2$


This corresponds to the ordinary Dirac equation in an oscillator potential, and from the inequalities (22) to (24) we see that the O(5) part (25) takes the form

$$\left|\begin{array}{c} \tfrac12\,\tfrac12\\ \tfrac12\, m_2\\ \tfrac12\\ \sigma\end{array}\right\rangle \qquad (39)$$

As $j = \tfrac12$, $l$ could only be 0 or 1, and this is given automatically if $N$ is even or odd.

Thus the kets (31) could be written in the short hand notation

$$|N\,m_2\rangle,\quad m_2 = \pm\tfrac12,\quad N = 0, 1, 2, \dots \qquad (40)$$

Ordering the kets by increasing values of $N$ and, instead of $m_2 = \pm\tfrac12$, just denoting it as $m_2 = \pm$, the set of states (where we suppress the ket notation) can be indicated by

$$0+,\ 0-,\ 1+,\ 1-,\ \dots,\ 14+,\ 14-,\ 15+ \qquad (41)$$

where we end with $15+$ to get a better fit to the positive energy levels. The energy levels (both positive and negative) can be plotted as a function of the variational parameter $\varepsilon$, for different values of $a$, with its small/large values corresponding to the relativistic/non-relativistic domains respectively. In particular, for $a = 10$ we have the non-relativistic limit, and the positive energies are in fact close to the value $E = (N + \tfrac32)$, at least for the levels up to $N = 10$.





b) $\{h\} = \{2\}$, $(n_1, n_2) = (1,1)$; $j = 0$


The present case concerns $n = 2$, in which the spin $s$ can take the values 0 and 1. Since we consider here the lowest possible value of total spin, $j = 0$, $l$ can take only two values, viz. 0 and 1. Thus even values of $N$ occur with $l = 0$, $s = 0$, while odd $N$ corresponds to the combination $l = s = 1$. In the chain of groups $O(5)\supset O(4)\supset O(3)\supset O(2)$, the symmetric partition $\{2\}$ can have the following ket

$$\left|\begin{array}{c} 1\,1\\ 1\, m_2\\ s\\ \sigma\end{array}\right\rangle \quad\text{with}\quad m_2 = \pm1, 0 \ \text{ and } \ s = 0, 1. \qquad (42)$$

$s = 1$ can have all possible values of $m_2$, viz. $\pm1$ or 0, while $s = 0$ can correspond only to $m_2 = 0$. Since all quantum numbers except $s$ and $m_2$ are fixed in the above ket (with $\sigma$ not occurring explicitly in the matrix) we can denote the states in short hand notation as

$$|N\,s\,m_2\rangle \qquad (43)$$

which according to the above arguments would give rise to the following chain of states in the matrix:

$$0\,0\,0,\quad 1\,1\,1,\quad 1\,1\,0,\quad 1\,1\,{-1},\quad 2\,0\,0,\quad 3\,1\,1,\quad 3\,1\,0,\quad 3\,1\,{-1},\ \dots \qquad (44)$$

We restrict the Hamiltonian matrix in this chain of states to $14\,0\,0$, whose diagonalized eigenvalues ($E$) may again be plotted against the variational parameter $\varepsilon$ for different values of $a$. The general behavior of the curves is governed by Eq. (36). When $\varepsilon\to0$ the kinetic energy as well as the rest-mass terms become negligible compared to the potential energy, which goes to $+\infty$, thus $E\to\infty$.

But as $\varepsilon\to\infty$, the potential energy term goes to zero and the equation reduces to that of a free particle, whose energy spectrum is given by [26]


$$E = -\frac{2r}{n}\,a^2;\qquad r = 0, 1, 2, \dots, n \qquad (45)$$


In particular, for $n = 2$ there appear a set of curves corresponding to positive energies forming bound states, a negative continuum, and a set which converges to a constant value as $\varepsilon\to\infty$. The chain of states (44) gives rise to positive bound states occurring for odd values of $N$; thus the ground state does not exist for $j = 0$. Further, a gap of $a^2$ (or $mc^2$) exists between the three sets of levels, starting from where bound states are formed. Small values of $a$ correspond to a fully relativistic region (e.g., $a = 1$), wherein the bound states are formed at higher values of $\varepsilon$ and no longer follow the non-relativistic pattern.

c) $\{h\} = \{11\}$, $(n_1, n_2) = (1,0)$; $j = 0$

The antisymmetric partition $\{11\}$ in the canonical chain of orthogonal groups corresponds to the following ket

$$\left|\begin{array}{c} 1\,0\\ m_1\, 0\\ s\\ \sigma\end{array}\right\rangle \quad\text{with}\quad m_1 = 0, 1 \ \text{ and } \ s = 0, 1, \qquad (46)$$

which can be written for convenience as

$$|N\,s\,m_1\rangle \qquad (47)$$

In the present case the chain of states can be written as

$$0\,0\,1,\quad 0\,0\,0,\quad 1\,1\,1,\quad 2\,0\,1,\quad 2\,0\,0,\ \dots \qquad (48)$$

which we restrict to $15\,1\,1$ while calculating the energy matrix. The behaviour is similar to the previous case of $\{2\}$ except that the present case gives rise to bound states with even values of $N$.

Table 1 gives the results for bound states for the non-relativistic case $a = 10$, which appear at $\varepsilon = 1.02$ for all partitions discussed. Table 2 shows the bound states for the fully relativistic region of $a = 1$. For $\{1\}$, they are formed at $\varepsilon = 2.5$, while for the other cases of $j = 0$, $n = 2$, they appear at $\varepsilon = 2.62$.






Table 1

Energy levels for the non-relativistic case with a = 10, ε = 1.02

     N    N+3/2      {1}       {11}       {2}        {3}
     0     1.5      1.4991    1.5028       -          -
     1     2.5      2.4880      -        2.4868     2.4778
     2     3.5      3.4806    3.4843       -        3.4520
     3     4.5      4.4624      -        4.4613     4.4296
     4     5.5      5.4479    5.4516       -        5.3911
     5     6.5      6.4227      -        6.4217     6.3582
     6     7.5      7.4013    7.4049       -        7.2849
     7     8.5      8.3694      -        8.3685       -
     8     9.5      9.3412    9.3447       -          -
     9    10.5     10.3027      -       10.3018       -
    10    11.5     11.2679   11.2693       -          -
    11    12.5     12.2239      -       12.2229       -
    12    13.5     13.1829   13.2333       -          -
    13    14.5     14.1314      -       14.0705       -
    14    15.5     15.0299   15.0174       -          -




d) {h} = {3}, (n_1, n_2) = (3/2, 3/2); j = 1/2 

The symmetric partition {3} in the canonical chain of groups O(5) ⊃ O(4) ⊃ O(3) ⊃ O(2) 
corresponds to the state function 

$$\left|\begin{matrix} 3/2 & 3/2 \\ 3/2 & m_2 \\ s \\ \sigma \end{matrix}\right\rangle \qquad (49)$$

where $m_2 = \pm 3/2, \pm 1/2$ for $s = 3/2$ and $m_2 = \pm 1/2$ for $s = 1/2$. Since $(s, m_2)$ are the only 
variables (with $\sigma$ not occurring in explicit form in the Hamiltonian (32)) and $(n_1, n_2)$; $m_1$ are fixed, 
we can use the shorthand notation for the above ket. Thus the matrix for the Hamiltonian 
(32) with j = 1/2 would have the following chain of states 

$$0_{1/2\,1/2},\ 0_{1/2\,-1/2},\ 2_{1/2\,1/2},\ 2_{1/2\,-1/2},\ 2_{3/2\,3/2},\ 2_{3/2\,1/2},\ 2_{3/2\,-1/2},\ 2_{3/2\,-3/2} \qquad (50)$$

Since for N = 0 we have l = 0, s = 1/2 and $m_2 = \pm 1/2$, a positive ground state, which would 
require s = 3/2, does not appear in this chain of states for j = 1/2. We truncate the matrix, keeping the 
states in the same order as in (50), so that the size of the matrix is 43 × 43. 
This particular choice of basis states is made in order to get the best fit for the physical positive 
bound states. In the present case of n = 3, positive states with s = 3/2 occur for even as well 
as odd values of N, as seen from (50), with no ground state occurring for N = 0. The present 
case of n = 3 gives rise to energy curves which, though starting from ∞ at ε = 0, get separated 
into four "floors" (see (45)); one extra state appearing separately from the four floors is due to 
the choice of basis and has no additional physical significance. It is only the uppermost energy 
floor which gives rise to the physical states, while the bound states in the other floors, formed out 
of mixture with the negative energies, do not contribute to the physical states. We therefore give in Tables 
1 and 2 the corresponding minima E, which occur at ε = 1.02 for a = 10 and at ε = 1.73 for a = 1. Note 
that for the non-relativistic limit of a = 10, the energies seem to follow the rule E = N + 3/2. The 
behaviour of all the floors is yet to be understood in the relativistic case. 
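
The rule E = N + 3/2 quoted for the non-relativistic limit can be checked directly against the tabulated values. The following is a minimal sketch, using only the {1} column of Table 1 (a = 10, ε = 1.02); the function names are illustrative and not part of the original calculation. 

```python
# Bound-state energies for the partition {1}, copied from Table 1
# (a = 10, epsilon = 1.02), listed for N = 0, 1, ..., 14.
table1_partition_1 = [
    1.4991, 2.4880, 3.4806, 4.4624, 5.4479, 6.4227, 7.4013, 8.3694,
    9.3412, 10.3027, 11.2679, 12.2239, 13.1829, 14.1314, 15.0299,
]


def oscillator_rule(N):
    """Non-relativistic oscillator energy N + 3/2 in the units of the table."""
    return N + 1.5


def relative_deviations(energies):
    """Relative shortfall of each tabulated level from the rule N + 3/2."""
    return [(oscillator_rule(N) - E) / oscillator_rule(N)
            for N, E in enumerate(energies)]


deviations = relative_deviations(table1_partition_1)
for N, d in enumerate(deviations):
    print(f"N = {N:2d}  rule = {oscillator_rule(N):5.1f}  "
          f"table = {table1_partition_1[N]:8.4f}  deviation = {100 * d:5.2f}%")
```

Every tabulated level lies slightly below N + 3/2, and the relative shortfall stays at the level of a few percent even at N = 14, which is the sense in which the levels "seem to follow" the oscillator rule. 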





Table 2 

Energy levels for the relativistic case a = 1 

             ε = 2.5   ε = 2.62  ε = 2.62  ε = 1.73
 N   N+3/2   {1}       {11}      {2}       {3}
 0    1.5    1.5025    1.5315    -         -
 1    2.5    2.0126    -         2.0291    2.5654
 2    3.5    2.7806    2.8123    -         2.6308
 3    4.5    3.3581    -         3.4386    3.5688
 4    5.5    4.0391    4.1364    -         3.6134
 5    6.5    4.7173    -         4.8720    4.7359
 6    7.5    5.3458    5.5244    -         4.7450
 7    8.5    6.1216    -         6.3544    -
 8    9.5    6.7085    6.9688    -         -
 9   10.5    7.5947    -         7.9121    -
10   11.5    8.1450    8.4824    -         -
11   12.5    9.1860    -         9.5993    -
12   13.5    9.7019   10.121     -         -
13   14.5   11.0286    -        11.5602    -
14   15.5   11.5087   12.1073    -         -




Conclusions 

We have derived a wave equation for a relativistic particle with arbitrary spin using the generators 
of the O(5) ⊃ O(4) ⊃ O(3) chain of groups. We did not discuss the Lorentz invariance of our 
equation, as its initial formulation is in terms of the α rather than the γ matrices. We shall use 
the latter in another publication [27], showing not only that the equations are Poincaré invariant, 
but also that they lead through our simple supermultiplet formulation to the Bhabha equation 
proposed long ago [7]. 

Acknowledgement 

The material of Sect. 4 was taken from two papers in which the author participated: 

I. Supermultiplets and relativistic problems I. The free particle with arbitrary spin in a magnetic 
field [28]; 

II. Analysis of relativistic particles of arbitrary spin through different chains of groups [29]. 

Unfortunately, certain graphs of the energy levels versus the HO quantum numbers could not 
be reproduced in this article due to technical problems. However, they may be found in ref. 
[29]. 

The author would like to express his thanks to the collaborators in the two papers [28,29] and 
in particular to Dr. Anju Sharma for the calculation of the tables as well as to Mrs. Fanny Arenas 
for the preparation of the manuscript and diskette. 


References 

[1] P.A.M. Dirac “The principles of quantum mechanics” Fourth Edition (Oxford at the Clarendon 
Press 1959) pp.252-275. 

[2] P.A.M. Dirac, Proc. Roy. Soc. A 155, 447 (1936). 

[3] M. Fierz and W. Pauli, Proc. Roy. Soc. A 173, 211 (1939). 

[4] V. Bargmann and E. P. Wigner, Proc. Nat. Acad. Sci. (USA) 34, 211 (1948). 





[5] N. Kemmer, Proc. Roy. Soc. A, 73 (1939). 

[6] P.M. Mathews and B. Vijayalakshmi J. Math. Phys. 25, 1080 (1984). 

[7] H. J. Bhabha, Rev. Mod. Phys. 21, 451 (1949). 

[8] R. A. Krajcik and M. Martin Nieto, Am. J. Phys. 45, 818 (1977). 

[9] S. Weinberg, Phys. Rev. 133, B1318 (1964). 

[10] W. I. Fushchich, A. G. Nikitin, W. M. Susloparow, Il Nuovo Cimento 87 A, 415 (1985). 

[11] A. O. Barut, S. Komy, Fortsch. Phys. 33, 6 (1985); A. O. Barut and G. L. Strobel, Few Body 
Systems 1, 167 (1986). 

[12] M. Moshinsky, G. Loyola, C. Villegas, J. Math. Phys. 32, 373 (1991). 

[13] E. P. Wigner, Phys. Rev. 51, 106 (1937). 

[14] M. Moshinsky and Yu. F. Smirnov J. Phys. A: Math. Gen. 29, 6027 (1996). 

[15] M. Moshinsky and J. G. Nagel, Phys. Lett. 5, 173 (1963). 

[16] M. Moshinsky, J. Math. Phys. 7, 691 (1966) 

[17] M. Moshinsky, Group Theory and the many body problem (Gordon and Breach, New York, 
1968) pp. 36. 

[18] M. Moshinsky, loc. cit. p. 14 

[19] I. M. Gelfand and M. L. Zetlin, Dok. Akad. Nauk. USSR 71, 147 (1950) (In Russian). 

[20] S. C. Pang and K. T. Hecht, J. Math. Phys. 8, 1233 (1967). 

[21] M. E. Rose, Elementary Theory of Angular Momentum (John Wiley and Sons, New York 
1957) pp. 85-88, 115-119. 

[22] G. F. Filippov, V. I. Ovcharenko, Yu. F. Smirnov, Microscopic theory of collective excitations 
of atomic nuclei. Kiev, Nauka Dumka 1981 (In Russian) pp. 252-254 

[23] A. G. Nikitin and V. V. Tretynik, J. Phys. A: Math. Gen. 28, 1655 (1995) 

[24] M. Moshinsky and Yu. F. Smirnov, The Harmonic Oscillator in Modern Physics (Harwood 
Academic Press, The Netherlands 1996) Eq. (10.35) in p. 35; Eq. (3.2) in p. 3. 

[25] M. Moshinsky and C. Quesne, Ann. Phys. (N.Y.) 148, 462 (1983). 

[26] M. Moshinsky, Rev. Mex. Fis. 43, 511 (1997) 

[27] M. Moshinsky, A. G. Nikitin, A. Sharma, Yu. F. Smirnov, J. Phys. A: Math. Gen. 31, 6045 
(1998). 

[28] M. Moshinsky and Yu. F. Smirnov, J. Phys. A: Math. Gen. 29, 6027 (1996). 

[29] M. Moshinsky, A. G. Nikitin, A. Sharma and Yu. F. Smirnov, Rev. Mex. Fis. 44, Supl. 2, 1 
(1998). 



Conclusion 

33. Modern Perspectives On Foundations Of Quantum Mechanics by D.Home 




33. Modern Perspectives On Foundations Of 
Quantum Mechanics 


Dipankar Home* 

Dept. of Physics, Bose Institute, Calcutta 700009, India 


1 Introduction 

The foundational questions of quantum mechanics are usually dismissed on the ground that they 
stem primarily from subjective predilections, aesthetic considerations or from “classical” prejudices. 
However, what we are increasingly learning from studies in recent years is that such conceptual 
issues are not restricted to the realm of abstract philosophy alone but have hard-core physical 
relevance. A number of key foundational problems have become more precisely formulated and are 
either already amenable to experimental studies or promise to become so in the near future, as ingenious 
new ideas are explored alongside rapid advances in the relevant technology. Those who 
deplore the rift between science and philosophy which began to develop in the eighteenth century 
may take comfort in the blending of these two exhibited in the current research scenario related to 
the foundations of quantum mechanics. As Shimony [1] suggests, they will find here a vindication 
of the old sense of “Natural Philosophy”. 

Here it may be useful to recall the background of the genesis of the general theory of relativity. 
The triggering element was Einstein’s realisation that the equivalence principle implied that one 
could not construct a theory of gravitation compatible with the principle of relativity by restricting 
the space-time transformations to those that belong to the Lorentz group. This was then a mere 
conceptual lacuna in the fabric of the existing Newtonian theory, which was otherwise mathematically 
coherent and compatible with all the relevant empirical data known at the time. However, the 
need to remove this flaw at the conceptual level led (at least in hindsight) to the birth of general 
relativity which in turn predicted new testable results not envisaged by the Newtonian theory. 
What this story tells us is that a conceptual dilemma can become creative if it is formulated in 
a sufficiently precise manner, both mathematically and conceptually. It is this motivation that 
underpins the present phase of investigations on the foundations of quantum mechanics. 

The purpose of this article is to provide some flavour of the significant new developments in 
this area. Since the scope of this article is rather limited, we will restrict ourselves to the basic 
ideas and pertinent conceptual aspects, leaving out the mathematical and experimental details for 
which appropriate references will be given. The central issues at the core of the foundations of 
quantum mechanics are the quantum measurement problem and quantum nonlocality which will 
be discussed in Sections 2 and 3 respectively. 

2 The Quantum Measurement Problem 

The question of whether there is a need to go beyond the standard interpretation, so that alternative 
approaches need to be pursued, may seem to depend largely on subjective predilections. 
This is, however, not true. There are compelling physical reasons for suspecting that the standard 
framework of quantum mechanics is fundamentally inadequate, though its empirical success to 
date is unquestionably impressive. An outstanding puzzle which underscores a subtle inner 
inconsistency within the standard framework is the quantum measurement paradox. Its satisfactory 
resolution requires either a realist interpretation of quantum mechanics or an appropriate 
modification of the standard formalism or both. 

* Email: ciliom@boseinst.eraet.iii. 

The central problem is that the very occurrence of a definite individual outcome (undeniably 
a fact of experience) cannot be ensured in an entirely consistent way within the standard frame- 
work of quantum mechanics. Accepting the statistical feature of quantum mechanics (that the 
results of measurement can only be predicted statistically) as inescapable, a fundamental signif- 
icance is ascribed to probability distributions which are reproducible and quantum mechanically 
predicted. Then it becomes imperative to guarantee the objective reality of individual outcomes 
whose collection enables us to test the computable probabilities. 

In order to explain the essence of the quantum measurement problem, we first consider the 
quantum mechanical treatment of a typical measurement process. We turn to a particular example 
of the formation of an α-particle track in a set of photographic plates. We briefly indicate the main 
ingredients, following the presentation given by Bell [2]. We assume a highly simplified model of 
a photographic plate made up of monoatomic layers of atoms (whose thermal excitations are 
ignored), each with only one possible excited state. We neglect the possibility of elastic scattering 
(i.e., the possibility of scattering without excitation) of the incident α-particle from these atoms. 

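Let the observed α-particle originate with momentum $k_0$ from a source at position $\mathbf{r}_0$, with its 
initial state represented by: 

$$\psi_0(\mathbf{r}) = \frac{\exp(i k_0 |\mathbf{r} - \mathbf{r}_0|)}{|\mathbf{r} - \mathbf{r}_0|} \qquad (2.1)$$

If $\phi_0$ is the ground state of the stack of photographic plates, then the initial combined state is 
simply $\Psi_0 = \psi_0 \phi_0$. We enumerate the atoms of the stack by n (= 1, 2, 3, ...). Due to interaction with 
the incident α-particle, these atoms are excited and subsequently ionized. Let $\phi(n_1, n_2, n_3, \ldots)$ denote 
a state of the stack where atoms $n_1, n_2, n_3, \ldots$ are excited. After the α-particle-stack interaction, using 
the usual multiple-scattering approximation to describe the scattered waves, the final combined 
state can be written in the form [2]: 

$$\Psi = \sum_N \sum_{n_1, n_2, \ldots, n_N} \phi(n_1, n_2, \ldots, n_N)\, \frac{\exp(i k_N |\mathbf{r} - \mathbf{r}_N|)}{|\mathbf{r} - \mathbf{r}_N|}\, f_N(\theta_N) \times \frac{\exp(i k_{N-1} |\mathbf{r}_N - \mathbf{r}_{N-1}|)}{|\mathbf{r}_N - \mathbf{r}_{N-1}|}\, f_{N-1}(\theta_{N-1}) \times \cdots \times \frac{\exp(i k_0 |\mathbf{r}_1 - \mathbf{r}_0|)}{|\mathbf{r}_1 - \mathbf{r}_0|} \qquad (2.2)$$

which is a sum over all possible sequences of excitations of N atoms, with $\mathbf{r}_1$ denoting the 
position of atom $n_1$, $\mathbf{r}_2$ of atom $n_2$, and so on; $\theta_n$ is the angle between $\mathbf{r}_n - \mathbf{r}_{n-1}$ and $\mathbf{r}_{n+1} - \mathbf{r}_n$ 
($\mathbf{r} - \mathbf{r}_N$ for n = N); $f_n(\theta)$ is the inelastic scattering amplitude for an α-particle of momentum 
$k_{n-1}$ incident on a single atom; and $k_n = (k_{n-1}^2 - \varepsilon)^{1/2}$, where $\varepsilon$ is a measure of the atomic 
excitation energy. The amplitudes $f_n(\theta)$ can be calculated from the atomic wave functions 
by using, say, the Born approximation. The form of $f_n(\theta)$ determines that the atoms ionised by 
excitation lie approximately on a straight line pointing toward the source of the α-particle. Thus the 
correlation between atomic excitations described by a wave function of the nonfactorisable form 
(2.2) carries information about the passage of a triggering α-particle. 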

A generic feature of all the examples of measurement analysed quantum mechanically, such as 
the one outlined above, is the following. If a system is initially in a state $\psi = a\psi_1 + b\psi_2$, which 
is a superposition of two states $\psi_1$ and $\psi_2$ that are eigenstates of a dynamical variable which is 
measured, a general characteristic of its interaction with a measuring device is that it results in a 
final state of the form 

$$\Psi = a\,\psi_1 \Phi_1 + b\,\psi_2 \Phi_2 \qquad (2.3)$$

where $\Phi_1$ and $\Phi_2$ are mutually orthogonal and macroscopically distinguishable states of the 
device. It is an ineluctable feature of the linear unitary quantum mechanical treatment of any 
measurement process that the final state of system coupled to measuring apparatus has the entangled 
(nonfactorisable) form given by Eq. (2.3). 
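
As a minimal sketch of what "nonfactorisable" means for Eq. (2.3), consider a toy model in which both the system and the pointer are two-state objects. The model, the function names and the determinant test are illustrative assumptions, not part of the original treatment: a state with coefficient matrix C[i][j] in the product basis is a product state exactly when C has rank one, which for a 2 × 2 matrix means a vanishing determinant. 

```python
import math


def premeasure(a, b):
    """Coefficient matrix C[i][j] of the state a*psi_1*Phi_1 + b*psi_2*Phi_2.

    Row index i labels the system states psi_1, psi_2; column index j labels
    the pointer states Phi_1, Phi_2, which are assumed orthonormal.
    """
    return [[a, 0.0],
            [0.0, b]]


def is_factorisable(C, tol=1e-12):
    """A 2x2 coefficient matrix has the product form u_i * v_j iff det C = 0."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return abs(det) < tol


superposition = premeasure(1 / math.sqrt(2), 1 / math.sqrt(2))
eigenstate = premeasure(1.0, 0.0)

print(is_factorisable(superposition))  # entangled: no separate pointer state
print(is_factorisable(eigenstate))     # product state psi_1 * Phi_1
```

Only when the system starts in an eigenstate of the measured variable (a = 0 or b = 0) does the final state factorise; for any genuine superposition no separate state can be assigned to the pointer. 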

The origin of the much debated measurement problem [3-9] lies in the meaning of a pure state 
wave function in quantum mechanics, which gives rise to an inherent incompatibility between a wave 
function of the form (2.3) and the occurrence of a definite measurement result. A pure state in 
quantum mechanics means that each member (in this case, a system coupled to an apparatus) of 
an ensemble described by a pure state $\Psi$ as given by Eq. (2.3) has the same wave function $\Psi$. Thus 
a pure state in quantum mechanics corresponds to a homogeneous ensemble whose members are 
indistinguishable. On the other hand, all measurements culminate in a final ensemble of systems 
coupled to apparatus which is essentially heterogeneous. A heterogeneous ensemble is, however, 
represented by a mixed state in quantum mechanics. Since within standard quantum mechanics 
a pure state cannot evolve into a mixed state under any unitary time evolution (see, for instance, 
[5] pp. 87-88), how to coherently accommodate within quantum mechanics the occurrence of 
distinguishable outcomes is an intriguing “paradox”. Not surprisingly, Weinberg [10] has 
called this “the most important puzzle in the interpretation of quantum mechanics”. 

Though reproducible statistical frequencies of events are quantum mechanically computable, 
the underpinning concept of an individual definite event is difficult to accommodate within the 
framework of standard quantum mechanics. It is thus a logical non sequitur to speak of probabilities 
of various outcomes when the very occurrence of an individual outcome is not ensured. Recall that 
because of the entangled nature of the system-apparatus combined state, it is not permitted in 
quantum mechanics to assign a separate definite state to any individual apparatus; nevertheless, 
in any given experimental run, we observe what happens to an individual apparatus coupled with 
a system subjected to measurement. 

The quantum measurement riddle is sometimes dismissed as fundamentally the same as the 
situation in classical statistical mechanics. This point is argued for example by referring to “a 
very close analogy between this quantum mechanical problem and the purely classical problem 
of the tossing of a coin”. Since unknown and uncontrollable elements are inherent in the exact 
specification of relevant initial conditions, classically the probability for heads is taken as 50%. Until 
we inspect the outcome after tossing a coin, the classical probability is 50% for either outcome. 
Now if, say, heads is observed, the probability for heads becomes 100%; there is however nothing 
fundamentally problematic about this feature. As Bell [11] remarks about such classical examples 

at least one can envisage an accurate theory to which the restricted account is an approximation. This 
is not possible in quantum mechanics ....It could also be said that even in classical mechanics the human 
observer is implicit for what is interesting if not experienced? But even a human observer is no trouble 
(in principle) in classical theory - he can be included in the system (in a schematic way) by postulating a 
“psycho-physical parallelism” - i.e., supposing his experience to be correlated with some functions of the 
coordinates. This is not possible in quantum mechanics, where some kind of observer is not only essential, 
but essentially outside. (italics ours) 

This point brings us to one of the crucial components of the measurement problem : An outcome 
has an objective reality (recorded in terms of changes in the particle properties of the macroscopic 
measuring device) in the sense of being both intersubjective and out there that can be inspected 
at will at any instant without perturbing the outcome. Unlike classical mechanics, there is no 
counterpart of this feature within the theoretical framework of standard quantum mechanics. In 
other words at the end of a measurement process described quantum mechanically, it is not possible 
to attribute to a measuring macroapparatus a quantum mechanical representation of its state that 
embraces a description of all the properties we require it to have. We also mention that the point 
of the measurement paradox and all the preceding arguments remain entirely unaffected even if 
the initial state of the macroscopic apparatus is taken to be a mixture of different states rather 
than a pure state (see Wigner [12] and Leggett [13] for an explicit discussion). For the simplicity 
of notation we confine ourselves to assigning a pure initial state to the macroapparatus. 

The acuteness of this paradox makes us suspect that quantum theory is intrinsically inexact 
or at least ambiguous at a fundamental level. However, various versions of what may be called 
the standard approach have tried to alleviate this doubt or minimize this paradox with different 
arguments. 

2.1 Variants of the Standard Solution and Their Inadequacies 

2.1.1 Bohr-Heisenberg Viewpoints 

Bohr recognized that if the measuring device were described quantum mechanically, its interaction 
with the measured system would merely extend the chain of inference without leading to a definite 
result. He sought to avoid this problem of infinite regress by decreeing that the interaction between 
an object and an apparatus is a single unanalyzable whole and the apparatus must be described 
in classical terms. 

An immediate criticism of the preceding point of view is that Bohr makes the very concept of 
classical measuring instruments “which serve to specify the conditions under which the phenomena 
appear” [14] depend on an ill-defined limiting procedure. In expressing what Bohr calls “a simple 
logical demand”, he takes for granted that the composite system in a measurement process can be 
envisaged as composed of two distinct parts: the measured system and the measuring apparatus. 
But there is no precise criterion definable within standard quantum mechanics that delineates the 
borderline between the two. Do we merely find it convenient to consider some parts of the global 
system as parts of the instrument, or is it determined a priori in a more physical way? Is it at 
all possible within the framework of quantum mechanics to assign a classical description to the 
measuring apparatus? Such questions remain unanswered in Bohr’s writings. 

We now turn to Heisenberg’s advocacy of the Bohrian position. Heisenberg [15] admits that 
“it is not possible to decide, other than arbitrarily, what objects are to be considered as part of 
the observed system and what as part of the observer’s apparatus,” but he stresses that once the 
macroscopic level is reached, since a cut is necessary to avoid the problem of infinite regress of not 
ensuring a definite outcome, it is of no practical importance where we put the split. According to 
Bell [16], the Heisenberg dictum was to “put sufficiently much into the quantum system so that 
the inclusion of more would not significantly alter practical predictions” . Though this recipe is 
useful, it is ambiguous in principle. There is no fundamental reason why the physics involved in 
measurements should differ from how other physical interactions are described. Hence the very 
legitimacy of such a conceptual discontinuity, not so much whether the precise location of its 
position matters, is the crux of the issue. 

In his later years Heisenberg’s position shifted to a subjective approach to the measurement 
problem. As a corollary to the Bohrian thesis that a measuring device must necessarily be 
described in terms of classical concepts, Heisenberg inferred that if a measurement interaction is 
described quantum mechanically it contains “new elements of uncertainty” because “connection 
with the external world is one of the necessary conditions for the measuring apparatus to perform 
its function” [17]. In the words of Heisenberg [18], “It is the discontinuous change of our knowledge 
in the instant of registration that has its image in the discontinuous change of the wave function”. 

We highlight certain criticisms. First, the very idea of a change in knowledge obtained through 
a measurement (not whether such a change corresponds to an actual physical process described by 
the theory) assumes the occurrence of a definite outcome (otherwise how do we extract relevant 
measurement information?). However, a definite outcome is in itself incompatible with the 
system-apparatus joint pure state at the end of a measurement process. It is thus difficult to maintain the 
logical tenability of the preceding view unless this central problem is considered. 

Heisenberg [17] realizes this and notes that to link the mathematical representation of quantum 
theory “to the question of how real or possible experiments will result” in a definite outcome, we 
must describe the system apparatus combination in a mixed state. To ensure this, the idea that 
a macroapparatus is inevitably coupled rather strongly with its environment was introduced [17]. 
Heisenberg claims that “the compound system of system and measuring apparatus is therefore 
now described mathematically by a mixture”. The underlying justification rests on the contention 
that since the interaction between an apparatus and its environment (having a large number 
of degrees of freedom) contains uncontrollable terms that differ from sample to sample, due to 
incomplete knowledge about the environment, the final state of system and apparatus can be 
effectively represented by a mixture of states. 

Nevertheless, the preceding line of reasoning ignores the key fact that whatever the 
uncontrollable uncertainties in the total Hamiltonian, the entire combination of system and apparatus 
coupled to environment is bound to be finally left in a nonfactorisable pure state (corresponding to 
a homogeneous ensemble of systems described by the same composite wave function) under unitary 
quantum mechanical evolution. See Section 2.1.3 of this article for a proof that irrespective of the 
details of any interaction, a pure state can evolve unitarily only into a pure state. 

Hence the crux of the quantum measurement problem has remained unaddressed. One may 
of course contend that the reduced density matrix of system and apparatus (obtained by tracing 
over the environment states) is diagonal. With this assumption Heisenberg [17] suggests using “a 
statistical mixture in the mathematical representation of the larger system composed of the system 
and the measuring apparatus.” This is, however, irrelevant to the question at issue because the 
total density matrix of system and apparatus and environment is actually nondiagonal. It does 
not matter how small the nondiagonal matrix elements are, so long as they are not zero. 
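
The distinction between the reduced and the total density matrix can be made concrete in a toy model. The sketch below assumes a two-dimensional system-plus-apparatus space spanned by the two outcome states and a two-dimensional environment with orthogonal states; the amplitudes 0.6 and 0.8 and all function names are illustrative choices, not taken from the original text. 

```python
def density_matrix(vec):
    """rho = |Psi><Psi| for a state vector given as a list of amplitudes."""
    return [[x * y.conjugate() for y in vec] for x in vec]


def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, below 1 for a mixture."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real


def trace_out_environment(rho, dim_sa, dim_env):
    """Reduced matrix of system + apparatus; basis index = sa * dim_env + env."""
    return [[sum(rho[i * dim_env + e][j * dim_env + e] for e in range(dim_env))
             for j in range(dim_sa)]
            for i in range(dim_sa)]


a, b = complex(0.6), complex(0.8)
# Total state a |outcome 1>|E_1> + b |outcome 2>|E_2>, with <E_1|E_2> = 0 assumed.
psi_total = [a, 0j, 0j, b]

rho_total = density_matrix(psi_total)
rho_reduced = trace_out_environment(rho_total, dim_sa=2, dim_env=2)

print("reduced off-diagonal element:", abs(rho_reduced[0][1]))
print("total off-diagonal element:  ", abs(rho_total[0][3]))
print("purity of reduced matrix:    ", purity(rho_reduced))
print("purity of total matrix:      ", purity(rho_total))
```

In this sketch the reduced matrix is diagonal and has purity below one, so it looks like a mixture, while the total matrix keeps a nonvanishing off-diagonal element and purity one: the global state is still pure. 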

2.1.2 Decoherence Approach 

A popular line of argument (called the decoherence approach) accepts the wave function description 
(2.3) of the combined state of system and apparatus as a correct and complete representation of 
the final state but tries to reconcile the occurrence of a definite outcome with the formalism 
by contending that it is practically impossible (i.e., possible in principle only through extremely 
complex measurements) to distinguish the pure state denoted by Eq. (2.3) from a statistical mixture 
of states $\psi_1\Phi_1$ and $\psi_2\Phi_2$. 

The rationale for this argument runs as follows (see, for example, Zurek [19] and Gottfried 
[20]); a particularly incisive exposition with an appropriate critique is given by Leggett [3,21]. Due 
to orthogonality of the apparatus states $\Phi_1$ and $\Phi_2$, it is clearly not possible to distinguish the pure 
state (2.3) from the corresponding mixture by measuring the expectation value of a dynamical 
variable pertaining to the system alone: we must measure an appropriate dynamical variable 
(say, $\Omega$) of the apparatus as well. Then a necessary condition for discriminating between the pure 
state (2.3) and the corresponding mixture is that the off-diagonal matrix elements $\Omega_{12}$ and $\Omega_{21}$ be 
nonvanishing, where: 

$$\Omega_{12} = \langle \Phi_1 | \Omega | \Phi_2 \rangle, \qquad \Omega_{21} = \langle \Phi_2 | \Omega | \Phi_1 \rangle \qquad (2.4)$$

Bearing in mind that $\Phi_1$, $\Phi_2$ are states of a macroobject comprising a large number of particles, 
it follows that these matrix elements are nonzero only if the operator $\Omega$ simultaneously changes 
the state of a very large number of particles. Typical operators, such as the total momentum or 
the angular momentum, are sums of single particle operators (not, say, products), and therefore 
these change the state of only one particle at a time. Thus there is no simple dynamic property of 
a macroapparatus whose measurement enables us to distinguish the pure state in question from a 
mixture. 
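
The argument about sums of single-particle operators can be illustrated with a toy pointer made of N two-state particles. This is a minimal sketch under stated assumptions: the two pointer states are taken to be "all spins up" and "all spins down", basis states are encoded as bit strings, and the single-particle operator is a spin flip; none of these choices comes from the original text. 

```python
def element_for_sum_of_flips(N):
    """<Phi_1| sum_k flip_k |Phi_2> with Phi_1 = all up (bits 0), Phi_2 = all down (bits 1).

    Each term flip_k acts on one particle only, so it maps the all-down
    string to a string with exactly one bit cleared.
    """
    all_up, all_down = 0, (1 << N) - 1
    total = 0
    for k in range(N):
        flipped = all_down ^ (1 << k)   # single-particle operator: flip spin k
        total += 1 if flipped == all_up else 0
    return total


def element_for_product_of_flips(N):
    """Same matrix element for the N-body product operator flip_1 ... flip_N."""
    state = (1 << N) - 1
    for k in range(N):
        state ^= (1 << k)               # every spin is flipped at once
    return 1 if state == 0 else 0


for N in (1, 2, 10):
    print(N, element_for_sum_of_flips(N), element_for_product_of_flips(N))
```

For a single particle the sum connects the two states, but for any N of two or more the matrix element of the sum vanishes; only an operator acting on all N particles at once, such as the product, gives a nonzero element of the type in Eq. (2.4). 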

Moreover, the inevitable strong coupling of a macrosystem with its environment gives rise 
to further difficulties. The apparatus states $\Phi_1$, $\Phi_2$ become entangled with environment states. 
Consequently, because of the orthogonality of environment states, no measurement of a property 
of the system and apparatus alone produces a result other than that expected from a classical 
mixture of states $\psi_1\Phi_1$ and $\psi_2\Phi_2$. The only way of discriminating between such a mixture and 
a pure state (involving environment states) is to measure correlations between a macroapparatus 
and its environment. This is, however, extremely difficult in practice because of the dissipative 
nature of the interaction with the environment, which is fatal to coherence embodied in a pure 
state. For a detailed articulation of this point, see Leggett [22]. 

Since it is difficult to envisage a realistic experiment to discern the effects of phase coherence 
embodied in the pure state description at the end of a measurement process, it is considered nat- 
ural to assume the pure state behaves as if it were a mixed state representing a heterogeneous 
ensemble of systems coupled with apparatus, where different outcomes correspond to distinct ap- 
paratus states. While this point of view has motivated a great deal of elegant theoretical work 
underscoring the difficulty in observing the effects of interference among macroscopically 
distinguishable outcomes (for an overview see for example Omnès [23] and van Kampen [24]), so far as 
solving the basic conceptual problem is concerned, these efforts are unsuccessful. 

In spite of the fact that decoherence effects are important in accounting for the usual absence 
of quantum interference effects in the macrodomain, they are irrelevant as far as the measurement 
paradox is concerned. This is essentially because the interpretative shift from the notion “a pure 
state of entangled system + apparatus behaves as if it were a mixed state” to “a pure state is 
actually a mixed state” entails a major logical non sequitur [25]. The fundamental distinction 
between a pure state and a mixed state inherent in the formalism of quantum mechanics cannot 
be ignored by slipping in such an ad hoc interpretative drift. According to Bell [26], we do not 
have here a resolution of the fundamental problem, “but a change of the theory at a strategically 
well chosen point.” The crucial point, as Bell says, is how come an “and” is converted into an “or”? 

It is difficult to accept that the inability to observe interference between macroscopically distinct 
outcomes (which in principle exist in any coherent superposition) by itself authorizes an individual 
alternative. That is, how a pure state is interpreted (corresponding to a homogeneous ensemble 
of indistinguishable members) cannot be abruptly changed when we reach the macrolevel merely 
because the relevant evidence in terms of interference is inaccessible. The punch line is as Leggett 
[25] puts it: 

can the meaning of the formalism change radically, just because the evidence has disappeared? 

For further critiques of the decoherence approach, see for example Home and Whitaker [27], 
Bell [26], Leggett [3,21]. 

The logical situation remains essentially unchanged in what is known as the consistent histories 
interpretation which also relies on the decoherence approach. The only difference here is that we 
discuss state sequences (histories) rather than a state at a single instant. A key assumption is that 
measured quantities are correlated with decohering histories and only decohering histories can be 
assigned probabilities [28-32]. This really means that the absence of interference between different 
alternatives is taken as a justification for a particular alternative. Proponents of this scheme argue 
that the correlation between a microphysical quantity and an appropriate macroclassical variable, 
together with the specification of mathematical conditions ensuring that the interference between 
different histories is unobservably small for all times, completely defines a measurement situation. 
It is thus obvious that in such a program, the fundamental question as to how a particular history 
is actually realised remains unaddressed. 

2.1.3 Dirac-von Neumann Projection Postulate 

In contrast to that of Bohr or Heisenberg, the Dirac-von Neumann approach seeks to negotiate the 
measurement riddle by invoking an additional axiom, popularly known as the projection postulate. 
The necessity of adding such a postulate to the quantum formalism was first addressed by Dirac [33] 
at the 1927 Solvay Congress. Later he introduced a more explicit statement in his famous book [34], 
viz., “a measurement always causes the system to jump into an eigenstate of the dynamical variable 
that is being measured, the eigenvalue this eigenstate belongs to being equal to the result of the 
measurement.” Note that this form of projection postulate is fundamentally inadequate inasmuch 
as it eschews the entangled nature of system and apparatus by restricting attention to an abrupt 
change in the wave function of the measured system alone. If a measurement process is subjected 
to the Schroedinger evolution, we cannot speak in terms of an independent separate quantum 
mechanical state assigned to a measured system. This deficiency is remedied in von Neumann’s 
formulation [35] by postulating the wave function collapse in terms of a dynamic transition from 
a pure state (of system and apparatus) into a mixed state. 

It is straightforward to show that irrespective of the specifics of any interaction, a pure state 
evolving according to the Schroedinger equation cannot evolve into a mixed state. A convenient 
mathematical representation of the distinction between a pure state and a mixed state is given 
in terms of density matrices. A pure state is characterised by $\rho^2 = \rho$, whereas for a mixed state, 
$\rho^2 \neq \rho$. Recall that the Schroedinger time evolution of an initial density matrix $\rho(t_0)$ obeys the 
following rule:

$$\rho(t) = U(t,t_0)\,\rho(t_0)\,U^\dagger(t,t_0) \tag{2.5}$$

where $U$ is unitary, satisfying 

$$U^\dagger(t,t_0)\,U(t,t_0) = U(t,t_0)\,U^\dagger(t,t_0) = I \tag{2.6a}$$

$$U(t_0,t_0) = I \tag{2.6b}$$

The explicit expression for $U$ is given by: 

$$U(t,t_0) = \exp[(-i/\hbar)H(t - t_0)] \tag{2.6c}$$

From Eq.(2.5) it follows that: 

$$\rho^2(t) = U(t,t_0)\,\rho(t_0)\,U^\dagger(t,t_0)\,U(t,t_0)\,\rho(t_0)\,U^\dagger(t,t_0) \tag{2.7}$$

Using Eq.(2.6a), 

$$\rho^2(t) = U(t,t_0)\,\rho^2(t_0)\,U^\dagger(t,t_0) \tag{2.8}$$

If $\rho^2(t_0) = \rho(t_0)$, then using Eq.(2.5) we obtain from Eq.(2.8) 

$$\rho^2(t) = \rho(t) \tag{2.9}$$

Thus a pure state can evolve only into a pure state through a linear unitary Schroedinger 
time development. Invoking the projection postulate therefore implies a measurement-induced 
evolution not governed by the Schroedinger equation. To put it more formally, the projection 
postulate means a discontinuous nonunitary change of $\rho = |\Psi\rangle\langle\Psi|$ [where $|\Psi\rangle$ denotes the 
pure state of system and apparatus generated by the Schroedinger treatment of a measurement, 
say, of the form given by Eq.(2.3)] given by 

$$\rho \rightarrow \rho' = |a|^2\,|\psi_1\rangle\langle\psi_1|\;|\phi_1\rangle\langle\phi_1| + |b|^2\,|\psi_2\rangle\langle\psi_2|\;|\phi_2\rangle\langle\phi_2| \tag{2.10}$$

where $\rho'$ denotes a mixed state of $|\psi_1\rangle|\phi_1\rangle$ and $|\psi_2\rangle|\phi_2\rangle$ with weight factors $|a|^2$ and 
$|b|^2$, respectively. Note that in this scheme, the disappearance of coherence in the postmeasurement 
density matrix p' is regarded as a consequence of the collapse postulate made at the wave function 
level. 
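
The argument of Eqs.(2.5)-(2.9) can be checked numerically in a toy setting. The following Python sketch is illustrative only: it assumes a two-dimensional space spanned by the two correlated system-apparatus product states, amplitudes a = sqrt(0.3) and b = sqrt(0.7), and an arbitrarily chosen 2x2 unitary standing in for U(t, t0); none of these values come from the text. It evaluates the largest entry of rho^2 - rho, which vanishes only for a pure state.

```python
import cmath
import math

def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(x):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

def idempotency_gap(rho):
    """Largest |(rho^2 - rho)_ij|; zero (up to rounding) only for a pure state."""
    sq = matmul(rho, rho)
    n = len(rho)
    return max(abs(sq[i][j] - rho[i][j]) for i in range(n) for j in range(n))

# Assumed illustrative amplitudes with |a|^2 + |b|^2 = 1.
a, b = math.sqrt(0.3), math.sqrt(0.7)

# Pure state rho(t0) = |Psi><Psi| with Psi = (a, b) in the toy two-dimensional space.
psi = [complex(a), complex(b)]
rho_t0 = [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

# An arbitrary 2x2 unitary (assumed parameters), standing in for U(t, t0).
alpha, beta = 0.8, 1.3
U = [[complex(math.cos(alpha)), -math.sin(alpha) * cmath.exp(1j * beta)],
     [math.sin(alpha) * cmath.exp(-1j * beta), complex(math.cos(alpha))]]

# Eq.(2.5): rho(t) = U rho(t0) U^dagger.
rho_t = matmul(matmul(U, rho_t0), dagger(U))

# Projection postulate: coherence terms are dropped, leaving weights |a|^2 and |b|^2.
rho_collapsed = [[a * a, 0.0], [0.0, b * b]]

gap_unitary = idempotency_gap(rho_t)            # pure state stays pure
gap_collapsed = idempotency_gap(rho_collapsed)  # nonzero: mixed state
```

Under these assumptions the unitarily evolved matrix stays idempotent to rounding accuracy, while the collapsed diagonal matrix does not, which is the numerical counterpart of the statement that the projection postulate cannot arise from Schroedinger evolution.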

Note that von Neumann recognised that the projection postulate entailed “a peculiar dual 
nature of the quantum mechanical procedure”. The hypothesis that at some stage during a measurement 
the Schroedinger evolution must be abruptly suspended (the putative collapse of a 
wave function), and that an entirely different physical process whose dynamics is unspecified takes over, 
has obvious fundamental difficulties. Apart from the ad hoc way this postulate is grafted onto the 
theory, we are not told at what point the collapse occurs or how long it takes. 

2.2 Nonstandard Approaches 

Inadequacies of all versions of the standard solution to the measurement problem underscore the 
need for exploring schemes beyond the so-called standard viewpoints. Broadly speaking, such 
(nonstandard) programs can be divided into two classes: approaches that accept the standard 
formalism as it stands, introduce new elements into the conceptual framework, and do not require 
the idea of wave function collapse in any form; and schemes that modify the standard formalism (in a 
way consistent with its empirically verified predictions) to provide dynamic 
descriptions of wave function collapse as an objective physical process. 

We now discuss three such major (in terms of attention received in the literature) approaches. 
The first two (the many-worlds approach and Bohm’s causal interpretation) belong to the first 
class; in the second class we deal with dynamic models for wave function collapse that incorporate 
stochastic terms into the Schroedinger equation. 

2.2.1 Many-Worlds Interpretation 

The many-worlds interpretation (henceforth abbreviated MWI) confronts the measurement problem 
by retaining not only the standard formalism but also as much as possible of the standard 
interpretational framework. There are different presentations of MWI, starting from the original 
version in the 1957 paper by Everett [36]. Here we follow mainly the exposition by Squires [37,38]. 

While the standard theory results in the superposition of different outcomes given by the 
system-apparatus combined wave function, such as Eq.(2.3), MWI seeks to eliminate the inconsistency 
between the universal applicability of quantum mechanics and the definiteness of an individual out- 
come by postulating a special relationship between the final wave function [such as Eq. (2.3)] and 
the observer’s state of awareness. Each term in the final wave function is assumed to correspond 
to a definite state of awareness registering a particular outcome. Experiences of all the different 
outcomes are thus considered to be part of the final wave function. This is often expressed graphi- 
cally by saying that as a result of any observation, the “world” branches into different worlds, and 
awareness of each outcome belongs to one world. This is the origin of the name many worlds. How- 
ever, as Squires [37,38] emphasizes, here an interpretation is envisaged incorporating perceptions 
of different outcomes, where each perception involves a relationship between a state of awareness 
and a corresponding state of the measured system entangled with the apparatus. Using Everett’s 
terminology we summarise by saying that any macroscopically discernible part of the total wave 
function in Hilbert space has a perceptible meaning only in relationship to a frame with reference 
to the observer’s mind. 

At this stage there is a caveat. The final wave function, such as Eq.(2.3) can be expanded in 
any basis. Therefore the basic tenet of MWI must tacitly assume some preferred basis in some 
suitable macroscopic limit. This is noted by several authors [39-43], although the problem is not 
addressed in Everett’s original paper [36]. 

In any of its versions, pursuing MWI involves a considerable amount of metaphysical baggage. 
Moreover, there is no strong a priori reason for going beyond physics to seek a solution to the 
quantum measurement problem. A more attractive approach (at least to the majority of the 
physicists concerned with the measurement problem) is to obtain a solution from within physics 
unless and until this is shown to be impossible. The two most widely discussed approaches in 
this direction are described in the following two sections. One of them (the Bohmian model) 
hinges on assuming that the wave function description of the state of an individual system needs 
to be “completed” by using the concept of “ontological” (i.e., premeasurement and observation- 
independent) values associated with position variables. The other one seeks to provide a dynamical 
description of the process of wave function collapse by accepting the premise that a wave function 
provides a “complete” specification of the state of an individual system. But this approach is 
based on modifying the Schroedinger equation appropriately. For mathematical details of these 
approaches and their applications see, for example, the relevant chapters in Home’s book [8]. 

2.2.2 Bohmian Approach 

Bohm’s causal interpretation reinterprets the standard formalism of quantum mechanics [44-47] 
by introducing ontological position variables (say, $x$) that, together with the wave function $\psi$, provide a complete 
specification of the state of an individual system. In this scheme the measurement problem is 
resolved by assuming that all measurements of microphysical attributes ultimately lead to obser- 
vations of the position of some macroobject serving as an apparatus; in other words, all instrument 
outputs in the end are assumed to be readings in position space. This is certainly a reasonable 
assumption, which at least covers a large class of all standard measurements. Even measurements 
of such physical variables as mass, wavelength, time of flight, which are not directly associated 
with Hermitian operators, are ultimately measurements of position; for instance, mass is inferred 
from position in a mass spectrograph, wavelength is obtained from fringe spacing in an interference 
experiment, and time of flight is inferred from position measurements at different instants. 

A definite outcome in an individual measurement is determined by relevant ontological position 
variables which have well-defined values at all instants. Interpreted in this way the intrinsic inex- 
actness of quantum theory is eliminated by ensuring correspondence between the occurrence of a 
definite result and functions of spatial coordinates introduced in the theory at a fundamental level. 
Now recalling the essence of quantum mechanical treatment of a measurement process, note that 
the final wave function of the system-apparatus combination is given by an entangled form with 
the spatial separation $\delta y_p$ between centres of apparatus wave packets $\phi_0$ corresponding to different 
outcomes. For the purpose of perfect measurement, $\delta y_p$ must attain a value significantly larger 
than the width of the apparatus wave packets $\phi_0$, so that these wave packets can be regarded as 
nonoverlapping in position space. 

Each individual outcome is characterised by observing the value of the apparatus variable y. 
However, within the standard interpretation, unless actually observed, y is simply an argument of 
a mathematical function devoid of any ontological significance. On the other hand, the outcome 
of an individual measurement recorded through the value of $y$ is something to which autonomous 
physical reality is ascribed. This is automatically ensured in the Bohmian scheme. 

Since the measurement interaction causes the configuration space wave function to split into a 
set of nonoverlapping apparatus wave packets, in any single run the apparatus particles enter one 
of these channels (the Bohmian interpretation legitimizes such terminology by using the notion 
of particle trajectories). According to this point of view, the fact that apparatus particles enter 
one of the possible channels leads to a definite measurement result. Thus, in principle, the final 
outcome is causally related to or is uniquely determined by the premeasurement system-apparatus 
wave functions and their initial positions. This is why the Bohmian scheme is often called the 
causal interpretation. 

A critical aspect of the Bohmian position may be briefly mentioned here. The separation of 
the apparatus wave packets occurs in position space and not in, say, momentum space. This 
is conceptually justified, as mentioned earlier, by assuming that all measurements eventually re- 
duce to position determinations (e.g., location of a meter needle, distribution of dots in computer 
print-out). However, it is still formally legitimate to expand the final system-apparatus-combined 
wave function in any basis. Hence for this viewpoint to be entirely consistent, we need a formal 
justification for the emergence of a preferred basis in the macroscopic limit. 

As Bell [2] remarks, it is appropriate to refer to the ontological position coordinates as “exposed 
variables” and to the wave function as a “hidden variable.” It is indeed “ironic that the traditional 
terminology is the reverse of this.” This situation is somewhat conceptually analogous to that in 
classical electrodynamics where the abstract notion of fields has a physical manifestation only by 
the action on charged particles. Similarly it is only by particle attributes recording measurement 
results that we obtain information about a wave function. 

2.2.3 Dynamical Models of Spontaneous Wave Function Collapse 

The two preceding approaches accept the standard formalism and introduce new ingredients only at 
the interpretational level. On the other hand if we address the measurement problem by tinkering 
with the formalism, it is evident from earlier discussions that the modified time evolution must be 
nonunitary and nonlinear, since a pure state is doomed to remain a pure state under any linear 
unitary evolution. As already pointed out, the Dirac-von Neumann idea of measurement-induced 
wave function collapse lacks precision because it assigns a special role to measurement interactions 
without specifying at what point of complexity an interaction establishing a correlation between 
the observables of micro and macrosystems becomes a measurement. 

Though the transition from micro- to macrosystems is gradual in the actual physical world, 
differences between the Schroedinger evolution and the Dirac-von Neumann postulated collapse 
dynamics are rather sharp. Hence it is difficult to comprehend how at some point the linear unitary 
Schroedinger evolution is suddenly suspended, allowing the collapse dynamics to take over. It is 
thus clear that a logically coherent scheme for accommodating the notion of wave function collapse 
must have a seamless mathematical description with no dichotomy between measurement and 
other interactions; this also means no arbitrary split between micro- and macrosystems. Hence 
it follows that the collapse process must be spontaneous in the sense of being present in the 
fundamental equation per se, without being induced by an external stimulus, such as the 
system-apparatus interaction. In recent years there has been a systematic development of such spontaneous 
dynamical collapse (henceforth abbreviated SDC) models. Without going into specifics 
of the different versions of SDC [48-51] we discuss their general attributes. 

Additional nonlinear terms are incorporated into the Schroedinger equation, which entail a 
modified time evolution of a system. Such new terms in the Schroedinger equation are postulated 
to satisfy the two diverging desiderata of having a practically negligible effect for all microsystems 
(a necessary requirement due to the extremely high degree of validity of all tested predictions of 
the standard quantum formalism in the microdomain) and of being able to induce an appropriately 
rapid suppression of superpositions of macroscopically distinguishable states in the macrodomain. 

Note that this latter feature is required to eliminate quantum mechanical manifestations of 
superpositions of macroscopically distinct states and also to ensure the definiteness of an individual 
outcome. Another aspect of these schemes is that the postulated nonlinearity implies a preferred 
basis of states in ordinary position space, thereby destroying the inherent equivalence between all 
unitarily related bases of states in Hilbert space. By incorporating these features various 
versions of SDC ensure in the following way that a measurement has a definite outcome. 

A measurement interaction leads to an entangled state entailing a superposition of macroscopically 
distinct states of a macroapparatus. During this stage the standard linear form of the 
Schroedinger equation matters; nonlinear terms are ineffective in the case of the time evolution 
of a microsystem. Subsequently, the SDC process becomes effective for the macroapparatus, 
resulting in the disentanglement of superposition provided the following criterion is satisfied: For 
the occurrence of a definite outcome through the SDC process, relevant macroscopically distinct 
states of an apparatus (comprising a sufficiently large number of particles) must be localized in 
position space. Such localised states are mutually separated by distances large enough compared 
to a suitable microscopic length scale, usually taken to be ~ Then any superposition of 

such states is intrinsically unstable, reducing rapidly to any one of the superposed states under the 
action of SDC induced by nonlinear terms in the evolution equation. 


2.2.4 SDC Models versus the Bohmian-Type Scheme 

The crucial difference between how these two approaches confront the measurement problem is 
that in Bohm’s scheme the ontological position identifies a definite outcome, but in SDC models 
the actual physical process of reducing a pure state wave function to a mixed state corresponds 
to actualizing a definite outcome. Apart from requiring a wave function to specify completely 
the physical state of an individual system, the SDC framework also implies that an objective 
physical reality is associated with a wave function to account for the definiteness of a measurement 
outcome. Yet the particle ontological aspect of recording a measurement outcome (in the sense 
that in a measurement a particular result is registered in the form of a change in property of some 
localized element composed of what we call particles) is not covered by an SDC model because a 
wave function alone is not sufficient for this purpose. 

We need an additional particle ontological attribute at a fundamental level, which is precisely 
what Bohm incorporated into his model. This specific difficulty cannot be addressed in an SDC 
scheme by simply invoking the formal feature that the wave function is sharply peaked in position 
space after localization, that is, by literally identifying the localized wave packet with a particle. 
Such a point of view has difficulties because of the possibility of spreading of a wave function. 

Another fundamental difficulty inherent in the SDC approach is referred to as the “problem of 
tails”. This problem is absent in Bohm’s scheme. It arises from the fact that in all versions of SDC 
models, a definite outcome emerging from the dynamical localization process is associated with a 
collapsed Gaussian wave function of a macroobject, which though sharply peaked in position space, 
does have nonzero tails extending into far away regions. Thus a definite outcome in terms of a 
definite position is ascribed to an individual system even when its wave function has a nonvanishing 
component (albeit very tiny) on the eigenmanifold associated with a different outcome. 

The above feature has been criticized by a number of authors as undermining SDC models’ 
claim to be a fundamental solution to the measurement problem. The central point of such a 
criticism is that a position being “almost” defined is not the same as a position being defined and 
an object “almost” being in one state or another is not the same as being in one state or another. 
The response from proponents of the SDC approach amounts to the following. 

A definite outcome corresponding to, say, “a pointer pointing at $x_1$” is specified to occur if 
at least a certain very large percentage (say, $\alpha$) of the probability amplitude (but not necessarily 
all of it) is concentrated in the “pointing at $x_1$” sector of the pointer’s configuration space. But 
then fixing the value of $\alpha$ is arbitrary. There is no unique way of ascertaining whether $\alpha$ should 
be 99% or 99.9%, or some other value. Moreover, what form of correspondence with physical reality in the 
macrodomain the $(1 - \alpha)$% of measurements (yielding no definite outcome at all or, in other 
words, the pointer needle having no definite position) signifies is left uncertain. No matter how 
extremely close to unity $\alpha$ is chosen, this is certainly a conceptually difficult question for SDC 
models. Nevertheless, we cannot ignore virtues of the SDC approach which is in line with Bell’s 
aspiration [52] that “one line of development towards greater physical precision would be to have 
the jump (wave function collapse) in the equations” and not just in words so that “it would come 
about as a dynamical process in dynamically defined conditions.” 

How to empirically discriminate between the Bohmian approach and SDC-type models is an 
open question. In fact, it is presently one of the most challenging areas of study to try to conceive 
ingenious ideas suggesting experimentally realisable situations where such models will give different 
predictions. 





3 Quantum Nonlocality 

The term nonlocality appears in various guises in the literature. Broadly speaking, it denotes 
a form of quantum mechanical action at a distance where a distant influence is counterintuitive 
because of the absence of a classically describable form of physical mediation. At the outset, we 
must emphasize that no discussion of nonlocality gets off the ground unless we clearly 
define the notion of spatial separation between relevant physical events involving localized entities. 
Yet this defies a precise definition if we remain confined within the standard framework of quantum 
mechanics, where wave functions are only abstract symbols for computing observable probabilities. 

All sets of basis states in Hilbert space and all Hermitian operators have equal status. Therefore 
the very formulation of the problem of quantum nonlocality hinges on ascribing a special role to 
position in ordinary three-dimensional space. The notion of spatial separation in ordinary space 
has of course a fundamental physical significance related to the fact that all known interactions in 
nature decrease rapidly with increasing distance. In discussions of nonlocality it is legitimate to 
take any two localized entities to be sufficiently distant whenever their spatial separation is large 
enough so that they can be regarded as noninteracting. Furthermore, a necessary ingredient in this 
entire issue is the idea of a localized measurement; viz., any event characterising a measurement 
at a certain instant occurs at a definite position where a macroscopic detector is localized. All 
manifestations of nonlocality are ultimately discernible through such localized measurements. 

In its most general form a nonlocal effect is thus specified in terms of the state of an individual 
entity being affected by measurements or by any dynamical intervention in a far away localized 
region of space and time (sufficiently distant but not necessarily a spacelike separation) such that 
no known physical interaction or influence (propagating in ordinary space and time) can causally 
connect occurrences in that space and time region to the system in question. As Bell [53] says : 

It is the requirement of locality, or more precisely that the result of a measurement on one system 
be unaffected by operations on a distant system with which it has interacted in the past, that creates the 
essential difficulty. (our italics) 

Quantum nonlocality differs from nonlocal action entailed by classical nonrelativistic theories 
(such as Newtonian gravity, electrostatics, heat diffusion) in that it implies action at a distance 
that does not diminish in strength with increasing distance. Basically quantum nonlocality is kine- 
matic in nature and pertains to cases where correlation properties embodied in the nonfactorisable 
quantum mechanical wave functions (the kinematic component of the theory) are not fully reproducible 
by a realist theory satisfying the locality condition. This incompatibility is experimentally 
verifiable and it can be theoretically demonstrated by using arguments that do not depend on the 
way a measurement process is described. 


3.1 Bell’s Theorem 

In his famous paper of 1964, Bell [53] formulated a mathematical demonstration of the incompat- 
ibility between quantum theory and a broad class of realist or hidden variable theories satisfying 
the locality condition. Proofs of Bell’s theorem are comprehensively reviewed in various places 
[54-56]. Here we consider a simple version of the proof in the deterministic case. An advantage of 
this proof over Bell’s original proof is that no explicit use is made of the mathematical machinery 
of hidden variable theories (thereby avoiding an assumption about the distribution function of 
hidden variables). 

A typical example of such proofs pertains to two spin-1/2 particles (1 and 2) in a singlet state, 
flying apart from each other so that spin is conserved. (This example was first suggested by Bohm 
[57] for illustrating quantum nonlocality.) Measurements of the components of spins $\vec{S}_1$ and $\vec{S}_2$ 
along different directions are performed on these particles. For any given particle 1, we measure 
either quantity $A$ or $A'$, where $A = 2\vec{a}\cdot\vec{S}_1$ and $A' = 2\vec{a}'\cdot\vec{S}_1$. (The $\vec{a}$, $\vec{a}'$ are unit vectors 
in different directions.) Measured values of $A$, $A'$ are ±1. Similarly for any given particle 2, the 
quantity $B$ or $B'$ is measured, where $B = 2\vec{b}\cdot\vec{S}_2$ and $B' = 2\vec{b}'\cdot\vec{S}_2$ ($B, B' = \pm 1$). 

Now consider the combination $(AB + A'B + AB' - A'B')$. For any given pair of 1 and 2, we 
can measure only one of the product quantities $AB$, $AB'$, $A'B$, $A'B'$. In each case by construction, 
the answer is +1 or -1. The experiment consists in measuring a large number of pairs, with the 
setting on one side (Particle 1) altered between $\vec{a}$ and $\vec{a}'$ and that on the other side (Particle 2) 
between $\vec{b}$ and $\vec{b}'$. So we have a large number of measurements of each of the quantities $AB$, $A'B$, 
$AB'$, and $A'B'$. Basic experimental data are average values of these quantities, which we denote by 
$\langle AB\rangle$, and so on. 

We now make the following apparently innocuous set of assumptions: 

1. Each individual outcome of a measurement is causally determined by supplementary variables 
(the so-called hidden variables) that together with $\psi$ completely specify the state of an individual 
quantum entity (deterministic realist ingredient). 

Note that the notion of hidden variables in the formulation of Bell’s theorem is quite general. 
Hidden variables are regarded as hypothetical parameters determining the outcomes on an 
event-by-event basis. 

2. The value of a hidden variable of any given entity after preparing the initial state is not 
affected by events in distant regions of space and time (locality condition). 

The conjunction of Assumptions 1 and 2 implies that for each particle there is a definite result 
predetermined for any observable that is obtained if we measure the observable. (This is sometimes 
referred to as counterfactual definiteness.) This result is not influenced in any way by measurements 
in the distant regions of space and time. Let us now see how these suppositions lead to a testable 
constraint on correlation functions. (No input from quantum mechanics is used in the ensuing 
argument.) 

Considering the example of two spin-1/2 particles (1 and 2) in a singlet state, it follows from 
assumptions 1 and 2 that the hidden variable associated with each particle 1 fixes definite values 
of both $A$ (+1 or -1) and $A'$ (+1 or -1) that are independent of whether $B$ or $B'$ is measured on 
particle 2. Similarly for each particle 2, predetermined definite values of $B$ and $B'$ are independent 
of whether $A$ or $A'$ is measured. Consequently each particle pair has a value of either +1 or -1 
for each of the quantities $AB$, $A'B$, $AB'$, $A'B'$. For each of the 16 different cases corresponding 
to possible choices of ±1 for each of $A$, $A'$, $B$, $B'$ separately: 

$$AB + AB' + A'B - A'B' = \pm 2 \tag{3.1}$$

Note that Eq.(3.1) refers to a single pair (or a hypothetical group of pairs corresponding to the 
same hidden variable specifying their common initial state). The validity of Eq.(3.1) is ensured 
because both the occurrences of, say, $A$ in Eq.(3.1) have the same value; similar is the case for 
$A'$, $B$, and $B'$ (locality condition is invoked here). 
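
The statement behind Eq.(3.1) can be verified by brute force. Writing the combination as A(B + B') + A'(B - B'), one of the two brackets always vanishes while the other equals ±2. The short Python sketch below (the function and variable names are illustrative, not taken from the text) enumerates all 16 sign assignments and collects the values that occur.

```python
from itertools import product

def bell_combination(A, Ap, B, Bp):
    """AB + AB' + A'B - A'B' for one pair with predetermined values of +1 or -1."""
    return A * B + A * Bp + Ap * B - Ap * Bp

# All 2^4 = 16 assignments of +1 or -1 to (A, A', B, B').
cases = list(product((+1, -1), repeat=4))

# Set of distinct values taken by the combination over all cases.
values = {bell_combination(*case) for case in cases}
```

Only the values +2 and -2 appear, so the ensemble average of the combination cannot exceed 2 in magnitude.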

Summing Eq.(3.1) over the entire ensemble of pairs and taking the average, we obtain 

$$|\langle AB\rangle + \langle AB'\rangle + \langle A'B\rangle - \langle A'B'\rangle| \le 2 \tag{3.2}$$

By virtue of the principle of induction (that the randomly chosen samples of pairs on which 
the quantities such as $AB$, $AB'$ are actually measured are typical of the entire ensemble) we 
can then identify averages such as $\langle AB\rangle$, $\langle AB'\rangle$ with the experimentally measured values of 
these quantities. Thus we have a clear-cut prediction for the actual measured quantities given by 
Eq.(3.2), which is a form of Bell’s inequality. 

The inequality (3.2) is violated by quantum mechanical results for, say, the singlet state where 
$\langle AB\rangle = -\vec{a}\cdot\vec{b} = -\cos\theta$. The maximum violation occurs when all directions are coplanar, 
with the $(\vec{a}', \vec{b}')$ angle equal to $135^\circ$ and the three others to $45^\circ$. Then the left hand side of Bell’s 
inequality (3.2) is $2\sqrt{2}$. It is more instructive to take $\vec{a} = \vec{b}$ with $\vec{a}'$ and $\vec{b}'$ at an angle $\theta$ on 
each side, so that 

$$\vec{a}\cdot\vec{b}' = \vec{a}'\cdot\vec{b} = \cos\theta, \qquad \vec{a}'\cdot\vec{b}' = \cos 2\theta \tag{3.3}$$

Then the left hand side of Bell’s inequality (3.2) becomes $|-2 - 2\cos\theta(1 - \cos\theta)|$, which is 
greater than 2 for any $0 < \theta < 90^\circ$. 
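
The two quantum mechanical evaluations above can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative (the function names are assumptions, not from the text): it uses the singlet correlation, minus the cosine of the angle between the two analyser directions, first for the coplanar arrangement with three pairs at 45 degrees and the (a', b') pair at 135 degrees, and then for the symmetric arrangement a = b with a' and b' at an angle theta on either side.

```python
import math

def singlet_corr(angle):
    """Singlet-state correlation: minus the cosine of the angle between analysers."""
    return -math.cos(angle)

def quantum_bell_lhs(ab, abp, apb, apbp):
    """|<AB> + <AB'> + <A'B> - <A'B'>| from the four relative angles (radians)."""
    return abs(singlet_corr(ab) + singlet_corr(abp)
               + singlet_corr(apb) - singlet_corr(apbp))

# Coplanar arrangement: (a, b), (a, b'), (a', b) at 45 degrees, (a', b') at 135 degrees.
max_violation = quantum_bell_lhs(math.radians(45), math.radians(45),
                                 math.radians(45), math.radians(135))

def quantum_bell_lhs_symmetric(theta):
    """Arrangement a = b, with a' and b' at angle theta on either side of it."""
    return quantum_bell_lhs(0.0, theta, theta, 2.0 * theta)
```

For the symmetric arrangement the result equals 2 + 2 cos(theta)(1 - cos(theta)), which exceeds 2 for angles strictly between 0 and 90 degrees and peaks at 2.5 for theta = 60 degrees, below the coplanar maximum of 2 sqrt(2).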

A remarkable feature of Bell’s theorem is that irrespective of the specifics of realist models, it 
demonstrates that no realist theory satisfying the locality condition can be fully consistent with 
the quantum mechanical formalism - a classic no-go theorem making it an empirically decidable 
issue whether local realism is tenable in relation to quantum mechanics. This is thus a unique 
instance where by appealing to experiments on statistical properties of ensembles, we can draw 
conclusions about assumptions for the putative realist description of individual events. 





The experimental refutation of Bell’s inequality has far reaching implications even for those 
who refuse to speculate about realist theories. This is because the validity of quantum correlation 
functions violating Bell’s inequality for macroscopic separations between particles implies an 
irreducible quantum nonseparability in its most acute form. By this we mean the repudiation of 
the cherished notion of divisibility by thought which underlies all of classical physics; viz., any 
composite physical system can be viewed as composed of elements localized in different regions 
of space, so that knowledge of the full Hamiltonian function and values of relevant basic physical 
quantities associated with each constituent should yield ipso facto an exhaustive knowledge of the 
composite system as a whole. 

Before proceeding further we discuss an example [58] of correlation in classical physics for which 
Bell’s inequality is always satisfied. The example pertains to a classical object, initially at rest, 
disintegrating into two spatially separated fragments 1 and 2 carrying angular momenta $\vec{J}_1$ and 
$\vec{J}_2 = -\vec{J}_1$. The direction of $\vec{J}_1$ varies randomly over an ensemble of such identical objects. For 
each fragment we consider measurement of the projection sign of its angular momentum along a 
certain direction, say, $r_a = \mathrm{sign}(\vec{J}_1\cdot\vec{a})$ for 1 and $r_b = \mathrm{sign}(\vec{J}_2\cdot\vec{b})$ for 2, where $\vec{a}$ and $\vec{b}$ are unit 
vectors. Possible values for $r_a$ and $r_b$ are ±1. 

Obviously if $\vec{a} = \vec{b}$, we always have $r_a = -r_b$ so that $\langle r_a r_b\rangle = -1$. Assuming that the 
direction of $\vec{J}_1$ is uniformly distributed, the probability that $r_a r_b = +1$ is $\theta/\pi$, and the probability 
that $r_a r_b = -1$ is $(1 - \theta/\pi)$, where $\theta$ is the angle between $\vec{a}$ and $\vec{b}$ ($0 \le \theta \le \pi$). It therefore follows 
that $\langle r_a r_b\rangle = 2\theta/\pi - 1$, which always satisfies Bell’s inequality (3.2). For instance the left 
hand side of Eq.(3.2) equals exactly 2 for $\vec{a} = \vec{b}$, $\vec{a}\cdot\vec{b}' = \vec{a}'\cdot\vec{b} = \cos\theta$, and $\vec{a}'\cdot\vec{b}' = \cos 2\theta$ [as 
mentioned earlier in the corresponding quantum case, the left hand side of Eq.(3.2) is larger than 
2 for any $0 < \theta < 90^\circ$], indicating that in this classical example the correlation between the angular 
momenta of the two fragments is as strong as possible for a system that obeys locality. In general 
the quantum correlation is stronger than or equal to classical correlation. The form of the angular 
dependence of the quantum correlation function is crucially responsible for the violation of Bell’s 
inequality. 
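
The classical example can be checked in the same way. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, and the Monte Carlo estimate takes the direction of J1 to be uniform in the plane containing the two analyser directions, a simplification chosen here for brevity. It compares the sampled correlation with the linear form 2 theta/pi - 1 and evaluates the left hand side of Eq.(3.2) for a = b with a' and b' at an angle theta on either side.

```python
import math
import random

def classical_corr(theta):
    """Correlation 2*theta/pi - 1 for analyser directions separated by theta (0 <= theta <= pi)."""
    return 2.0 * theta / math.pi - 1.0

def sampled_corr(theta, n=200_000, seed=0):
    """Monte Carlo estimate, with J1 uniform in the plane of a and b (assumption)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        phi = rng.uniform(0.0, 2.0 * math.pi)             # direction of J1
        r_a = 1 if math.cos(phi) >= 0 else -1             # sign(J1 . a), a along phi = 0
        r_b = 1 if -math.cos(phi - theta) >= 0 else -1    # sign(J2 . b), with J2 = -J1
        total += r_a * r_b
    return total / n

def classical_bell_lhs(theta):
    """|<AB> + <AB'> + <A'B> - <A'B'>| for a = b, a' and b' at theta on each side (2*theta <= pi)."""
    return abs(classical_corr(0.0) + 2.0 * classical_corr(theta)
               - classical_corr(2.0 * theta))
```

Because the classical correlation is linear in the angle, the theta-dependent terms cancel and the combination stays at exactly 2, whereas the cosine dependence of the quantum correlation does not cancel in this way.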

3.1.1 Bell’s Theorem Using Stochastic Hidden Variables 

The demonstration of Bell’s theorem in the preceding section is in terms of deterministic hidden 
variables. Here we discuss its proof based on a general class of stochastic hidden variable models 
satisfying the locality assumption. (For simplicity and because of its direct relevance to experimental 
studies, we follow the Clauser-Horne formulation [59] of the probabilistic approach to local 
realism; a more general formulation is given by Selleri [60].) The basic idea of such hidden variable 
theories is that the complete hidden variable description of the source does not uniquely determine 
measured values of local observables pertaining to correlated particles but only probabilities that 
possible values occur. We can thus assume that individual spin component values along different 
directions evolve in time stochastically (independent of any measurement), with the complete state 
of the source controlling only probabilities that particular values will be revealed when subsequent 
measurements axe performed. 

The following argument depends on a source emitting two correlated entities 1 and 2 in opposite
directions, where two analysers can either transmit or absorb them. The dichotomic choice forced
in this way on each quantum entity can be used to define corresponding dichotomic observables
by prescribing that $A(a) = \pm 1$ [$B(b) = \pm 1$], depending on the choice of transmission (+1) or
absorption (-1), pertaining to 1 (2). This scheme is relevant to actual experiments testing Bell's
inequality with pairs of correlated optical photons emitted in atomic cascades. For such photons
the binary choice is between transmission and absorption in a polarizer.

Let us invoke a variable $\lambda$ to represent the complete physical state of an individual pair of
correlated quantum entities (1 and 2) within a general probabilistic scheme in which $p_1(a,\lambda)$ is the
probability that an individual Particle 1 in the state $\lambda$ crosses the analyser with parameter $a$ and
is subsequently detected; $p_2(b,\lambda)$ is the similar probability for 2; $P(a,b,\lambda)$ represents the joint
probability that both 1 and 2 cross their respective analysers with parameters $a$ and $b$ and both
are detected. The locality condition is expressed by the following factorability condition:





$$P(a,b,\lambda) = p_1(a,\lambda)\,p_2(b,\lambda) \tag{3.4}$$

with the obvious requirement that the hidden variable distribution function $\rho(\lambda)$ corresponding
to the initial joint state of 1 and 2 and the domain of possible values of $\lambda$ do not depend on
parameter choice for the analyzers. Note that the quantities $P(a,b,\lambda)$, $p_1(a,\lambda)$, $p_2(b,\lambda)$ are defined
at the individual level. Observable probabilities at the statistical level (quantum mechanically
computable quantities) are expressible as weighted averages of individual probabilities:

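$$p_1(a) = \int p_1(a,\lambda)\,\rho(\lambda)\,d\lambda \tag{3.5}$$

$$p_2(b) = \int p_2(b,\lambda)\,\rho(\lambda)\,d\lambda \tag{3.6}$$

$$P(a,b) = \int P(a,b,\lambda)\,\rho(\lambda)\,d\lambda \tag{3.7}$$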

To deduce testable constraints on $p_1(a)$, $p_2(b)$, $P(a,b)$ from the locality condition (3.4) using
Eqs. (3.5)-(3.7) we use the following algebraic theorem.

Given real numbers $x_1$, $x_2$, $y_1$, $y_2$, $X$, $Y$ such that

$$0 \le x_1, x_2 \le X, \qquad 0 \le y_1, y_2 \le Y \tag{3.8}$$

it follows that

$$-XY \le x_1 y_1 - x_1 y_2 + x_2 y_1 + x_2 y_2 - x_2 Y - X y_1 \le 0 \tag{3.9}$$

The proof of Eq. (3.9) is straightforward: since the intermediate quantity in Eq. (3.9) is linear in
each of the four variables $x_1$, $x_2$, $y_1$ and $y_2$, we seek its extrema on the boundary of these variables.
Let us take $x_1 = p_1(a_1,\lambda)$, $x_2 = p_1(a_2,\lambda)$, $y_1 = p_2(b_1,\lambda)$, $y_2 = p_2(b_2,\lambda)$ and assume that ontological
probabilities $p_{1,2}$ at the individual level (not directly observable) lie between 0 and 1, so that
$X = Y = 1$.
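Because the middle expression of Eq. (3.9) is linear in each variable separately, its extrema over the allowed box are attained at the corners, where each $x_i$ is 0 or $X$ and each $y_j$ is 0 or $Y$. The sketch below, with illustrative function names that are not from the original text, enumerates the sixteen corners and returns the smallest and largest values found.

```python
from itertools import product


def ch_expression(x1, x2, y1, y2, X, Y):
    """Middle expression of Eq. (3.9)."""
    return x1 * y1 - x1 * y2 + x2 * y1 + x2 * y2 - x2 * Y - X * y1


def corner_extrema(X, Y):
    """Minimum and maximum of the expression over the 16 corners of the box
    0 <= x1, x2 <= X, 0 <= y1, y2 <= Y."""
    values = [
        ch_expression(x1, x2, y1, y2, X, Y)
        for x1, x2 in product((0.0, X), repeat=2)
        for y1, y2 in product((0.0, Y), repeat=2)
    ]
    return min(values), max(values)
```

For $X = Y = 1$ the corner values are all either $-1$ or $0$, which reproduces the bounds $-XY$ and $0$ of Eq. (3.9).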

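Then Eq. (3.9) reduces to

$$-1 \le p_1(a_1,\lambda)p_2(b_1,\lambda) - p_1(a_1,\lambda)p_2(b_2,\lambda) + p_1(a_2,\lambda)p_2(b_1,\lambda) + p_1(a_2,\lambda)p_2(b_2,\lambda) - p_1(a_2,\lambda) - p_2(b_1,\lambda) \le 0 \tag{3.10}$$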

By invoking the locality condition (3.4) we can write

$$-1 \le P(a_1,b_1,\lambda) - P(a_1,b_2,\lambda) + P(a_2,b_1,\lambda) + P(a_2,b_2,\lambda) - p_1(a_2,\lambda) - p_2(b_1,\lambda) \le 0 \tag{3.11}$$

Multiplying by the probability density $\rho(\lambda)$, integrating over $\lambda$, and using Eqs. (3.5)-(3.7), we
obtain

$$-1 \le P(a_1,b_1) - P(a_1,b_2) + P(a_2,b_1) + P(a_2,b_2) - p_1(a_2) - p_2(b_1) \le 0 \tag{3.12}$$

which is a form of Bell's inequality known as the inhomogeneous inequality, since it is based on
both double and single detection probabilities.

An important ingredient in the preceding proof is the condition that the probabilities pertaining
to the distribution of hidden variables are positive, and not larger than 1. We may argue
that since such probabilities are not directly observable, it should not be objectionable to invoke
negative probabilities for hidden variables. Then of course Bell's inequality is no longer derivable for
stochastic hidden variable models. In fact there are explicit examples [61,62] of stochastic hidden
variable models that reproduce quantum mechanical violations of Bell's inequality at the expense
of allowing negative probabilities.

General arguments [63,64] show that we can always reproduce quantum mechanical results for
nonfactorable state vectors of correlated systems by stochastic hidden variable models using
negative probabilities. However, we stress that to formulate hidden variable models, it is conceptually
illegitimate to use nonphysical negative probabilities even though computed testable probabilities
are ensured to be positive and physically meaningful. This is because, within the framework of a
hidden variable model, quantities interpreted as probabilities at the hidden (individual) level have
the same physical and ontological status as probabilities actually measurable at the ensemble level
through frequencies of repeated events.

3.1.2 Different Local Realist Inequalities

Bell's inequality is one of many inequalities that can be deduced from local realism.

The first set of inequalities providing physical restrictions not contained in Bell's inequality
was given by Roy and Singh [65]. They showed, for example, that local realism implies

$$\sum_{i,j} C_{ij}\,P(a_i,b_j) \le 6 \tag{3.13}$$





where 

/II 1 0 1 \ 



V 1 -1 0 0 0 J 


Suppose for example that

$$P(a_i,b_j) = \tfrac{1}{3}\left[2 + C_{ij}(1 - C_{ij})\right] \tag{3.14}$$

so that $P(a_i,b_j) = 0$ if $C_{ij} = -1$ and $P(a_i,b_j) = 2/3$ otherwise. We see that Eq. (3.13) is
violated, since its left-hand side equals 20/3.

Any Bell combination of four of these correlation functions can instead take only the values
6/3, 4/3, or 0; thus none of the corresponding inequalities is violated. The set of 20 values
(3.14) of $P(a_i,b_j)$ violates local realism, although it satisfies all conceivable Bell inequalities.
Garuccio and Selleri [66] showed that, given numerical coefficients $C_{ij}$ that are real but otherwise
arbitrary and correlation functions $P(a_i,b_j)$ with $i = 1,\ldots,n$ and $j = 1,\ldots,m$, local realism
implies

$$-M_0 \le \sum_{i=1}^{n}\sum_{j=1}^{m} C_{ij}\,P(a_i,b_j) \le M_0 \tag{3.15}$$

with

$$M_0 = \max_{\xi,\eta}\left(\sum_{i}\sum_{j} C_{ij}\,\xi_i\,\eta_j\right) \tag{3.16}$$

where among all possible choices of the sign factors $\xi_i = \pm 1$ ($i = 1,\ldots,n$) and $\eta_j = \pm 1$
($j = 1,\ldots,m$) we must take the one corresponding to the maximum value of the quantity within
parentheses in Eq. (3.16). A detailed account of this development with various proofs is given in Garuccio
[67].
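For small coefficient matrices the bound $M_0$ of Eq. (3.16) can be found by direct enumeration of the sign factors. The following sketch is illustrative only: the function name and the example matrix (the familiar Bell-type combination with one minus sign) are assumptions, not taken from Refs. [66,67]. The cost grows as $2^{n+m}$, so it is suitable only for small $n$ and $m$.

```python
from itertools import product


def local_realist_bound(C):
    """Brute-force M_0 = max over xi_i, eta_j = +/-1 of sum_ij C_ij xi_i eta_j."""
    n, m = len(C), len(C[0])
    best = None
    for xi in product((1, -1), repeat=n):
        for eta in product((1, -1), repeat=m):
            value = sum(C[i][j] * xi[i] * eta[j]
                        for i in range(n) for j in range(m))
            if best is None or value > best:
                best = value
    return best


# Hypothetical example: coefficients of a Bell-type combination
# P(a1,b1) - P(a1,b2) + P(a2,b1) + P(a2,b2).
BELL_COEFFICIENTS = [[1, -1],
                     [1, 1]]
```

For `BELL_COEFFICIENTS` the enumeration gives $M_0 = 2$, the usual local realist bound for that combination of correlation functions.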

3.1.3 Quantum Violation of Local Realism in the Macroscopic Limit 

The study of quantum nonlocality embodied in many-particle entangled states assumes an additional
significance when examining whether such nonlocality persists in the appropriate macrolimit.
Mermin [68] initiated investigations on this topic, followed by Roy and Singh [69], who showed that
the quantum mechanically predicted violation of the N-particle local realist inequality persists for
large values of N; in fact, the magnitude of such a violation grows exponentially for large N.

Home and Majumdar [70] showed that there exist entangled N-particle states (each particle 
having arbitrary spin-j) whose quantum mechanical expectation value of a hermitian operator 
exceeds the classical realist bound; the violation persists for large values of N and j. All such works 
help to reinforce the view that if a wave function preserves its quantum character in the macroscopic 
domain, even large quantum numbers do not guarantee the disappearance of nonclassical effects. 
In the present case the many-particle entanglement in a wave function is responsible for large scale 
violation of classical realism in the domain where both the relevant parameters, quantum number 
j and number of particles N are arbitrarily large. 

Note that the type of macroscopic quantum effect alluded to in this section which shows ir- 
reducible quantum nonlocality in the appropriate macrolimit is different from the macroscopic 
quantum effects stemming from a coherent superposition of macroscopically distinguishable states 
where each state is an entangled many-body wave function corresponding to the state of a macro- 
scopic object as a whole. 

3.1.4 Experimental Tests : Present Status and Some Recent Suggestions 

When assessing the experimental tests of Bell-type inequalities, we must distinguish between the 
following two types of inequalities, 

* Weak Inequality : An inequality deduced from local realism alone, which is incompatible with 
quantum mechanical results in the case of ideal experiments with highly efficient detectors. 

* Strong Inequality : An inequality deduced from local realism with an additional assumption, 
which is violated by quantum mechanical results in the case of actual experiments using the 
available low-efficiency detectors. 

Considering typical experimental arrangements employed when testing correlations between
polarizations of photons, a standard form of Bell's inequality for this purpose is given by Eq. (3.12).





This inequality follows from local realism without an additional assumption. However, inequalities of
the type in Eq. (3.12) are not violated by the quantum mechanical predictions for available
photomultiplier detectors in the visible wavelength range, with low efficiencies of about 20-30%. For this reason
one common procedure (from Clauser and Horne [59]) is to invoke the following supplementary
assumption: for every photon in the state specified by a hidden variable $\lambda$, the probability of
detection with a polarizer placed before a detector is less than or equal to the detection probability
with the polarizer removed. (This is often referred to as the "nonenhancement" assumption.)

Note that this is a nontrivial postulate at the level of individual photons with a particular $\lambda$
(for an in-depth analysis of the nontrivial nature of this hypothesis in the context of local hidden
variable models, see Selleri [71], Ferrero et al. [72], and Santos [73]), though it is of course true at
the observable level of statistical results when averaged over all $\lambda$s.

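This additional hypothesis means the following inequalities

$$p_1(a_1,\lambda) \le p_1(\infty,\lambda), \qquad p_1(a_2,\lambda) \le p_1(\infty,\lambda),$$

$$p_2(b_1,\lambda) \le p_2(\infty,\lambda), \qquad p_2(b_2,\lambda) \le p_2(\infty,\lambda) \tag{3.17}$$

where the symbol $\infty$ indicates that the polarizer was removed.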

We use Eq. (3.9) with $X = p_1(\infty,\lambda)$, $Y = p_2(\infty,\lambda)$ to obtain the following version of the strong
inequality

$$-D_0 \le P(a_1,b_1) - P(a_1,b_2) + P(a_2,b_1) + P(a_2,b_2) - P(a_2,\infty) - P(\infty,b_1) \le 0 \tag{3.18}$$

where $D_0$ is the double-detection probability in the absence of intervening polarizers and
$P(a_2,\infty)$, $P(\infty,b_1)$ are measured probabilities with the second and the first polarizer removed,
respectively. At this stage a numerical example helps us appreciate the difference between
inequalities (3.12) and (3.18).

We define

$$\Gamma = P(a_1,b_1) - P(a_1,b_2) + P(a_2,b_1) + P(a_2,b_2) \tag{3.19}$$

Then the inequalities (3.12) and (3.18) can, respectively, be written as

$$-1 + p_1(a_2) + p_2(b_1) \le \Gamma \le p_1(a_2) + p_2(b_1) \tag{3.20}$$

$$-D_0 + P(a_2,\infty) + P(\infty,b_1) \le \Gamma \le P(a_2,\infty) + P(\infty,b_1) \tag{3.21}$$

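These should be compared with the quantum mechanical results

$$P(a_1,b_1) = \tfrac{1}{4}\,\eta_1\eta_2\left[\epsilon_+^1\epsilon_+^2 + \epsilon_-^1\epsilon_-^2\cos 2(a_1 - b_1)\right] \tag{3.22}$$

$$p_1(a_1) = \tfrac{1}{2}\,\eta_1\epsilon_+^1, \qquad p_2(b_1) = \tfrac{1}{2}\,\eta_2\epsilon_+^2,$$

$$D_0 = \eta_1\eta_2, \qquad P(a_2,\infty) = \tfrac{1}{2}\,\eta_1\eta_2\epsilon_+^1, \qquad P(\infty,b_1) = \tfrac{1}{2}\,\eta_1\eta_2\epsilon_+^2 \tag{3.23}$$

where $\eta_1$, $\eta_2$ are efficiencies of the two photomultiplier detectors on both sides and $\epsilon_\pm^1$, $\epsilon_\pm^2$ are
parameters determining efficiencies of polarizers 1 and 2 on both sides corresponding to binary
choices between transmission (+) and absorption (-) of a photon when passing through a polarizer.
Recalling that the typical numerical values of experimental parameters are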

4 ~ 4 ~ ^ ^ 


$$\eta_1 = \eta_2 \approx 0.2\text{-}0.3 \tag{3.24}$$

Inequalities (3.20) and (3.21) reduce, respectively, to

$$-0.8 \le \Gamma \le 0.2 \tag{3.25}$$

$$0 \le \Gamma \le 0.08 \tag{3.26}$$

Local realism gives the set of possible values of $\Gamma$ a spread of about 1.0, while the additional
assumption reduces this figure to about 0.08. Using Eq. (3.19), quantum mechanical results (3.22), and
actual experimental parameters (3.24), we can compute $(\Gamma)_{\min}$, which leads to $(\Gamma)_{\min} = -0.00138$.
This is consistent with the weak inequality (3.25), but it violates the strong inequality (3.26). It is
therefore arguable that this observed disagreement stems essentially from the additional postulate
used in deriving the strong inequality, a postulate that could then be interpreted as inapplicable at
the level of hidden variables $\lambda$. In fact explicit local hidden variable models (albeit ad hoc) have been
formulated that reproduce quantum mechanical results (3.22) and (3.23) for the actual experimental
parameters (3.24) but at the expense of violating the additional assumption [74, 75].
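The comparison between the weak and strong inequalities can be sketched numerically. The code below is an illustration under stated assumptions, not a reproduction of the figure quoted above: it assumes the Clauser-Horne-type form $P(a,b) = \tfrac{1}{4}\eta_1\eta_2[\epsilon_+^1\epsilon_+^2 + \epsilon_-^1\epsilon_-^2\cos 2(a-b)]$ with identical polarizers on both sides, single rates $\tfrac{1}{2}\eta\epsilon_+$, $D_0 = \eta_1\eta_2$, and purely hypothetical parameter values ($\eta = 0.2$, $\epsilon_+ = 1$, $\epsilon_- = 0.9$), so its output should not be expected to match $-0.00138$. The analyser angles are scanned only over the one-parameter family $a_1 = 3\varphi$, $b_1 = 2\varphi$, $a_2 = \varphi$, $b_2 = 0$.

```python
import math


def joint_probability(a, b, eta1, eta2, eps_plus, eps_minus):
    """Assumed form: P(a,b) = (eta1*eta2/4) * [eps_+^2 + eps_-^2 * cos 2(a-b)]."""
    return 0.25 * eta1 * eta2 * (eps_plus ** 2
                                 + eps_minus ** 2 * math.cos(2.0 * (a - b)))


def gamma(a1, a2, b1, b2, **params):
    """Gamma of Eq. (3.19)."""
    P = lambda a, b: joint_probability(a, b, **params)
    return P(a1, b1) - P(a1, b2) + P(a2, b1) + P(a2, b2)


def scan_gamma_min(steps=1440, **params):
    """Minimum of Gamma over the family a1=3*phi, b1=2*phi, a2=phi, b2=0."""
    return min(gamma(3 * phi, phi, 2 * phi, 0.0, **params)
               for phi in (math.pi * k / steps for k in range(steps)))


def lower_bounds(eta1, eta2, eps_plus, eps_minus):
    """Lower bounds of the weak (3.20) and strong (3.21) inequalities."""
    weak = -1.0 + 0.5 * eta1 * eps_plus + 0.5 * eta2 * eps_plus
    strong = -eta1 * eta2 + eta1 * eta2 * eps_plus
    return weak, strong


# Hypothetical parameter values chosen only for illustration.
PARAMS = dict(eta1=0.2, eta2=0.2, eps_plus=1.0, eps_minus=0.9)
```

With these assumed values, `scan_gamma_min(**PARAMS)` lies above the weak lower bound but below the strong one, which illustrates the qualitative point of the text: low detector efficiencies leave the weak inequality satisfied while the strong inequality is violated.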





Since this assumption at the level of unobservable $\lambda$s cannot be tested directly, one way of
making further progress is to subject its various implications to appropriate scrutiny.

Apart from using atomic cascade sources, a number of tests of Bell's inequality have been
performed [76,77] using the parametric down-conversion technique for producing photon pairs
correlated in polarization. (A laser beam produces a pair of degenerate down-converted photons in
a nonlinear crystal of potassium dihydrogen phosphate.) Again these experiments have the same
low-efficiency problem mentioned earlier.

Because of the importance of these conceptual issues we need improved tests of Bell-type
inequalities based on various quantum systems. Examples involving the decay of a $J^{PC} = 1^{--}$ state
(for example, the spin-1 $\phi$ resonance or the spin-1 $\Upsilon(4S)$ vector meson) into a pair of neutral pseudoscalar
meson-antimeson $M^0\bar{M}^0$ (for example, $K^0\bar{K}^0$ or $B^0\bar{B}^0$) have been investigated by various authors [78-83].

3.2 Demonstration of Quantum Nonlocality without Using Bell-Type Inequalities

Greenberger, Horne, and Zeilinger [84] (henceforth GHZ) initiated a new line of study of quantum
nonlocality by demonstrating an incompatibility between quantum mechanics and local realism in
a nonstatistical way without using Bell-type inequalities. In its simplest version, the argument
develops by using a system of three spin-1/2 particles, mutually correlated and spatially
separated; on each of these, measurements of x or y spin components are considered. Let the composite
state be represented by (referring to z-axis components)

$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|1,1,1\rangle - |-1,-1,-1\rangle\right) \tag{3.27}$$

Here $|1,1,1\rangle$ and $|-1,-1,-1\rangle$ denote states for which the $\sigma_z$ eigenvalues of the three particles
are all $+1$ or all $-1$, respectively. It is easily verifiable by direct calculation that $|\Psi\rangle$ given by
Eq. (3.27) satisfies the following eigenvalue equations

$$\sigma_x^1\sigma_x^2\sigma_x^3\,|\Psi\rangle = -|\Psi\rangle \tag{3.28a}$$

$$\sigma_x^1\sigma_y^2\sigma_y^3\,|\Psi\rangle = |\Psi\rangle \tag{3.28b}$$

$$\sigma_y^1\sigma_x^2\sigma_y^3\,|\Psi\rangle = |\Psi\rangle \tag{3.28c}$$

$$\sigma_y^1\sigma_y^2\sigma_x^3\,|\Psi\rangle = |\Psi\rangle \tag{3.28d}$$

where superscripts 1-3 designate the particles, respectively. In this example it is possible to
determine any $\sigma_x^i$ or $\sigma_y^i$ by distant measurements on the other two particles (e.g., to know $\sigma_x^1$,
we need only measure $\sigma_y^2$ and $\sigma_y^3$ at distant locations). Applying the notion of local realism it may
thus seem legitimate to assume that observed individual values of $\sigma$-operators are predetermined
(by, say, hidden variables) and that any such individual value is independent of whichever sets of
three single-particle spin measurements we choose to make on these spatially separated particles;
we call these values $m_x^i$, $m_y^i$.

Consistent with Eqs. (3.28a-d) we then have the following relations for any set of three such
correlated particles whose $m$ values are determined by fixed hidden variables:

$$m_x^1 m_x^2 m_x^3 = -1 \tag{3.29a}$$

$$m_x^1 m_y^2 m_y^3 = +1 \tag{3.29b}$$

$$m_y^1 m_x^2 m_y^3 = +1 \tag{3.29c}$$

$$m_y^1 m_y^2 m_x^3 = +1 \tag{3.29d}$$

Note that in the preceding relations the notion of local realism implies that the individual value
of any one of the quantities $m_x^i$, $m_y^i$ is the same irrespective of the equation in which it occurs [e.g.,
the value of $m_x^1$ is the same wherever it appears in Eqs. (3.29a-d)]. Then Eqs. (3.29a-d) are not mutually
consistent, since multiplying Eqs. (3.29b-d) together we obtain (recall that $m_y^i = \pm 1$, so that $(m_y^i)^2 = 1$):

$$m_x^1 m_x^2 m_x^3 = +1$$

which contradicts Eq. (3.29a). This demonstrates that for multiparticle states of the type in 
Eq.(3.27) , quantum mechanical predictions are incompatible with local realism. The chief merit 
of this form of argument is its nonstatistical nature : We are not concerned with measurement 
statistics involving questions about the size of the relevant ensemble, statistical fluctuations, and 
so on. The GHZ argument reveals that the notion of local realism is inconsistent with quantum 





mechanics even in the case of perfect correlations (i.e., even at the maximum correlation angles
$\theta = 0$ and $\theta = \pi$ for which quantum mechanics makes nonprobabilistic, definitive predictions
regarding the correlation properties).
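Both halves of the GHZ argument can be checked by direct enumeration. The sketch below uses illustrative names and is not taken from Ref. [84]. It first searches all $2^6$ assignments of predetermined values $m_x^i, m_y^i = \pm 1$ for one that satisfies the four product relations with right-hand sides $-1, +1, +1, +1$, and then evaluates the action of the corresponding products of Pauli operators on the state $(|1,1,1\rangle - |-1,-1,-1\rangle)/\sqrt{2}$, using $\sigma_x|s\rangle = |-s\rangle$ and $\sigma_y|s\rangle = i s\,|-s\rangle$ for $s = \pm 1$.

```python
from itertools import product


def local_value_assignments():
    """All (mx1, mx2, mx3, my1, my2, my3) in {+1,-1}^6 obeying the four relations."""
    return [
        m for m in product((+1, -1), repeat=6)
        if m[0] * m[1] * m[2] == -1      # mx1 mx2 mx3 = -1
        and m[0] * m[4] * m[5] == +1     # mx1 my2 my3 = +1
        and m[3] * m[1] * m[5] == +1     # my1 mx2 my3 = +1
        and m[3] * m[4] * m[2] == +1     # my1 my2 mx3 = +1
    ]


def apply_pauli_string(ops, state):
    """Apply a product of sigma_x / sigma_y operators, e.g. "xyy", to a state
    stored as {(s1, s2, s3): amplitude} with s_i = +/-1 the sigma_z eigenvalues."""
    out = {}
    for basis, amp in state.items():
        factor = 1
        for op, s in zip(ops, basis):
            if op == "y":
                factor *= 1j * s          # sigma_y |s> = i s |-s>
        flipped = tuple(-s for s in basis)  # both sigma_x and sigma_y flip the spin
        out[flipped] = out.get(flipped, 0) + amp * factor
    return out


def ghz_eigenvalue(ops):
    """Eigenvalue of the operator product `ops` on the GHZ state of the text."""
    r = 2 ** -0.5
    ghz = {(1, 1, 1): r, (-1, -1, -1): -r}
    out = apply_pauli_string(ops, ghz)
    ratios = [out[key] / ghz[key] for key in ghz]
    if abs(ratios[0] - ratios[1]) > 1e-12:
        raise ValueError("state is not an eigenstate of this operator")
    return ratios[0]
```

The list returned by `local_value_assignments()` is empty, while `ghz_eigenvalue` gives $-1$ for `"xxx"` and $+1$ for `"xyy"`, `"yxy"`, and `"yyx"`: quantum mechanics realizes exactly the pattern of perfect correlations that no set of predetermined local values can reproduce.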

Bell-type arguments demonstrate an incompatibility only for imperfect correlations, where the
specific form of the correlation as a function of the relative orientation between measurement
axes plays a central role. The GHZ argument is also concerned with measured values only along
orthogonal axes (say, x or y spin components), whereas we require measurements along various
nonorthogonal directions to demonstrate quantum violation of Bell's inequality.

There exist other arguments as well which demonstrate, independent of Bell-type inequalities, 
that local realism and quantum mechanics are mutually incompatible. See, for example, [85]. 

3.3 Quantum Teleportation 

A recent scheme suggested by Bennett et al. [86] illustrates quantum nonlocality in a rather striking
way involving an interplay between entanglement of states (kinematics) and the measurement-induced
effect on such an entanglement. This approach enables the state of a given particle to be
prepared identical to the state of a distant particle with which no direct interaction occurs. Of late,
this curious way of transporting the quantum state of an object without requiring the object
itself to be transported is attracting much attention, particularly because of recent claims to have
experimentally realized this process [87-89].

Classically, moving an object means moving all the particles it is made of. However, an "object"
in quantum mechanics is solely characterized by the quantum state of the particles it is made of.
Thus reconstructing the quantum state of a particle out of the state of a distant particle is a form
of "transportation" of an "object" in a quantum mechanical sense.

In the method of quantum teleportation, a pair of given quantum systems (designated by, say,
2 and 3) is known to be prepared in an entangled state D and distributed between two spatially
separated partners, say, Alice and Bob. Suppose Alice also has a test system (designated by, say,
1) in a particular state that may be unknown to her. This approach transfers the state of System
1 to Bob's System 3, rather than System 1 itself. The correlated state D of Systems 2 and 3 provides
the quantum channel crucial for this process. Of course, we cannot preserve the original state of 1
intact; it changes in the process. Also note that this form of teleportation of a quantum state
cannot take place instantaneously or across spacelike separation because, apart from the quantum
correlation, it requires sending a classical message (say, a telephone call) from Alice to Bob.

We explain the method by taking all three systems concerned to be spin-1/2 particles and D
to be the singlet state given by

$$|\psi_{23}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_2|\downarrow\rangle_3 - |\downarrow\rangle_2|\uparrow\rangle_3\right) \tag{3.33}$$

where $|\uparrow\rangle_i$ and $|\downarrow\rangle_i$ are eigenstates of the spin operator $\sigma_z^i$ and subscripts 2 and 3 label
particles that are with Alice and Bob, respectively. The state of Particle 1 with Alice, which is to
be transported, is denoted by $|\phi\rangle_1$, written as

$$|\phi\rangle_1 = a\,|\uparrow\rangle_1 + b\,|\downarrow\rangle_1 \tag{3.34}$$

where $|a|^2 + |b|^2 = 1$. $|\phi\rangle_1$ is supposed to be unknown.

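The initial state of the entire system comprising Particle 1 and the correlated pair 2+3 is given
by

$$|\psi_{123}\rangle = |\phi\rangle_1\,|\psi_{23}\rangle \tag{3.35}$$

Using Eqs. (3.33) and (3.34), we can rewrite Eq. (3.35) as:

$$|\psi_{123}\rangle = \frac{a}{\sqrt{2}}\left(|\uparrow\rangle_1|\uparrow\rangle_2|\downarrow\rangle_3 - |\uparrow\rangle_1|\downarrow\rangle_2|\uparrow\rangle_3\right) + \frac{b}{\sqrt{2}}\left(|\downarrow\rangle_1|\uparrow\rangle_2|\downarrow\rangle_3 - |\downarrow\rangle_1|\downarrow\rangle_2|\uparrow\rangle_3\right) \tag{3.36}$$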

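Next we write each direct product state of 1 and 2 in terms of the following basis vectors:

$$|\psi_{12}^{(+)}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1|\downarrow\rangle_2 + |\downarrow\rangle_1|\uparrow\rangle_2\right) \tag{3.37a}$$

$$|\psi_{12}^{(-)}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1|\downarrow\rangle_2 - |\downarrow\rangle_1|\uparrow\rangle_2\right) \tag{3.37b}$$

$$|\phi_{12}^{(+)}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1|\uparrow\rangle_2 + |\downarrow\rangle_1|\downarrow\rangle_2\right) \tag{3.37c}$$

$$|\phi_{12}^{(-)}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1|\uparrow\rangle_2 - |\downarrow\rangle_1|\downarrow\rangle_2\right) \tag{3.37d}$$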





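which form a complete orthonormal basis for 1 and 2. Using Eqs. (3.37a-d), $|\psi_{123}\rangle$ given by
Eq. (3.36) can be expressed as:

$$|\psi_{123}\rangle = \frac{1}{2}\Big[\left(-a|\uparrow\rangle_3 + b|\downarrow\rangle_3\right)|\psi_{12}^{(+)}\rangle + \left(-a|\uparrow\rangle_3 - b|\downarrow\rangle_3\right)|\psi_{12}^{(-)}\rangle + \left(a|\downarrow\rangle_3 - b|\uparrow\rangle_3\right)|\phi_{12}^{(+)}\rangle + \left(a|\downarrow\rangle_3 + b|\uparrow\rangle_3\right)|\phi_{12}^{(-)}\rangle\Big] \tag{3.38}$$

If Alice performs measurements on the joint system of 1 and 2 to distinguish between the states
$|\psi_{12}^{(\pm)}\rangle$, $|\phi_{12}^{(\pm)}\rangle$, then the final combined state is given by

$$|\Psi\rangle = \frac{1}{2}\Big[\left(-a|\uparrow\rangle_3 + b|\downarrow\rangle_3\right)|\psi_{12}^{(+)}\rangle|A_1\rangle + \left(-a|\uparrow\rangle_3 - b|\downarrow\rangle_3\right)|\psi_{12}^{(-)}\rangle|A_2\rangle + \left(a|\downarrow\rangle_3 - b|\uparrow\rangle_3\right)|\phi_{12}^{(+)}\rangle|A_3\rangle + \left(a|\downarrow\rangle_3 + b|\uparrow\rangle_3\right)|\phi_{12}^{(-)}\rangle|A_4\rangle\Big] \tag{3.39}$$

where $|A_1\rangle$, $|A_2\rangle$, $|A_3\rangle$, $|A_4\rangle$ are mutually orthogonal apparatus states corresponding to different
distinguishable outcomes; Eq. (3.39) is thus an incoherent sum of four terms, thereby indicating
disentanglement of the state $|\psi_{23}\rangle$ given by Eq. (3.33). It therefore follows that for each measurement
outcome, Bob's Particle 3 is left in one of the four equally likely states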



$$\begin{pmatrix} -a \\ b \end{pmatrix}, \quad \begin{pmatrix} -a \\ -b \end{pmatrix}, \quad \begin{pmatrix} -b \\ a \end{pmatrix}, \quad \begin{pmatrix} b \\ a \end{pmatrix} \tag{3.40}$$

Recalling that the state $|\phi\rangle_1$ in Eq. (3.34) to be transported is given by $\begin{pmatrix} a \\ b \end{pmatrix}$, each of
the states in Eq. (3.40) is unitarily related to $\begin{pmatrix} a \\ b \end{pmatrix}$ by a rotation of $\pi$ around the x, y, or z axis. [In
the case corresponding to the apparatus state $|A_2\rangle$, no such rotation is required because the state
of 3 is the same as Eq. (3.34) except for an irrelevant phase factor.] Thus whatever be the outcome
of the measurement performed by Alice, teleportation of the state to Particle 3 with Bob is achieved in
each case by transmitting the measurement result to Bob through an ordinary classical channel,
such as by a telephone call. Afterward Bob applies the required rotation to transform the state of
his particle into a replica of the original state (3.34) of Particle 1. At the end of this process, Alice
has no trace of the state in Eq. (3.34); instead Particles 1 and 2 are left in one of the joint states
$|\psi_{12}^{(\pm)}\rangle$ or $|\phi_{12}^{(\pm)}\rangle$.
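The bookkeeping of the teleportation scheme can be verified with a few lines of arithmetic. The sketch below rests on assumptions and is not the procedure of Ref. [86] verbatim: the function names are illustrative, and the assignment of Pauli corrections to outcomes ($\sigma_z$, identity, $\sigma_y$, $\sigma_x$) is worked out here from the decomposition of $|\phi\rangle_1|\psi_{23}\rangle$ in the Bell basis of particles 1 and 2. For each of the four outcomes it computes the unnormalised state left with Bob, its probability, and the overlap with the original state after the correction.

```python
import math

UP, DOWN = 0, 1
R = 1.0 / math.sqrt(2.0)

# Bell basis of particles 1 and 2 as amplitudes over (s1, s2); all amplitudes real.
BELL = {
    "psi+": {(UP, DOWN): R, (DOWN, UP): R},
    "psi-": {(UP, DOWN): R, (DOWN, UP): -R},
    "phi+": {(UP, UP): R, (DOWN, DOWN): R},
    "phi-": {(UP, UP): R, (DOWN, DOWN): -R},
}

# Pauli matrices acting on the column vector (up amplitude, down amplitude).
SIGMA = {
    "I": ((1, 0), (0, 1)),
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
}

# Correction Bob applies for each outcome (derived here, see lead-in).
CORRECTION = {"psi+": "z", "psi-": "I", "phi+": "y", "phi-": "x"}


def initial_state(a, b):
    """|phi>_1 times the singlet of particles 2 and 3, as {(s1, s2, s3): amplitude}."""
    return {
        (UP, UP, DOWN): a * R, (UP, DOWN, UP): -a * R,
        (DOWN, UP, DOWN): b * R, (DOWN, DOWN, UP): -b * R,
    }


def bob_state(state, bell):
    """Unnormalised state of particle 3 when particles 1, 2 are found in `bell`."""
    out = [0j, 0j]
    for (s1, s2, s3), amp in state.items():
        out[s3] += BELL[bell].get((s1, s2), 0.0) * amp
    return out


def apply_matrix(matrix, vec):
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


def teleport(a, b):
    """Return {outcome: (probability, fidelity after correction)}."""
    state = initial_state(a, b)
    results = {}
    for bell, name in CORRECTION.items():
        raw = bob_state(state, bell)
        prob = abs(raw[0]) ** 2 + abs(raw[1]) ** 2
        fixed = apply_matrix(SIGMA[name], raw)
        overlap = a.conjugate() * fixed[0] + b.conjugate() * fixed[1]
        results[bell] = (prob, abs(overlap) ** 2 / prob)
    return results
```

For any normalised pair `(a, b)`, each outcome occurs with probability 1/4 and the corrected state of particle 3 has unit fidelity with the original state, independently of `a` and `b`; this is why Alice needs no knowledge of the state she teleports.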

Note that in the argument demonstrating the possibility of quantum teleportation it is not
necessary to assume wave function collapse, although for convenience many authors discuss teleportation
by explicitly invoking the hypothesis of collapse. Incoherence between the four terms in Eq. (3.39)
(due to mutual orthogonality of the apparatus states) suffices to ensure that Particle 3 is effectively
left in one of the required states, enabling the teleportation to be accomplished. We emphasize that
quantum correlations embodied in the joint wave function of 2 and 3, together with appropriately
chosen measurements on 1 and 2, result in disentanglement of the entangled wave function of 2
and 3, which leads to the predicted teleportation of a quantum state.


4 Concluding Remarks

For the sake of brevity we have refrained from discussing here a number of significant areas of
research on the foundational issues of quantum mechanics. For example, we have not dwelled on
the recent investigations throwing new light on wave-particle duality and Bohr's complementarity
principle, or on theoretical and experimental studies uncovering new types of quantum effect such as the
Quantum Zeno Effect, which has far-reaching conceptual ramifications. For elaborate discussions
of these topics see the relevant chapters in Home's book [8].

Studies in recent years have not only sharpened our understanding of the conceptual inadequa- 
cies of the standard model of quantum mechanics but, more importantly, a number of alternative 
approaches have been developed which seek to alleviate deficiencies of the standard interpretation. 
As discussed in this article, such nonstandard schemes are not just qualitative, but are also subject 
to precise quantitative formulations. The thrust area at this stage is to investigate the possibility 
of their testable predictions beyond the standard framework and whether some of these models 





may be empirically discriminated from one another. Of course, conceptual as well as
mathematical refinements of these schemes are necessary in order to reach a stage when decisive empirical
judgements will be possible.

Studies to this end not only deepen our understanding of the "vulnerable" areas of quantum
mechanics but also reveal hitherto unexplored facets of the theory. For instance, the empirically
relevant manifestations of quantum entanglement that have been discovered in connection with quantum
nonlocality have of late acquired considerable significance in view of their applications in the context
of cryptography, quantum communications and quantum computing [90]. Novel approaches have
been developed to gain fresh insights into the quantum measurement problem which will make it
empirically investigable, such as biomolecular systems being used as mesoscopic quantum measuring
devices [91]. Ingenious new models of quantum mechanics have also been proposed which have the
potential to yield predictions not contained within the standard framework [92].

To conclude, we may quote Bell [93]: "It seems to me possible that the continuing anxiety
about what quantum mechanics means or entails will lead to still more tricky experiments which
will eventually find some soft spot."


References 

[1] A.Shimony in The New Physics, edited by P.C.W. Davies, Cambridge University Press, Cam- 
bridge (1989). 

[2] J.S.Bell, in Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press,
Cambridge, 1987), pp. 120-21. 

[3] A.J.Leggett, in Quantum Implications (B.J. Hiley and F.D. Peat, eds.) (Routledge & Kegan Paul,
London, 1987), pp. 85-104. 

[4] B.d’Espagnat, Veiled Reality - An Analysis of Present -Day Quantum Mechanical Concepts 
(Addison - Wesley, Reading, 1994). 

[5] J.T.Cushing, Quantum Mechanics - Historical Contingency and the Copenhagen Hegemony 
(University of Chicago Press, Chicago, 1994), chap.3. 

[6] A.Whitaker, Einstein, Bohr and the Quantum Dilemma (Cambridge University Press, Cam- 
bridge, 1996). 

[7] M.Namiki, S.Pascazio, and H.Nakazato, Decoherence and Quantum Measurement (World Sci- 
entific, Singapore, 1997). 

[8] D.Home, Conceptual Foundations of Quantum Physics - An Overview from Modern Perspectives
(Plenum Press, New York, 1997).

[9] P.Mittelstaedt, The Interpretation of Quantum Mechanics and the Measurement Process 
(Cambridge University Press, Cambridge 1998). 

[10] S.Weinberg, Dreams of a Final Theory (Vintage, London, 1993), p. 64.

[11] J.S.Bell, in Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press,
Cambridge, 1987), p. 125.

[12] E.P.Wigner, Am. J. Phys. 31, 6 (1963).

[13] A.J.Leggett, in Lesson of Quantum Theory (J. de Boer, E. Dal, and O. Ulfbeck, eds.) (Elsevier,
Amsterdam, 1986), p. 47.

[14] N.Bohr, Dialectica 2, 312 (1948).

[15] W.Heisenberg, Physical Principles of the Quantum Theory (University of Chicago Press,
Chicago, 1930; reprinted Dover, New York), p. 64.





[16] J.S.Bell, in Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press,
Cambridge, 1987), p. 124. 

[17] W.Heisenberg, in Niels Bohr and the Development of Physics (W. Pauli, ed.)(Pergamon Press, 
Oxford, UK, 1955), pp.12-29. 

[18] W.Heisenberg, Physics and Philosophy (Harper and Row, New York, 1962), chap.3. 

[19] W.H.Zurek, Prog. Theor. Phys. 89, 281 (1993). 

[20] K. Gottfried, Phys. World 4, 10, 34 (1991). 

[21] A.J.Leggett, Contemp. Phys. 25, 583 (1984). 

[22] A.J.Leggett, in Proc. 1st Int Symp. Foundations of Quantum Mechanics in the Light of New 
Technology (Physical Society of Japan, Tokyo, 1984); Prog. Theor. Phys. Suppl. , no. 69, 80 
(1980). 

[23] R.Omnes, Interpretation of Quantum Mechanics (Princeton University Press, Princeton, NJ, 
1994), chap. 7. 

[24] N.G.van Kampen, Physica A, 153, 97 (1988). 

[25] A.J. Leggett, Curr. Sci. 67, 785 (1994). 

[26] D.Home and M.A.B.Whitaker, Phys. Rep.210, 223 (1992), sects. 5.4 and 6.2. 

[27] J.S.Bell, Phys. World 3, no. 8, 33 (1990); also in: Sixty-Two Years of Uncertainty (A.J. Miller,
ed.) (Plenum, New York, 1990), pp. 17-31. 

[28] D.Home and M.A.B.Whitaker, Phys. Rep.210, 223 (1992), sects. 5.4 and 6.2. 

[29] R.B.Griffiths, J. Stat. Phys. 36, 219 (1984); in New Techniques and Ideas in Quantum Mea- 
surement Theory (D.M. Greenberger, ed.) (New York Academy of Sciences, New York, 1986); 
Found. Phys. 23, 1601 (1993). 

[30] R.Omnes, J. Stat. Phys. 53, 893 (1988); Rev. Mod. Phys. 64, 339 (1992). 

[31] M.Gell-Mann and J.B.Hartle, in Complexity, Entropy, and the Physics of Information (W.
Zurek, ed.) (Addison-Wesley, Reading, MA, 1990); in Proc. 3rd Int. Symp. Foundations of
Quantum Mechanics in the Light of New Technology (S. Kobayashi et al., eds.) (Phys. Soc. of
Japan, Tokyo, 1990); Phys. Rev. D 47, 3345 (1993).

[32] B.d’Espagnat, J. Stat. Phys. 56, 747 (1989); Found.Phys. 20, 1147 (1990). 

[33] M.Gell-Mann, Quark and the Jaguar (Little Brown and Co., London, 1994), pp. 153-54.

[34] P.A.M.Dirac, in Electrons et Photons - Rapports et Discussions du Cinquieme Conseil de 
Physique tenu a Bruxelles 1927 (Gauthier- Villars, Paris, 1928). 

[35] P.A.M.Dirac, Principles of Quantum Mechanics (Clarendon, Oxford, UK, 1930), p.36 

[36] J.von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer-Verlag,
Berlin, 1932); English translation: Mathematical Foundations of Quantum Mechanics (Princeton
University Press, Princeton, NJ, 1955).

[37] H.Everett, Rev. Mod. Phys. 29, 454 (1957). 

[38] E.J.Squires, in Quantum Theory without Reduction (M.Cini and J.M.Levy-Leblond, eds.)
(Adam Hilger, Bristol, UK, 1989).

[39] E.J.Squires, Synthese 97, 109 (1993).





[40] B.S.DeWitt and N.Graham, Many-Worlds Interpretation of Quantum Mechanics (Princeton
University Press, Princeton, NJ, 1973), pp. 155-65. 

[41] L.E.Ballentine, Found. Phys. 3, 229 (1973). 

[42] D.Deutsch, Int J. Theor. Phys. 24, 1 (1985). 

[43] M.Lockwood, Mind, Brain, and the Quantum (Basil Blackwell, Oxford, UK, 1989), chap. 13. 

[44] D.Albert and B.Loewer, Synthese 77, 195 (1988). 

[45] P.Holland, The Quantum Theory of Motion (Cambridge University Press, Cambridge, 1993). 

[46] D.Bohm and B.J.Hiley, The Undivided Universe (Routledge, London, 1993). 

[47] J.T.Cushing, Quantum Mechanics - Historical Contingency and the Copenhagen Hegemony 
(University of Chicago Press, Chicago, 1994). 

[48] Bohmian Mechanics and Quantum Theory : An Appraisal (J.T.Cushing, A. Fine and 
S.Goldstein, eds.)(Kluwer Academic Publishers, Dordrecht, 1996). 

[49] G.C.Ghirardi, A.Rimini and T.Weber, in Quantum Probability and Applications (L.Accardi, 
and W.von Waldenfels. eds.) (Springer- Verlag, Berlin, 1985), pp. 223-32. 

[50] G.C.Ghirardi, A.Rimini and T.Weber, Phys. Rev. D 34, 470 (1986). 

[51] E. J.Squires, Phys. Lett. A 158, 431 (1991). 

[52] P.Pearle and E.J.Squires, Phys. Rev. Lett. 73, 1 (1994). 

[53] J.S.Bell, in Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press,
Cambridge, 1987), p. 117.

[54] J.S.Bell, Physics 1, 195 (1964); reprinted in Speakable and Unspeakable in Quantum Mechanics
(Cambridge University Press, Cambridge, UK, 1987), p. 14.

[55] J.F.Clauser and A.Shimony, Rep. Prog. Phys. 41, 1881 (1978).

[56] Quantum Mechanics Versus Local Realism : The Einstein-Podolsky-Rosen Paradox (F.Selleri, 
ed.) (Plenum, New York, 1988). 

[57] D.Home and F.Selleri, Riv. Nuov. Cim. 14, no. 9, 1(1991). 

[58] D.Bohm, Quantum Theory (Prentice-Hall, Englewood Cliffs, NJ, 1951), p. 614.

[59] A.Peres, Am. J. Phys. 46, 745 (1978). 

[60] J.F.Clauser and M.A.Horne, Phys. Rev. D 10, 526 (1974). 

[61] F.Selleri, in Microphysical Reality and Quantum Formalism (A. van der Merwe et al., 
eds.) (Kluwer, Dordrecht, Netherlands, 1988). 

[62] J.D.Ivanovic, Lett. Nuov. Cim. 22, 14 (1978). 

[63] W.Mückenheim, Lett. Nuov. Cim. 35, 300 (1982). 

[64] D.Home, V.L.Lepore, and F.Selleri, Phys. Lett. A 158, 357 (1991). 

[65] G.S.Agarwal, D.Home and W.Schleich, Phys. Lett. A 170, 359 (1992). 

[66] S.M.Roy and V.Singh, J. Phys. A 11, L167 (1978); J. Phys. A 12, 1003 (1979). 

[67] A.Garuccio and F.Selleri, Found. Phys. 10, 209 (1980). 





[68] A.Garuccio, in Quantum Mechanics versus Local Realism : The Einstein-Podolsky-Rosen 
Paradox (F.Selleri, ed.) (Plenum, New York, 1988). 

[69] N.D.Mermin, Phys. Rev. Lett. 65, 1838 (1990). 

[70] S.M.Roy and V.Singh, Phys. Rev. Lett. 67, 2761 (1991). 

[71] D.Home and A.S.Majumdar, Phys. Rev. A 52, 4959 (1996). 

[72] F.Selleri, Quantum Paradoxes and Physical Reality (Kluwer, Dordrecht, Netherlands, 1990). 

[73] M.Ferrero, T.W.Marshall and E.Santos, Am. J. Phys. 58, 683 (1990). 

[74] E.Santos, Phys. Lett. A 139, 431 (1989); Found. Phys. 21, 221 (1991); Phys. Lett. A 212, 10 
(1996). 

[75] E.Santos, T.W.Marshall, and F.Selleri, Phys. Lett. A 98, 5 (1983). 

[76] D.Home and T.W.Marshall, Phys. Lett. A 113, 183 (1985). 

[77] Z.Y.Ou and L.Mandel, Phys. Rev. Lett. 61, 50 (1988); Y.H.Shih and C.O.Alley, Phys. Rev. 
Lett. 61, 2921 (1988). 

[78] Z.Y.Ou, C.K.Hong and L.Mandel, Opt. Commun. 67, 159 (1988); T.E.Kiess, Y.H.Shih, 
A.V.Sergienko, and C.O.Alley, Phys. Rev. Lett. 71, 3893 (1993). 

[79] J.Six, Phys. Lett. B 114, 200 (1982). 

[80] F.Selleri, Lett. Nuov. Cim. 36, 521 (1983). 

[81] A.Datta and D.Home, Phys. Lett. A 119, 3 (1986). 

[82] D.Home, in Proc. 3rd Int. Symp. Foundations of Quantum Mechanics (S.Kobayashi et al., 
eds.) (Physical Society of Japan, Tokyo, 1990), pp. 43-50. 

[83] A.Datta and D.Home, Found. Phys. Lett. 4, 165 (1991). 

[84] A.Datta, in Proc. Workshop on High-Energy Physics Phenomenology IV (A.Datta et al., eds.) 
(Allied Publishers, New Delhi, 1997). 

[85] D.M.Greenberger, M.A.Horne, and A.Zeilinger, in Bell's Theorem, Quantum Theory, and 
Conceptions of the Universe (M.Kafatos, ed.) (Kluwer, Dordrecht, Netherlands, 1989), pp. 
73-76; Am. J. Phys. 58, 1131 (1990); N.D.Mermin, Am. J. Phys. 58, 731 (1990). 

[86] L.Hardy, Phys. Rev. Lett. 68, 2981 (1992); 71, 1665 (1993). 

[87] C.H.Bennett, G.Brassard, C.Crepeau, R.Jozsa, A.Peres, and W.K.Wootters, Phys. Rev. Lett. 
70, 1895 (1993). 

[88] D.Boschi, S.Branca, F.de Martini, L.Hardy and S.Popescu, Phys. Rev. Lett. 80, 1121 (1997). 

[89] D.Bouwmeester, J.W.Pan, K.Mattle, M.Eibl, H.Weinfurter, A.Zeilinger, Nature 390, 575 
(1997); J.W.Pan, D.Bouwmeester, H.Weinfurter, and A.Zeilinger, Phys. Rev. Lett. 80, 3891 
(1998). 

[90] S.L.Braunstein, H.J.Kimble, Nature 394, 840 (1998); D.Bouwmeester, J.W.Pan, K.Mattle, 
M.Eibl, H.Weinfurter, A.Zeilinger, Nature 394, 840 (1998); L.Vaidman and N.Yoran, Phys. 
Rev. A, to be published, quant-ph/9809063. 

[91] For a comprehensive review see, for example, A.Steane, Rep. Prog. Phys. 61, 117 (1998). 

[92] D.Home and R.Chattopadhayaya, Phys. Rev. Lett. 76, 2836 (1996); see also 
quant-ph/9903036. 





[93] S.M.Roy and V.Singh, Mod. Phys. Lett. A 10, 709 (1995); see also quant-ph/9811041. 

[94] J.S.Bell, in The Ghost in the Atom (P.C.W.Davies and J.R.Brown, eds.) (Cambridge 
University Press, Cambridge, 1986). 




Quantum Field Theory (QFT) is perhaps the single most important concept in 
physics to be discovered in the twentieth century. This volume reflects the 
multidimensional impact of QFT on the evolution of physics in the last century. 
Freeman Dyson, one of the architects of modern QFT, in a foreword to this 
volume has warmly endorsed this theme. This volume projects the central 
theme through a selection of invited articles in the areas where the impact of 
QFT has been especially pronounced, from particle physics to string theory 
(with several interpolating stages of development), and extending to some 
facets of astrophysics and the physics of condensed matter. 


In the area of particle physics the emphasis is mainly on symmetries, topologies, 
gauge theories and renormalization groups. While electroweak interactions 
have been treated with standard rigour, the strong interaction sector has 
needed greater filtration, so as to conform to the QFT-oriented thrust of this 
volume. 


A distinct feature of this edited volume is that its theme has been highlighted 
through a comprehensive editorial summary of all the articles, preceding their 
classified presentation in six distinct parts: 

i) basic structure of QFT; ii) topological aspects of QFT; iii) miscellaneous 
formal methods in QFT; iv) extension of QFT frontiers; v) QFT in 2+1 
dimensions; and vi) strong interaction methods in QFT/QCD. 

The contributors range from veterans like (the late) Vladimir Gribov, Marcos 
Moshinsky, Kazuhiko Nishijima, John Schwarz, Dmitri Shirkov and Edward 
Witten, to a string of acknowledged experts in their respective fields of 
expertise, all the way to a few young and promising workers. 

The wide range of topics covered makes the book more than just an 
introductory text book on QFT. It is recommended as a reference book for a 
broad spectrum of readership, from fresh postdoctoral level in most key areas 
of QFT to the specialists in evolving areas. The freelance researcher in QFT 
should also find enough appetizers to kindle his interest in this field. 

About The Cover 


Shiva's Cosmic Dance is conceived here as the rhythmic vibration of interacting 
Quantum Fields, which holds the key to an understanding of the dynamics of 
the Universe, from the tiny quark to the immense intergalactic space. A flavour 
of this scenario may be found in Rabindranath Tagore's poem, Nrityer o taale 
shundar holo bidrohi paramanu (the rebellious atom tamely submitted, as if 
in a reverie, to the tune of Shiva's rhythmic Dance). 


ISBN 81-85931-25-9 



